choderalab / pale

The PALE: The Protein Alchemical Landscape Explorer
MIT License
2 stars 1 forks source link

Pseudocode API discussion #1

Open sukritsingh opened 4 weeks ago

sukritsingh commented 4 weeks ago

Figured I should start a thread of what the pseudocode might look like for this repo, and what sort of user facing stuff might look like.

Here's a quick 5-minute attempt at what the pipeline would look like. I figure this thread would serve as a way to discuss and update this as we go:

# Import necessary modules from PALE
from pale import ProteinSystem, MutationEngine, HybridTopologyFactory, FreeEnergyCalculator
from pale import SimulationParameters, SamplingMethod

# Load the original protein structure
original_protein = ProteinSystem.load('path/to/protein.pdb')

# Maybe we do some sort of quality assessment here
# PDBFixer would probably be useful for something like this
list_of_missing_atoms = original_protein.check_for_missing_atoms()
original_protein.add_missing_atoms(list_of_missing_atoms)

# If we assume everything is protonated by the user prior to input, we should likely check to vierfy
original_protein.verify_hydrogens_present()

# Define the mutations to apply
# Each single point mutation is specified by chain ID, residue ID, and the new amino acid
list_of_mutations = [
    {'chain_id': 'A', 'residue_index': 100, 'new_residue': 'ALA'},
    {'chain_id': 'B', 'residue_index': 150, 'new_residue': 'GLY'},
    # Add more mutations as needed
]

# Maybe verify that mappings can be made for each mutation?
# Suggest alternatives/error out of they aren't?
mutation_mapper = MutationEngine(list_of_mutations, mapper='lomap')
mutation_mapper.validate() # this method might print out the issues?
mutation_mapper.create_alternatives()

# Initialize the mutation engine with the original protein for a single mutation
htf_mutation_single = HybridTopologyFactory(original_protein, list of mutations[0])

# if there are multiple mutations, then generate a bunch of HTFs
htf_mutations_list = HybridTopologyFactory.apply_mutations(mutation_mapper)

# Optionally save the hybrid topology factory to disk for later use
htf_mutation_single.save('path/to/hybrid_factory.xml.bz2')

# At this point, the user may choose to stop and use the saved hybrid factory later
# If the user wants to proceed to sampling, they can load the saved hybrid factory or use the one in memory

# Define simulation parameters
simulation_parameters = SimulationParameters(
    force_field='amber99sb-ildn',
    water_model='tip3p',
    temperature=300,     # in Kelvin
    pressure=1.0,        # in atm
    simulation_time=100, # in nanoseconds
    solvent_padding=10,  # in Angstroms
    # Additional parameters can be added here
)

# validate that an openmm system can be created with these simualtion parameters 
# probably easy to just build in vacuum
htf_test_system = HybridTopologyFactory.test_if_valid(simulation_parameters)

# Create a solvated system
htf_system = HybridTopologyFactory.create_solvated_htf(simulation_parameters)

# Choose the integrator method (NEQ or Replica Exchange)
# Option 1: Non-Equilibrium Cycling (NEQ)
neq_sampling = SamplingMethod.neq(
    neq_switches=1, # 1 forward and reverse transformation
    switching_time=1500, # in picoseconds
    time_step=0.002, # in picoseconds
    hydrogen_mass_amu = 3.5,

)

# Option 2: Replica Exchange (RE)
repex_sampling = SamplingMethod.repex(
    replicas=16, # this should probably be defined by # of lambdas
    exchange_attempts=1000,
    exchange_frequency=1000,
    time_step=0.002
)

# Select the desired sampling method
sampling_method = neq_sampling  # For NEQ sampling
# sampling_method = repex_sampling  # For Replica Exchange sampling

# Set up the free energy calculator with the sampling method and hybrid factory
fe_calculator = FreeEnergyCalculator(
    hybrid_factory=htf_system,
    sampling_method=sampling_method
)

# Run the free energy calculation to estimate ∆∆G
delta_delta_G = fe_calculator.compute_ddg()

# Output the results
print(f"Estimated ∆∆G for the specified mutations: {delta_delta_G:.2f} kcal/mol")

This is just attempt v0.01 for what this might look like and what assumptions may go into it! Absolutely should edit/clarify things as we go!

ijpulidos commented 3 weeks ago

This is great, thanks! I'll have to sit and digest this and hopefully start creating code in this repo that would match it.