Figured I should start a thread of what the pseudocode might look like for this repo, and what sort of user facing stuff might look like.
Here's a quick 5-minute attempt at what the pipeline would look like. I figure this thread would serve as a way to discuss and update this as we go:
# Import necessary modules from PALE
from pale import ProteinSystem, MutationEngine, HybridTopologyFactory, FreeEnergyCalculator
from pale import SimulationParameters, SamplingMethod
# Load the original protein structure
original_protein = ProteinSystem.load('path/to/protein.pdb')
# Maybe we do some sort of quality assessment here
# PDBFixer would probably be useful for something like this
list_of_missing_atoms = original_protein.check_for_missing_atoms()
original_protein.add_missing_atoms(list_of_missing_atoms)
# If we assume everything is protonated by the user prior to input, we should likely check to vierfy
original_protein.verify_hydrogens_present()
# Define the mutations to apply
# Each single point mutation is specified by chain ID, residue ID, and the new amino acid
list_of_mutations = [
{'chain_id': 'A', 'residue_index': 100, 'new_residue': 'ALA'},
{'chain_id': 'B', 'residue_index': 150, 'new_residue': 'GLY'},
# Add more mutations as needed
]
# Maybe verify that mappings can be made for each mutation?
# Suggest alternatives/error out of they aren't?
mutation_mapper = MutationEngine(list_of_mutations, mapper='lomap')
mutation_mapper.validate() # this method might print out the issues?
mutation_mapper.create_alternatives()
# Initialize the mutation engine with the original protein for a single mutation
htf_mutation_single = HybridTopologyFactory(original_protein, list of mutations[0])
# if there are multiple mutations, then generate a bunch of HTFs
htf_mutations_list = HybridTopologyFactory.apply_mutations(mutation_mapper)
# Optionally save the hybrid topology factory to disk for later use
htf_mutation_single.save('path/to/hybrid_factory.xml.bz2')
# At this point, the user may choose to stop and use the saved hybrid factory later
# If the user wants to proceed to sampling, they can load the saved hybrid factory or use the one in memory
# Define simulation parameters
simulation_parameters = SimulationParameters(
force_field='amber99sb-ildn',
water_model='tip3p',
temperature=300, # in Kelvin
pressure=1.0, # in atm
simulation_time=100, # in nanoseconds
solvent_padding=10, # in Angstroms
# Additional parameters can be added here
)
# validate that an openmm system can be created with these simualtion parameters
# probably easy to just build in vacuum
htf_test_system = HybridTopologyFactory.test_if_valid(simulation_parameters)
# Create a solvated system
htf_system = HybridTopologyFactory.create_solvated_htf(simulation_parameters)
# Choose the integrator method (NEQ or Replica Exchange)
# Option 1: Non-Equilibrium Cycling (NEQ)
neq_sampling = SamplingMethod.neq(
neq_switches=1, # 1 forward and reverse transformation
switching_time=1500, # in picoseconds
time_step=0.002, # in picoseconds
hydrogen_mass_amu = 3.5,
)
# Option 2: Replica Exchange (RE)
repex_sampling = SamplingMethod.repex(
replicas=16, # this should probably be defined by # of lambdas
exchange_attempts=1000,
exchange_frequency=1000,
time_step=0.002
)
# Select the desired sampling method
sampling_method = neq_sampling # For NEQ sampling
# sampling_method = repex_sampling # For Replica Exchange sampling
# Set up the free energy calculator with the sampling method and hybrid factory
fe_calculator = FreeEnergyCalculator(
hybrid_factory=htf_system,
sampling_method=sampling_method
)
# Run the free energy calculation to estimate ∆∆G
delta_delta_G = fe_calculator.compute_ddg()
# Output the results
print(f"Estimated ∆∆G for the specified mutations: {delta_delta_G:.2f} kcal/mol")
This is just attempt v0.01 for what this might look like and what assumptions may go into it! Absolutely should edit/clarify things as we go!
Figured I should start a thread of what the pseudocode might look like for this repo, and what sort of user facing stuff might look like.
Here's a quick 5-minute attempt at what the pipeline would look like. I figure this thread would serve as a way to discuss and update this as we go:
This is just attempt v0.01 for what this might look like and what assumptions may go into it! Absolutely should edit/clarify things as we go!