choderalab / perses

Experiments with expanded ensembles to explore chemical space
http://perses.readthedocs.io
MIT License

Overhaul perses writing of topology information #1154

Closed: jchodera closed this issue 1 year ago

jchodera commented 1 year ago

Currently, perses generates the following topology information on setup:

Additionally, we have two other issues:

I propose we restructure this so we have:

models/ : organize all PDB files here
  {complex,solvent}-{old,new}.pdb : all atoms 
  {complex,solvent}-solute-{old,new}.pdb : solute atoms 

and ensure the atoms in the NetCDF trajectories (both checkpoint and standard) are written in the same order as the atoms in the PDB files. Ideally, we could later write replica trajectories directly as XTC files instead of going through the NetCDF file, though extracting coordinates doesn't take a huge amount of time.
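To make the atom-ordering requirement concrete, here is a minimal sketch of a consistency check. It is not perses code: `parse_pdb_atoms` and `first_order_mismatch` are hypothetical helpers, it assumes well-formed fixed-column PDB `ATOM`/`HETATM` records with decimal residue numbers, and it assumes the trajectory's atom identities have already been extracted into the same tuple form by some other means.

```python
from typing import List, Optional, Tuple

# (chain ID, residue name, atom name, residue number); hypothetical key type
AtomKey = Tuple[str, str, str, int]


def parse_pdb_atoms(pdb_text: str) -> List[AtomKey]:
    """Read atom identities, in file order, from fixed-column PDB records."""
    atoms: List[AtomKey] = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21:22]              # column 22
            resname = line[17:20].strip()    # columns 18-20
            name = line[12:16].strip()       # columns 13-16
            resseq = int(line[22:26])        # columns 23-26 (assumed decimal)
            atoms.append((chain, resname, name, resseq))
    return atoms


def first_order_mismatch(
    pdb_atoms: List[AtomKey], traj_atoms: List[AtomKey]
) -> Optional[int]:
    """Return the first index where the two orderings differ, or None if identical."""
    for index, (pdb_atom, traj_atom) in enumerate(zip(pdb_atoms, traj_atoms)):
        if pdb_atom != traj_atom:
            return index
    if len(pdb_atoms) != len(traj_atoms):
        # One list is a strict prefix of the other.
        return min(len(pdb_atoms), len(traj_atoms))
    return None
```

A check like this could run once at setup time, comparing each PDB file against the atom list used to write the corresponding trajectory, so that a mismatch is caught before any analysis relies on the ordering.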

We can do this for the new Protocol version, where we hopefully have a way to package these files in a more sane way.

ijpulidos commented 1 year ago

Currently, the serialized .pdb files are for the old system. We could overhaul the serialization as you mention, but I think we could get clashes with the solvent when serializing a solvated version of the new systems (for both the complex and solvent phases). Should we re-solvate the new systems? I don't know if that defeats the purpose of serializing these objects.

Maybe we only need the solute versions for the new systems, and keep both the solvated and solute versions for the old ones?
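For comparison, the two layouts under discussion can be enumerated with a small sketch. The function `model_paths` and its `solvated_new` flag are hypothetical names used only for illustration; the file name patterns are the ones from the proposal above.

```python
from itertools import product
from typing import List


def model_paths(solvated_new: bool = True, root: str = "models") -> List[str]:
    """List the PDB files for each phase and end state.

    solvated_new=True  -> the original proposal (all-atom and solute files
                          for both old and new systems).
    solvated_new=False -> the alternative (solute-only files for the new
                          systems, both kinds for the old systems).
    """
    paths: List[str] = []
    for phase, state in product(("complex", "solvent"), ("old", "new")):
        if state == "old" or solvated_new:
            paths.append(f"{root}/{phase}-{state}.pdb")       # all atoms
        paths.append(f"{root}/{phase}-solute-{state}.pdb")    # solute atoms
    return paths
```

Calling `model_paths()` lists eight files, while `model_paths(solvated_new=False)` drops `complex-new.pdb` and `solvent-new.pdb` and lists six.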

ijpulidos commented 1 year ago

This should be solved via #1210