choderalab / perses

Experiments with expanded ensembles to explore chemical space
http://perses.readthedocs.io
MIT License

Migrate solute-only trajectory writing from NetCDF to XTC #1180

Open jchodera opened 1 year ago

jchodera commented 1 year ago

Typical calculations generate NetCDF files that take up way too much space.

For example, a typical SARS-CoV-2 Mpro calculation will generate a positions variable in the complex NetCDF file that consumes (18 replicas) × (5000 iterations/replica) × (10000 atoms) × (3 dimensions/atom) × (4 bytes/dimension) ≈ 10.8 GB of data.

When solute-only trajectories are requested, we should write them in the much more data-efficient XTC format instead of NetCDF, since this could reduce storage sizes by roughly 10x.
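
The storage estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `positions_bytes` is made up for illustration and is not part of perses, and the 10x factor is the rough reduction suggested in this issue, not a measured XTC compression ratio.

```python
def positions_bytes(n_replicas, n_iterations, n_atoms, n_dims=3, bytes_per_value=4):
    """Uncompressed size in bytes of a single-precision positions array."""
    return n_replicas * n_iterations * n_atoms * n_dims * bytes_per_value


# Numbers quoted in this issue for a typical SARS-CoV-2 Mpro complex phase.
netcdf_bytes = positions_bytes(n_replicas=18, n_iterations=5000, n_atoms=10000)

# ASSUMPTION: ~10x reduction from XTC, as suggested above (not measured).
assumed_reduction = 10
xtc_bytes = netcdf_bytes / assumed_reduction

print(f"NetCDF positions: {netcdf_bytes / 1e9:.1f} GB")
print(f"XTC (assumed {assumed_reduction}x smaller): {xtc_bytes / 1e9:.2f} GB")
```

The actual saving will depend on the XTC precision setting and on how many atoms the solute-only selection contains, so the second figure should be read as a target rather than a prediction.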