Closed JenkeScheen closed 1 year ago
(copying in some of @jchodera's notes):
It’s not obvious what object is the culprit:
$ ncdump -h out-solvent.nc netcdf out-solvent { dimensions: scalar = 1 ; iteration = UNLIMITED ; // (2536 currently) spatial = 3 ; analysis_particles = 31 ; fixedL55052 = 55052 ; fixedL1052 = 1052 ; fixedL1047 = 1047 ; fixedL1040 = 1040 ; fixedL1045 = 1045 ; fixedL1042 = 1042 ; fixedL935 = 935 ; fixedL24849 = 24849 ; fixedL25035 = 25035 ; fixedL3 = 3 ; atom = 31 ; replica = 18 ; state = 18 ; unsampled = 2 ; variables: int64 last_iteration(scalar) ; int64 analysis_particle_indices(analysis_particles) ; analysis_particle_indices:long_name = "analysis_particle_indices[analysis_particles] is the indices of the particles with extra information stored about them in theanalysis file." ; string options(scalar) ; char metadata(fixedL3) ; float positions(iteration, replica, atom, spatial) ; positions:units = "nm" ; positions:long_name = "positions[iteration][replica][atom][spatial] is position of coordinate \'spatial\' of atom \'atom\' from replica \'replica\' for iteration \'iteration\'." ; float velocities(iteration, replica, atom, spatial) ; velocities:units = "nm / ps" ; velocities:long_name = "velocities[iteration][replica][atom][spatial] is velocity of coordinate \'spatial\' of atom \'atom\' from replica \'replica\' for iteration \'iteration\'." ; float box_vectors(iteration, replica, spatial, spatial) ; box_vectors:units = "nm" ; box_vectors:long_name = "box_vectors[iteration][replica][i][j] is dimension j of box vector i for replica \'replica\' from iteration \'iteration-1\'." ; double volumes(iteration, replica) ; volumes:units = "nm**3" ; volumes:long_name = "volume[iteration][replica] is the box volume for replica \'replica\' from iteration \'iteration-1\'." ; int states(iteration, replica) ; states:units = "none" ; states:long_name = "states[iteration][replica] is the thermodynamic state index (0..n_states-1) of replica \'replica\' of iteration \'iteration\'." ; double energies(iteration, replica, state) ; energies:units = "kT" ; energies:long_name = "energies[iteration][replica][state] is the reduced (unitless) energy of replica \'replica\' from iteration \'iteration\' evaluated at the thermodynamic state \'state\'." ; byte neighborhoods(iteration, replica, state) ; neighborhoods:_FillValue = 1b ; neighborhoods:long_name = "neighborhoods[iteration][replica][state] is 1 if this energy was computed during this iteration." ; double unsampled_energies(iteration, replica, unsampled) ; unsampled_energies:units = "kT" ; unsampled_energies:long_name = "unsampled_energies[iteration][replica][state] is the reduced (unitless) energy of replica \'replica\' from iteration \'iteration\' evaluated at unsampled thermodynamic state \'state\'." ; int accepted(iteration, state, state) ; accepted:units = "none" ; accepted:long_name = "accepted[iteration][i][j] is the number of proposed transitions between states i and j from iteration \'iteration-1\'." ; int proposed(iteration, state, state) ; proposed:units = "none" ; proposed:long_name = "proposed[iteration][i][j] is the number of proposed transitions between states i and j from iteration \'iteration-1\'." ; string timestamp(iteration) ;
// global attributes: :UUID = "9768ef57-c781-4aa3-bb6b-6c25f9de3a06" ; :application = "YANK" ; :program = "yank.py" ; :programVersion = "0.21.5" ; :Conventions = "ReplicaExchange" ; :ConventionVersion = "0.2" ; :DataUsedFor = "analysis" ; :CheckpointInterval = 250LL ; :title = "Replica-exchange sampler simulation created using ReplicaExchangeSampler class of openmmtools.multistate on Fri Mar 17 10:56:53 2023" ;
group: thermodynamic_states { variables: char state0(fixedL55052) ; char state1(fixedL1052) ; char state2(fixedL1047) ; char state3(fixedL1040) ; char state4(fixedL1047) ; char state5(fixedL1045) ; char state6(fixedL1040) ; char state7(fixedL1040) ; char state8(fixedL1045) ; char state9(fixedL1042) ; char state10(fixedL1042) ; char state11(fixedL1040) ; char state12(fixedL1040) ; char state13(fixedL1040) ; char state14(fixedL1040) ; char state15(fixedL1040) ; char state16(fixedL1040) ; char state17(fixedL935) ; } // group thermodynamic_states
group: unsampled_states { variables: char state0(fixedL24849) ; char state1(fixedL25035) ; } // group unsampled_states
group: mcmc_moves { variables: string move0(scalar) ; string move1(scalar) ; string move2(scalar) ; string move3(scalar) ; string move4(scalar) ; string move5(scalar) ; string move6(scalar) ; string move7(scalar) ; string move8(scalar) ; string move9(scalar) ; string move10(scalar) ; string move11(scalar) ; string move12(scalar) ; string move13(scalar) ; string move14(scalar) ; string move15(scalar) ; string move16(scalar) ; string move17(scalar) ; } // group mcmc_moves
group: online_analysis { dimensions: dim_size18 = 18 ; dim_size2 = 2 ; dim_size20 = 20 ; variables: double f_k(dim_size18) ; double free_energy(dim_size2) ; double f_k_history(iteration, dim_size18) ; double free_energy_history(iteration, dim_size2) ; double f_k_offline(dim_size20) ; double f_k_offline_history(iteration, dim_size20) ; } // group online_analysis }
>We might need to figure out whether `ncdump` or another tool can better inspect the contents. `ncdump` can also serialize to other formats---might need to do that and inspect whether there is a big blob there.
>It’s also possible it could be something in the `MultiStateReporter` that automatically sets the chunk size, and ends up making the file huge for no reason.
There’s stuff in there that sets `chunksize` to enlarge the NetCDF file to leave room whenever it grows a variable in the extensible dimension, e.g.
https://github.com/choderalab/openmmtools/blob/main/openmmtools/multistate/multistatereporter.py#L1358
>It’s possible we are setting this way too high for something in the solvent file and it’s literally expanding the NetCDF file with blank data.
>The chunksizes are supposed to be set just large enough to not incur a performance hit in reading and writing by continually having to resize the NetCDF file and add more “continued over here” sections
looking into the edges for this run a bit more closely this seems to have been a one-off behaviour, see /ifs/rtsia01/chodera/asap/asap/project_support/2023/03_jguven/perses/perses
.
When running perses on a protein-ligand system we've noticed that the
out-solvent.nc
is larger thanout-complex.nc
:This seems like a bug which when resolved could save a lot of storage