OpenFreeEnergy / alchemiscale-fah

protocols and compute service for using alchemiscale with Folding@Home
MIT License

Default integrator settings for `FahNonEquilibriumCyclingProtocol` too short for FAH WUs of reasonable runtime #13

Open dotsdl opened 2 months ago

dotsdl commented 2 months ago

Following initial testing with FAH volunteers, it became clear that our individual work units (WUs) were very short, completing within about 100 s in many cases. Such short WUs poorly amortize the cost of transferring data to and from a volunteer host, can cause excessive thrashing on the assignment server(s) (AS) and work server (WS), and can result in a large proliferation of WUs spanning many CLONEs for a given RUN.

Our use of the `FahNonEquilibriumCyclingProtocol` means that a WU is created for each `FahSimulationUnit` in the `ProtocolDAG`, and a `FahSimulationUnit` is created for each cycle specified by the count `num_cycles` in `FahNonEquilibriumCyclingSettings`. The only way to expand the length of a WU, then, is to increase the number of steps taken in the equilibrium and nonequilibrium legs of the cycle, but it is unclear if this meaningfully increases the sampling of nonequilibrium work.

By default, the settings for this protocol give:

    equilibrium_steps: int = 12500
    nonequilibrium_steps: int = 12500

For our tests with the FAH volunteers, we doubled these to 25000 each. The tests were run with the default 4 fs timestep.
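To put these step counts in terms of simulated time, here is a minimal sketch of the arithmetic. It assumes each cycle consists of four legs (two equilibrium and two nonequilibrium, one pair per direction); that leg count and the helper names `leg_time_ps` and `cycle_time_ps` are illustrative assumptions, not part of the protocol's API.

```python
# Hypothetical helpers for illustration; not part of alchemiscale-fah.

def leg_time_ps(steps: int, timestep_fs: float = 4.0) -> float:
    """Simulated time of one leg, in picoseconds."""
    return steps * timestep_fs / 1000.0


def cycle_time_ps(
    equilibrium_steps: int,
    nonequilibrium_steps: int,
    timestep_fs: float = 4.0,
) -> float:
    """Simulated time of one full cycle, in picoseconds.

    Assumption: a cycle has two equilibrium legs and two
    nonequilibrium legs (forward and reverse).
    """
    return 2 * leg_time_ps(equilibrium_steps, timestep_fs) + 2 * leg_time_ps(
        nonequilibrium_steps, timestep_fs
    )


default_ps = cycle_time_ps(12500, 12500)  # default settings
doubled_ps = cycle_time_ps(25000, 25000)  # settings used in the volunteer tests

print(f"default: {default_ps} ps per cycle")
print(f"doubled: {doubled_ps} ps per cycle")
```

Under that assumption the defaults give 50 ps per leg and 200 ps per cycle, and the doubled settings give 100 ps per leg and 400 ps per cycle.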

@jchodera and @ijpulidos, thoughts on this? What do you think is most reasonable for us to do here?

ijpulidos commented 2 months ago

As discussed in the meeting, I believe the sampling advantage for this protocol should come from running more cycles rather than from adding steps to the eq/neq parts. That said, I think it's still valuable to check whether running more steps makes sense for charge-changing transformations. My two cents.

jchodera commented 2 months ago

Apologies for the late response here.

For the COVID Moonshot, our nonequilibrium cycling protocols were 4 ns in length (which is still rather short).

For FAH, I suggest we start with this and then explore halving or doubling the simulation lengths (while still testing without points) for the same transformations to explore robustness and accuracy. We should do many independent cycles (e.g. at least 200) to have decent statistics. For 4 fs timesteps with HMR, this is:

    equilibrium_steps: int = 250000
    nonequilibrium_steps: int = 250000
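As a rough check on these numbers, the sketch below converts a target cycle length into steps per leg. It assumes the cycle is split into four equal-length legs (two equilibrium, two nonequilibrium), which is the reading under which 250000 steps at 4 fs corresponds to a 4 ns cycle; the helper name `steps_per_leg` is hypothetical.

```python
# Hypothetical helper for illustration; not part of alchemiscale-fah.

def steps_per_leg(
    cycle_ns: float,
    timestep_fs: float = 4.0,
    legs_per_cycle: int = 4,  # assumption: 2 equilibrium + 2 nonequilibrium legs
) -> int:
    """Number of integrator steps per leg for a target total cycle length."""
    total_fs = cycle_ns * 1_000_000  # 1 ns = 1e6 fs
    return round(total_fs / (legs_per_cycle * timestep_fs))


# The suggested starting point (4 ns) plus the halved and doubled variants.
for cycle_ns in (2.0, 4.0, 8.0):
    steps = steps_per_leg(cycle_ns)
    print(f"{cycle_ns} ns cycle -> equilibrium_steps = nonequilibrium_steps = {steps}")
```

With the same assumption, the halved and doubled variants would use 125000 and 500000 steps per leg, respectively.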