OpenFreeEnergy / gufe

grand unified free energy by OpenFE
https://gufe.readthedocs.io
MIT License

Add support for capturing instrumentation data in ProtocolUnitResult #222

Open jchodera opened 1 year ago

jchodera commented 1 year ago

To help target developer effort at the parts of the code where optimization will pay off most, it would be helpful to capture timing information from each ProtocolUnit. Right now, we capture only the start/stop times. Capturing both the hardware the unit ran on and finer-grained detail about how much time is spent in the various setup operations and sub-steps (e.g. replica propagation and energy computation) would be very helpful.

Could we add, or agree to use, a dict attribute in ProtocolUnitResult for this purpose?
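To make the proposal concrete, here is a minimal sketch of what such a dict could hold and how a unit might fill it. Everything in it is hypothetical: the `Instrumentation` helper, the `timed` context manager, the `"instrumentation"` key, and the stand-in `run_unit` function are illustrative names, not part of the gufe API, and the workload is a toy loop rather than a real simulation.

```python
import platform
import time
from contextlib import contextmanager


class Instrumentation:
    """Hypothetical helper: gathers host info and wall-clock timings in a plain dict."""

    def __init__(self):
        self.data = {
            # Basic host description from the standard library only;
            # GPU details would need a separate, backend-specific query.
            "hardware": {
                "hostname": platform.node(),
                "machine": platform.machine(),
                "processor": platform.processor(),
                "python": platform.python_version(),
            },
            "timings": {},
        }

    @contextmanager
    def timed(self, label):
        """Accumulate elapsed seconds under `label`, even if the body raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            timings = self.data["timings"]
            timings[label] = timings.get(label, 0.0) + elapsed


def run_unit(n_iterations=3):
    """Stand-in for a unit's execution; returns outputs plus the instrumentation dict."""
    inst = Instrumentation()
    with inst.timed("setup"):
        state = list(range(1000))
    for _ in range(n_iterations):
        with inst.timed("replica_propagation"):
            state = [x + 1 for x in state]
        with inst.timed("energy_computation"):
            energy = sum(state)
    # The dict is JSON-friendly, so it could travel with the other outputs.
    return {"energy": energy, "instrumentation": inst.data}


result = run_unit()
print(result["instrumentation"]["timings"])
```

Repeated labels accumulate, so per-iteration sub-steps such as replica propagation end up as a single total per unit. Keeping the values to plain strings and floats means the dict should serialize alongside the rest of the result without special handling.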

In the longer term, it was suggested we could further divide ProtocolUnit into smaller units and rely solely on the start/stop times of each ProtocolUnit, but it seems we are a ways off from this right now.

richardjgowers commented 1 year ago

Tangential to this, a ProtocolUnit.estimated_time_required(<Hardware info>) -> duration method would be a big help in scheduling. For this we'll have to start gathering data on elapsed time, system size, and hardware to build heuristics/models of expected performance.
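As a rough illustration of the kind of heuristic this could start from, the sketch below fits one "seconds per atom-step" rate per hardware label from past runs and uses it to predict a duration. The record fields (`hardware`, `n_atoms`, `n_steps`, `elapsed_s`), the function names, and the numbers are all made up for the example; a real model would likely need more than a single linear rate.

```python
from collections import defaultdict
from datetime import timedelta


def fit_rates(records):
    """Return seconds per atom-step for each hardware label (total time / total work)."""
    totals = defaultdict(lambda: [0.0, 0.0])  # hardware -> [elapsed seconds, atom-steps]
    for rec in records:
        totals[rec["hardware"]][0] += rec["elapsed_s"]
        totals[rec["hardware"]][1] += rec["n_atoms"] * rec["n_steps"]
    return {hw: elapsed / work for hw, (elapsed, work) in totals.items()}


def estimated_time_required(rates, hardware, n_atoms, n_steps):
    """Hypothetical estimator: assumes cost scales linearly with atoms x steps."""
    return timedelta(seconds=rates[hardware] * n_atoms * n_steps)


# Placeholder history, not real benchmark data.
history = [
    {"hardware": "gpu-a", "n_atoms": 10_000, "n_steps": 1_000, "elapsed_s": 20.0},
    {"hardware": "gpu-a", "n_atoms": 20_000, "n_steps": 1_000, "elapsed_s": 40.0},
    {"hardware": "cpu-b", "n_atoms": 10_000, "n_steps": 1_000, "elapsed_s": 300.0},
]

rates = fit_rates(history)
estimate = estimated_time_required(rates, "gpu-a", n_atoms=50_000, n_steps=1_000)
print(estimate)
```

The instrumentation data discussed above (elapsed time, system size, hardware) is exactly the input such a fit would need, which is why the two ideas go together.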