GilsonLabUCSD / pAPRika

Advanced toolkit for binding free energy calculations

Quality of life time estimation #171

Closed jaketanderson closed 1 week ago

jaketanderson commented 2 years ago

As far as I know, pAPRika doesn't provide user-friendly progress bars or time estimates. Would these be possible and/or worth looking into? Maybe the time per window can be recorded and averaged over time to predict the amount of time remaining for future windows. There might also be an attribute time_remaining or estimated_duration of the Simulation class that could store the estimated time remaining and change as the running averages of time per window change.

I could see an issue with different types of windows (e.g. thermalization and pull) taking different amounts of time on average, so separate running averages could be collected per type and applied to the number of remaining windows of that same type. I'm also not sure how this would account for different windows running at the same time: in parallel, one couldn't simply sum the times for each remaining window. Maybe a fix would be to take into account the average number of windows running simultaneously?
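To make the idea concrete, here is a rough sketch of a per-type running average with a simple correction for parallel runs. Nothing here is part of pAPRika: the class `WindowTimeEstimator`, its methods, and the `avg_parallel` parameter are all hypothetical names, and dividing by an average number of simultaneous windows is only a crude approximation.

```python
from collections import defaultdict


class WindowTimeEstimator:
    """Hypothetical helper: mean wall time per window type and time remaining."""

    def __init__(self, avg_parallel=1.0):
        # Assumed average number of windows running at the same time.
        self.avg_parallel = avg_parallel
        self._total = defaultdict(float)  # summed seconds per window type
        self._count = defaultdict(int)    # finished windows per window type

    def record(self, window_type, seconds):
        """Store the wall time of one finished window."""
        self._total[window_type] += seconds
        self._count[window_type] += 1

    def mean(self, window_type):
        """Mean seconds per finished window of this type, or None if none finished."""
        n = self._count[window_type]
        return self._total[window_type] / n if n else None

    def time_remaining(self, remaining):
        """Estimate seconds left, given {window_type: number of windows still to run}.

        Returns None if a type with windows left has no finished window yet.
        """
        total = 0.0
        for window_type, n_left in remaining.items():
            if n_left == 0:
                continue
            mean = self.mean(window_type)
            if mean is None:
                return None
            total += mean * n_left
        # Crude parallel correction: assume avg_parallel windows progress at once.
        return total / self.avg_parallel


est = WindowTimeEstimator(avg_parallel=2)
est.record("thermalization", 100.0)
est.record("thermalization", 140.0)
est.record("pull", 300.0)
print(est.time_remaining({"thermalization": 3, "pull": 10}))
```

The estimate would get better as more windows of each type finish, and it returns `None` rather than guessing when a type has no timing data yet.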

Feel free to let me know if this is something worth implementing or if it's already implemented and I haven't realized it.

jeff231li commented 2 years ago

@jaketanderson currently, we do not have a progress bar implemented in pAPRika. If we want to implement such a feature, I think the best way would be to use libraries like ray or dask. They have built-in progress bars and take care of parallel tasks. However, I think this would take some work refactoring the simulation modules. This will probably be implemented in the future.
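As a rough illustration of progress reporting over parallel tasks, the sketch below uses the standard library's `concurrent.futures` as a stand-in, not ray or dask themselves. The functions `run_window` and `run_all` are hypothetical and do not exist in pAPRika; `run_window` just sleeps in place of a real simulation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_window(name):
    """Hypothetical stand-in for running one window's simulation."""
    time.sleep(0.01)
    return name


def run_all(windows, max_workers=4):
    """Run windows in parallel and report progress as each one finishes."""
    windows = list(windows)
    finished = []
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_window, w) for w in windows]
        for done, future in enumerate(as_completed(futures), start=1):
            finished.append(future.result())
            elapsed = time.monotonic() - start
            # Elapsed wall time already reflects the parallelism, so the
            # observed throughput can be extrapolated directly.
            eta = elapsed / done * (len(windows) - done)
            print(f"{done}/{len(windows)} windows done, ~{eta:.1f} s remaining")
    return finished


results = run_all([f"p{i:03d}" for i in range(6)], max_workers=3)
```

Because the estimate comes from observed throughput rather than summed per-window times, it sidesteps the question of how many windows run at once.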

slochower commented 2 years ago

Yes, interesting. Agree with both of you. @jaketanderson are you asking specifically for during the simulation phase or the analysis phase?

jaketanderson commented 2 years ago

> Yes, interesting. Agree with both of you. @jaketanderson are you asking specifically for during the simulation phase or the analysis phase?

I only had in mind the simulation phase. Isn't analysis very short compared to simulation? I'm not really sure it would be useful if analysis takes only a few minutes, but if the libraries Jeff suggested are eventually implemented then I guess analysis progress bars could be added without headache.

jeff231li commented 2 years ago

> I only had in mind the simulation phase. Isn't analysis very short compared to simulation? I'm not really sure it would be useful if analysis takes only a few minutes, but if the libraries Jeff suggested are eventually implemented then I guess analysis progress bars could be added without headache.

The analysis can take a few hours if you are analyzing large trajectories with explicit solvent (somewhere around 30-50 ns per window, saving snapshots every 1-5 ps).

j-wags commented 2 years ago

I'm a big fan of tqdm for simple progress bars that show progress of a for loop (though this may not apply to most of pAPRika's work).

import time
import tqdm

# Wrap any iterable in tqdm.tqdm to get a progress bar for the loop
for i in tqdm.tqdm(range(10)):
    time.sleep(1)

# Output:
100%|███████████████████████████████████████████████| 10/10 [00:10<00:00,  1.00s/it]

A newer OpenFF program, bespokefit, uses rich.progress.track for progress bars, though this also seems to just wrap a for loop. Bespokefit has really beautiful and informative command-line output while it runs. You might consider cribbing other tricks from it too :-) https://openff-bespokefit.readthedocs.io/en/latest/getting-started/quick-start.html

slochower commented 2 years ago

Great points. I think it would not be too difficult to add progress bars to analysis following what bespokefit does (thanks @j-wags) or tqdm. (Would you be up for trying something @jaketanderson ?) For the simulation side of things... seems tricky to me. We'd have to poll to see how many jobs are finished or how many frames are written so far, I think. Maybe we could make it work for one queuing system but probably hard to be portable. I'm certainly open to any ideas or PRs.
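A minimal sketch of the polling idea, assuming each window has its own directory and that the workflow writes some file last when a window finishes. The marker name `production.done` and the functions `count_finished` and `poll` are hypothetical, not pAPRika conventions. Checking the filesystem rather than the scheduler avoids tying this to one queuing system, though it only works if such a marker exists.

```python
import time
from pathlib import Path


def count_finished(windows_dir, marker="production.done"):
    """Return (finished, total) for the window directories under windows_dir.

    A window counts as finished when it contains the marker file. The marker
    name is hypothetical; substitute whatever your workflow writes last.
    """
    windows = sorted(p for p in Path(windows_dir).iterdir() if p.is_dir())
    finished = sum((w / marker).exists() for w in windows)
    return finished, len(windows)


def poll(windows_dir, interval=60.0, marker="production.done"):
    """Print progress every `interval` seconds until every window has finished."""
    while True:
        finished, total = count_finished(windows_dir, marker)
        print(f"{finished}/{total} windows finished")
        if finished >= total:
            return finished, total
        time.sleep(interval)
```

Calling `poll("windows", interval=60)` would block and print a line once a minute. Counting frames written so far would need a trajectory reader and is not shown here.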

Nice examples here under "Progress Bars": https://github.com/Textualize/rich

jaketanderson commented 2 years ago

> (Would you be up for trying something @jaketanderson ?)

Sure, I'm on spring break right now so I'll mess around with it. Fair warning that I am somewhat slow at programming, so if anyone beats me to an implementation feel free to make a PR from yours :)