Faster IDPP - Githubissues

duartegroup / autodE

automated reaction profile generation

https://duartegroup.github.io/autodE/

MIT License

173 stars 52 forks source link

Faster IDPP #302

Closed shoubhikraj closed 1 year ago

shoubhikraj commented 1 year ago

Remove parallelisation from IDPP. The overhead from creating new processes is larger than any gains from parallelisation.

Test for diels-alder with 200 IDPP images

MacOS - with parallelisation = 4.621 s, without parallelisation = 2.403 s
Windows - with parallelisation = 10.26 s, without parallelisation = 2.331 s

Checklist

[ ] The changes include an associated explanation of how/why
[ ] Test pass
[ ] Documentation has been updated
[ ] Changelog has been updated

codecov[bot] commented 1 year ago

Codecov Report

Merging #302 (9f66730) into v1.4.1 (db7b03d) will increase coverage by 0.04%. Report is 3 commits behind head on v1.4.1. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           v1.4.1     #302      +/-   ##
==========================================
+ Coverage   97.33%   97.37%   +0.04%     
==========================================
  Files         206      206              
  Lines       22957    23048      +91     
==========================================
+ Hits        22345    22443      +98     
+ Misses        612      605       -7

Flag	Coverage Δ
unittests	`97.37% <100.00%> (+0.04%)`	:arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files	Coverage Δ
autode/neb/original.py	`97.53% <100.00%> (+1.43%)`	:arrow_up:

... and 145 files with indirect coverage changes

t-young31 commented 1 year ago

I'm not convinced this gain is worth the additional code complexity. When a DFT calculation is O(minutes-hours) I'm not sure it makes sense to save O(seconds). Please convince me otherwise!

shoubhikraj commented 1 year ago

@t-young31 That is true, but now the IDPP is being used for i-EIP (and I am writing code for freezing-string method where IDPP is called every iteration). The problem is more obvious on windows (updated timings now) where processes are created, so the overhead is even higher. For POSIX, forking is cheaper so the overhead is less. I honestly don't think it increases the complexity of code that much. It just moves one part of the code outside parallel loop. What do you think?

t-young31 commented 1 year ago

Agree – for testing speed and if called every iteration then it's worth the small amount more complexity