ReliaSolve / cctbx_project

Computational Crystallography Toolbox
https://cctbx.github.io

Check probe2 time running on 3kz4 and on Tom's input file. #276

Closed by russell-taylor 9 months ago

russell-taylor commented 9 months ago

Check probe2 run time on larger input files. 3kz4 is one; the file that Tom gave us, which caused the original Reduce to crash, is an even larger one built on it. 3kz4 is taking many minutes. Run on the Reduce2 output file so that the input will have hydrogens.

russell-taylor commented 9 months ago

(done) The bottleneck is in the N^2 loops starting at line 1989, where we check whether the source and target atoms are in the current atom list. Each of those checks is a linear search through the list, so the cost grows quickly with structure size. We can speed this up by building a list or dictionary that records whether each atom is available, perhaps catching the exception raised when we look up an atom that is not in the structure.
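A minimal sketch of the dictionary idea, under stated assumptions: atoms are stood in for by plain integer ids, and `build_availability` and `is_available` are hypothetical names for illustration, not functions from probe2.

```python
# Hypothetical stand-ins: atoms are represented by integer ids here,
# not by the real atom objects used in probe2.

def build_availability(all_atom_ids, current_atom_ids):
    """Map every atom id in the structure to True/False:
    is it in the current atom list?"""
    current = set(current_atom_ids)
    return {a: (a in current) for a in all_atom_ids}

def is_available(availability, atom_id):
    """Constant-time lookup. An id that is not in the structure
    raises KeyError, which we treat as 'not available'."""
    try:
        return availability[atom_id]
    except KeyError:
        return False

availability = build_availability(range(6), [1, 3, 4])
print(is_available(availability, 3))   # in the current list
print(is_available(availability, 2))   # in the structure, not in the current list
print(is_available(availability, 99))  # not in the structure at all
```

The table is built once, outside the nested loops, so each check inside them is a single dictionary lookup rather than a scan of the list.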

Converting the list of atoms into a set, which makes the "in" test faster, reduced the total run time from >50 minutes to >4 minutes (with essentially no time now spent in the bottleneck), and the summary results are the same.
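The change can be illustrated with a small sketch. This is not the probe2 code: the function names and the integer "atoms" are hypothetical, and it assumes the atom objects are hashable so that they can be placed in a set.

```python
def count_pairs_list(sources, targets, current_atoms):
    """N^2 loop with membership tests against a list.
    Each 'in' is a linear scan of current_atoms."""
    count = 0
    for s in sources:
        for t in targets:
            if s in current_atoms and t in current_atoms:
                count += 1
    return count

def count_pairs_set(sources, targets, current_atoms):
    """Same loop, but the list is converted to a set once up front,
    so each 'in' is an average constant-time hash lookup."""
    current = set(current_atoms)
    count = 0
    for s in sources:
        for t in targets:
            if s in current and t in current:
                count += 1
    return count

# Hypothetical data: integers stand in for atoms.
sources = list(range(10))
targets = list(range(5, 15))
current_atoms = [0, 2, 4, 6, 8, 10, 12]

# Both versions must agree, mirroring the "summary results being the same" check.
print(count_pairs_list(sources, targets, current_atoms) ==
      count_pairs_set(sources, targets, current_atoms))
```

The loop is still N^2 in the number of source/target pairs; only the per-pair membership test gets cheaper, which is consistent with the bottleneck taking essentially no time afterwards.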