This is THE main PR I have been working on. I deleted the version-control switch, so the code now only uses the most recent version. I rebased onto the most recent main and onto my cleanup PR (which deletes unused variables), but I may have made a mistake along the way -- please let me know if you see anything out of order.
Here is the list of main changes:
`compute_moments` is ported to Rust. As discussed with Rosy, the Rust version takes the inverse and determinant computed by numpy for numerical stability reasons. This increases the runtime significantly compared to a pure-Rust version (4-10 times slower), but it tracks the original Python version much more closely (maximum observed relative error was ~5e-9). Even so, the "slower" version is still 6-45 times faster than the original implementation on my local machine. I will look into the determinant computation to find the source of the error, and I will run scripts on CHTC to get more objective runtime comparisons.
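As a rough sketch, the hybrid call pattern looks something like the following. The `compute_moments_rs` binding name and signature are assumptions for illustration; only the numpy side is shown concretely:

```python
import numpy as np

def compute_moments_hybrid(A, center, max_deg):
    """Sketch of the hybrid approach: numpy supplies the inverse and
    determinant for numerical stability; the Rust extension would then
    perform the moment recursion."""
    A_inv = np.linalg.inv(A)   # inverse computed by numpy
    det_A = np.linalg.det(A)   # determinant computed by numpy
    # The real Rust binding (name assumed) would be called roughly as:
    # return compute_moments_rs(A_inv, det_A, center, max_deg)
    return A_inv, det_A        # placeholder return for this sketch

A = np.array([[2.0, 0.5, 0.0],
              [0.5, 3.0, 0.1],
              [0.0, 0.1, 1.5]])
A_inv, det_A = compute_moments_hybrid(A, np.zeros(3), 4)
```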
`ClebschGordanReal` now uses the new `CGRCacheList`. It uses the CLOCK algorithm (originally used for page replacement) to store the CG matrices of the 5 "most recently used" `l_max` values [*]. This has the unfortunate(?) side effect of limiting `l_max` to 2^31 - 1, but that should not be a problem, as we expect `l_max` to be much smaller than the maximum value. On my local machine, it divides the number of constructor calls by the number of iterations run with the same `l_max`. For example, in a local test that performs 5 iterations with an `l_max` of 10, the number of constructor calls went from 3355 to 671 -- a decrease by a factor of 5 -- and the constructor-related runtime drops by the same factor. Note that the cache currently has a capacity of 5. This is an arbitrary choice and can be increased later, but doing so may increase memory usage, especially when matrices for multiple large `l_max` values are cached. I will run tests on CHTC to measure the average runtime improvement from caching.
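For reference, the CLOCK policy can be sketched as a small standalone cache like the one below. This is a toy version to illustrate the eviction behavior, not the actual `CGRCacheList` interface:

```python
class ClockCache:
    """Toy CLOCK cache keyed by l_max (interface assumed for illustration)."""

    def __init__(self, capacity=5):
        self.capacity = capacity
        self.slots = []   # each slot: [key, value, ref_bit]
        self.hand = 0     # clock hand position

    def get(self, key):
        for slot in self.slots:
            if slot[0] == key:
                slot[2] = 1        # mark as recently used
                return slot[1]
        return None

    def put(self, key, value):
        if self.get(key) is not None:
            return                 # already cached
        if len(self.slots) < self.capacity:
            self.slots.append([key, value, 1])
            return
        # CLOCK eviction: sweep forward, clearing reference bits,
        # until a slot with ref_bit == 0 is found; replace that slot.
        while self.slots[self.hand][2] == 1:
            self.slots[self.hand][2] = 0
            self.hand = (self.hand + 1) % self.capacity
        self.slots[self.hand] = [key, value, 1]
        self.hand = (self.hand + 1) % self.capacity

cache = ClockCache(capacity=5)
for l_max in range(5):
    cache.put(l_max, f"CG matrices for l_max={l_max}")
cache.put(5, "cg5")   # cache full: evicts via the clock sweep
cache.get(1)          # marks l_max=1 as recently used
cache.put(6, "cg6")   # evicts an unreferenced entry, keeping l_max=1
```

Because a full sweep clears every reference bit, an entry touched just before the sweep can still be evicted, which is exactly the inexactness mentioned in the footnote.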
The looping structure inside `pairwise_ellip_expansion` has changed. Previously, it looped through every possible pair of species and checked whether the pair was in `neighbor_list.keys`. Now it loops directly over the entries of `neighbor_list.keys`, so `pairwise_ellip_expansion` no longer requires `species` as one of its arguments.
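The change in loop structure can be illustrated with a toy `neighbor_list`. A plain dict stands in for the real structure here; the actual key type may differ:

```python
from itertools import product

species = [1, 6, 8]                                   # toy atomic numbers
neighbor_list = {(1, 6): "pairs", (6, 8): "pairs"}    # toy stand-in

# Old approach: enumerate every possible species pair, then filter
# on membership in the neighbor list.
old_pairs = [(a, b) for a, b in product(species, repeat=2)
             if (a, b) in neighbor_list]

# New approach: iterate the keys directly; `species` is no longer needed.
new_pairs = list(neighbor_list)
```

Both loops visit the same pairs, but the new one skips the membership checks for absent pairs.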
I have added a new `test_compute_moments.py` to compare moment generation between the Rust and Python versions. There are two tests: one with fixed parameters and one randomized. As mentioned above, the maximum error observed so far was ~5e-9, so both tests use `assert numpy.allclose(x, y, 1e-8)` (the third positional argument is `rtol`) to check closeness. I have run several iterations, and it has not failed so far.
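The randomized test follows roughly this shape. Since the real bindings are not shown in this description, two independent numpy routes to the same quantity stand in for the Rust and Python implementations:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    M = rng.standard_normal((4, 4))
    A = M @ M.T + 4.0 * np.eye(4)          # random symmetric positive-definite
    ref = np.linalg.det(A)                 # stands in for the Python version
    alt = np.prod(np.linalg.eigvalsh(A))   # stands in for the Rust version
    # np.allclose(x, y, 1e-8): the third positional argument is rtol
    assert np.allclose(ref, alt, 1e-8)
```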
I know it is a big PR, so let me know if you have any questions and/or concerns.
[*]: If you look up the CLOCK algorithm, you will find that it mimics the "least recently used" replacement policy. However, the approximation is not exact, and it can occasionally evict one of the more recently used entries.