manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License

Reduce gridlink memory footprint #186

Closed. lgarrison closed this 5 years ago

lgarrison commented 5 years ago

Gridlink has a rather large memory footprint after the min_sep_opt changes. One way to reduce this is by slimming down the cell_pair struct to hold only pair info, and not cell info (#185). But for autocorrelations, there's another factor-of-2 savings possible from the fact that we're presently allocating enough memory to double-count every cell pair, even though we only count each pair once. This PR implements that memory saving for autocorrelations.

@manodeep: I think it should be obvious pretty quickly if I messed up the max_num_cell_pairs calculation, since we have an explicit num_cell_pairs < max_num_cell_pairs check. Let's see what Travis says (local tests are passing).

Also feel free to tack on the cell_pair slimming optimizations to this PR.
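
For readers following along, the factor-of-2 saving for autocorrelations can be illustrated with a minimal Python sketch. This is not Corrfunc's gridlink code: the function names, the cubic periodic grid, and the `bin_refine` neighbour range are all assumptions made for illustration. It enumerates neighbouring cell pairs by brute force and compares the count against an allocation bound, in the spirit of the explicit check on the number of cell pairs mentioned above.

```python
from itertools import product


def count_cell_pairs(ncells_per_dim, bin_refine, autocorr):
    """Brute-force count of neighbouring cell pairs on a periodic cubic grid.

    Hypothetical helper for illustration only. Each cell is paired with every
    cell within +/- bin_refine cells along each axis (itself included).
    Assumes ncells_per_dim >= 2 * bin_refine + 1 so neighbours are distinct.
    """
    n = ncells_per_dim
    offsets = range(-bin_refine, bin_refine + 1)

    def flat_index(ix, iy, iz):
        return (ix * n + iy) * n + iz

    num_pairs = 0
    for ix, iy, iz in product(range(n), repeat=3):
        this_cell = flat_index(ix, iy, iz)
        for dx, dy, dz in product(offsets, repeat=3):
            other_cell = flat_index((ix + dx) % n, (iy + dy) % n, (iz + dz) % n)
            # For an autocorrelation, (a, b) and (b, a) are the same pair,
            # so keep only the ordering with other_cell >= this_cell.
            if autocorr and other_cell < this_cell:
                continue
            num_pairs += 1
    return num_pairs


def max_cell_pairs(ncells_per_dim, bin_refine, autocorr):
    """Upper bound on the number of cell pairs to allocate (illustrative)."""
    ncells = ncells_per_dim ** 3
    nneighbours = (2 * bin_refine + 1) ** 3  # includes the cell itself
    if autocorr:
        # Self pairs are kept once; every other pair is shared by two cells.
        return ncells * (nneighbours + 1) // 2
    return ncells * nneighbours


if __name__ == "__main__":
    for autocorr in (False, True):
        counted = count_cell_pairs(4, 1, autocorr)
        bound = max_cell_pairs(4, 1, autocorr)
        assert counted <= bound
        print(f"autocorr={autocorr}: counted={counted}, allocated={bound}")
```

In this toy setup the autocorrelation bound is roughly half the cross-correlation bound, and the brute-force count never exceeds it, which is the property a bounds check on the pair count would catch if the calculation were wrong.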

manodeep commented 5 years ago

@lgarrison Thanks for that version bump - I thought I was missing one version update somewhere.

I am running INTEGRATION_TESTS on the local supercomputer, which should take about 48 hours to complete. Assuming those pass, we can merge the PR.

manodeep commented 5 years ago

@lgarrison The INTEGRATION_TESTS have passed. Should we just merge this PR and make a new release?

lgarrison commented 5 years ago

Yes, let's do it!

manodeep commented 5 years ago

Released on both GitHub and PyPI. Should we delete the gridlink-memory.. branch?

lgarrison commented 5 years ago

Yep! I just left it in case you had something left to add.