manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
163 stars 50 forks source link

AVX512F kernels, min separation optimizations, not duplicating particle positions, array of cell-pairs #173

Closed manodeep closed 5 years ago

manodeep commented 5 years ago

Also contains the AVX512F kernels.

Needs to be verified by valgrind before merging.

astropy-bot[bot] commented 5 years ago

Hi there @manodeep :wave: - thanks for the pull request! I'm just a friendly :robot: that checks for issues related to the changelog and making sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part :smiley:.

Everything looks good from my point of view! :+1:

If there are any issues with this message, please report them here.

lgarrison commented 5 years ago

Travis is passing tests, but integration tests reveal that the min-sep optimizations are dropping a particle when linking in RA.

manodeep commented 5 years ago

@lgarrison Thanks for the fix on the weights. I can finally now debug the DDtheta min-sep-opt

pep8speaks commented 5 years ago

Hello @manodeep! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 33:19: E231 missing whitespace after ',' Line 86:37: E127 continuation line over-indented for visual indent Line 102:29: E128 continuation line under-indented for visual indent

Line 32:19: E231 missing whitespace after ',' Line 53:80: E501 line too long (80 > 79 characters)

Line 47:25: W605 invalid escape sequence '\p' Line 207:80: E501 line too long (80 > 79 characters) Line 232:26: W605 invalid escape sequence '\p' Line 344:80: E501 line too long (93 > 79 characters) Line 349:80: E501 line too long (88 > 79 characters)

Line 234:62: W605 invalid escape sequence '\m' Line 251:80: E501 line too long (84 > 79 characters) Line 277:80: E501 line too long (93 > 79 characters) Line 282:80: E501 line too long (88 > 79 characters)

Line 292:80: E501 line too long (93 > 79 characters)

Line 213:80: E501 line too long (91 > 79 characters) Line 218:80: E501 line too long (80 > 79 characters)

Line 35:47: W605 invalid escape sequence '\p' Line 38:26: W605 invalid escape sequence '\p' Line 163:74: W605 invalid escape sequence '\p' Line 264:80: E501 line too long (91 > 79 characters) Line 269:80: E501 line too long (80 > 79 characters)

Line 36:45: W605 invalid escape sequence '\m' Line 38:38: W605 invalid escape sequence '\m' Line 280:80: E501 line too long (91 > 79 characters) Line 285:80: E501 line too long (80 > 79 characters)

Line 488:80: E501 line too long (84 > 79 characters)

Line 199:80: E501 line too long (84 > 79 characters)

Line 916:80: E501 line too long (92 > 79 characters) Line 916:92: E502 the backslash is redundant between brackets Line 917:23: E128 continuation line under-indented for visual indent Line 976:80: E501 line too long (83 > 79 characters) Line 986:15: E231 missing whitespace after ',' Line 1014:80: E501 line too long (122 > 79 characters) Line 1017:46: E231 missing whitespace after ',' Line 1020:46: E231 missing whitespace after ',' Line 1054:5: E722 do not use bare 'except'

Line 69:32: E231 missing whitespace after ','

Line 372:1: W293 blank line contains whitespace Line 372:1: W391 blank line at end of file

Line 559:1: W293 blank line contains whitespace Line 559:1: W391 blank line at end of file

Line 450:5: E265 block comment should start with '# ' Line 451:5: E265 block comment should start with '# ' Line 452:5: E265 block comment should start with '# ' Line 454:5: E265 block comment should start with '# ' Line 484:80: E501 line too long (111 > 79 characters) Line 487:80: E501 line too long (110 > 79 characters) Line 533:80: E501 line too long (82 > 79 characters) Line 534:80: E501 line too long (86 > 79 characters) Line 535:80: E501 line too long (85 > 79 characters) Line 543:1: W293 blank line contains whitespace Line 544:80: E501 line too long (83 > 79 characters) Line 547:80: E501 line too long (84 > 79 characters) Line 549:1: W293 blank line contains whitespace Line 551:1: W293 blank line contains whitespace Line 591:28: E231 missing whitespace after ',' Line 595:33: E231 missing whitespace after ',' Line 599:29: E231 missing whitespace after ':' Line 599:46: E231 missing whitespace after ':' Line 599:61: E231 missing whitespace after ':' Line 599:80: E501 line too long (94 > 79 characters) Line 599:83: E231 missing whitespace after ':' Line 608:1: W293 blank line contains whitespace Line 627:80: E501 line too long (85 > 79 characters) Line 629:80: E501 line too long (86 > 79 characters) Line 630:1: W293 blank line contains whitespace Line 632:1: W293 blank line contains whitespace Line 636:80: E501 line too long (110 > 79 characters)

Line 158:80: E501 line too long (92 > 79 characters) Line 179:80: E501 line too long (96 > 79 characters) Line 200:80: E501 line too long (92 > 79 characters) Line 220:80: E501 line too long (92 > 79 characters) Line 424:5: E265 block comment should start with '# ' Line 425:5: E265 block comment should start with '# ' Line 426:5: E265 block comment should start with '# ' Line 428:5: E265 block comment should start with '# ' Line 455:80: E501 line too long (108 > 79 characters) Line 458:80: E501 line too long (107 > 79 characters) Line 505:80: E501 line too long (82 > 79 characters) Line 506:80: E501 line too long (86 > 79 characters) Line 507:80: E501 line too long (85 > 79 characters) Line 508:29: E117 over-indented (comment) Line 515:1: W293 blank line contains whitespace Line 516:80: E501 line too long (83 > 79 characters) Line 519:80: E501 line too long (84 > 79 characters) Line 533:16: E225 missing whitespace around operator Line 562:33: E231 missing whitespace after ',' Line 564:34: E231 missing whitespace after ',' Line 568:30: E231 missing whitespace after ',' Line 569:30: E231 missing whitespace after ',' Line 571:33: E231 missing whitespace after ',' Line 573:34: E231 missing whitespace after ',' Line 576:29: E231 missing whitespace after ':' Line 576:46: E231 missing whitespace after ':' Line 576:61: E231 missing whitespace after ':' Line 576:80: E501 line too long (94 > 79 characters) Line 576:83: E231 missing whitespace after ':' Line 596:5: E303 too many blank lines (2) Line 602:80: E501 line too long (98 > 79 characters) Line 607:80: E501 line too long (104 > 79 characters) Line 613:80: E501 line too long (99 > 79 characters) Line 614:1: W293 blank line contains whitespace Line 618:1: W293 blank line contains whitespace Line 622:80: E501 line too long (110 > 79 characters)

Line 60:20: E128 continuation line under-indented for visual indent Line 171:80: E501 line too long (84 > 79 characters) Line 306:5: E303 too many blank lines (2) Line 306:36: E231 missing whitespace after ','

Line 16:36: E231 missing whitespace after ','

Line 194:19: E231 missing whitespace after ','

Comment last updated at 2019-05-20 23:06:21 UTC
manodeep commented 5 years ago

@lgarrison I added in the option for not duplicating the particle positions, and for creating a giant array of cell-pairs for the theory pair-counters. Tests passed on my laptop, not valgrind-ed properly, or checked with INTEGRATION_TESTS

lgarrison commented 5 years ago

Cool, thanks! I think we should consider how we want to deal with copying the weights, too. In any situation where it's acceptable to reorder the particle positions, it's probably acceptable to reorder the weights too. So instead of copy_particle_positions, maybe it should just be called copy_particles (since it could be a position copy or a position and weights copy), or even just inplace.

manodeep commented 5 years ago

@lgarrison I was toying with having inplace but then I was having trouble naming the reorder option. I am totally fine with copy_particles (or duplicate_particles...)

lgarrison commented 5 years ago

If adding two parameters is annoying, we could do inplace='allow_reorder' (reorder and don't restore) inplace='keep_order' (reorder but restore), and inplace=False (make a copy).

We could also alias True to one of the reorder options, but I'm not sure which one we think would be more common/useful.

This may not be better than having two parameters, but it might be slightly less verbose...

manodeep commented 5 years ago

As you suggested, I renamed the copy_particle_positions to copy_particles. I think we want to keep the reorder separate, since someone might be re-using the same dataset. By default, we should always shuffle back to the input order.

Unfortunately, if someone selects reorder=False, then the input array for the next iteration is fully sorted, and that means the quicksort within gridlink will become O(N^2). Ideally, we would have an sglib like implementation of timsort but I have not yet encountered any such open-source implentation

manodeep commented 5 years ago

@lgarrison I removed the reorder option entirely. That was causing too much of a conceptual load, plus the possibility of incorrect use. As it stands, if the inplace option has been used, everything will get reordered at the end.

Since the "INTEGRATION_TESTS" get terminated by the OS (both OSX and linux) before running to completion, there is almost certainly some issue (possibly memory leak or out of bounds access) that's remains. I am going to investigate a bit further.

In the mean-time, if you could do a code-review of this PR, that would be fantastic. Particularly, from the prospect of causing memory-leaks (from copy_particles) and from usability (should copy_particles be renamed to inplace, with the defaults changed appropriately)

lgarrison commented 5 years ago

Sure! I'm handing in my thesis next week; should get to it right after that.

On Wed, Apr 17, 2019 at 6:40 PM Manodeep Sinha notifications@github.com wrote:

@lgarrison https://github.com/lgarrison I removed the reorder option entirely. That was causing too much of a conceptual load, plus the possibility of incorrect use. As it stands, if the inplace option has been used, everything will get reordered at the end.

Since the "INTEGRATION_TESTS" get terminated by the OS (both OSX and linux) before running to completion, there is almost certainly some issue (possibly memory leak or out of bounds access) that's remains. I am going to investigate a bit further.

In the mean-time, if you could do a code-review of this PR, that would be fantastic. Particularly, from the prospect of causing memory-leaks (from copy_particles) and from usability (should copy_particles be renamed to inplace, with the defaults changed appropriately)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/manodeep/Corrfunc/pull/173#issuecomment-484288000, or mute the thread https://github.com/notifications/unsubscribe-auth/ABLA7S7MRDV6JAODGRZGKULPQ6RPVANCNFSM4GFA4Q5A .

manodeep commented 5 years ago

@pep8speaks suggest diff

pep8speaks commented 5 years ago

Here you go with the gist !

You can ask me to create a PR against this branch with those fixes. Simply comment @pep8speaks pep8ify.

@manodeep

manodeep commented 5 years ago

@lgarrison (Some?) Memory leaks are now fixed. Running integration tests with the -fsanitize options enabled - hopefully anything else will come up. Added these sanitize options to gcc7.3.0 -fsanitize=leak -fsanitize=undefined -fsanitize=address -fsanitize-address-use-after-scope -fsanitize-undefined-trap-on-error -fstack-protector-all - works great. And much much faster than valgrind.

manodeep commented 5 years ago

@pep8speaks pep8ify

lgarrison commented 5 years ago

@manodeep Can I bump the Travis Numpy version to 1.16 to fix the doctest failure? I think it's a print formatting issue related to this: https://github.com/numpy/numpy/blob/master/doc/release/1.14.0-notes.rst#many-changes-to-array-printing-disableable-with-the-new-legacy-printing-mode

I changed the doctest output to match what I get on my local machine, hence the Travis failure.

Also, any idea why the astropy-bot is unhappy about the changelog? The PR number is in there...

manodeep commented 5 years ago

@lgarrison numpy 1.16 is quite the latest - our numpy dependency is really quite minimal and, if we can, we should be fine with a reasonable version of numpy. I think the current requirement is numpy >= 1.7. Is this simply a white-space issue? Could we change the print_options?

For the astropy-bot failing the test, I am not really sure. May be it's because multiple issues are being addressed in the same PR? @bsipocz, if you have a spare moment, will you please take a look?

lgarrison commented 5 years ago

The current Travis Numpy version is already different from the minimum Corrfunc requirement. We're at 1.12 on Travis but only require 1.7 for Corrfunc. If we want to maintain backwards compatibility for the doctests I'm okay with that, but it does make the examples a fair bit more verbose (cluttered with Numpy print settings). So I'd lean towards just bumping the Travis version past 1.13.

manodeep commented 5 years ago

@lgarrison It's a good point about the travis numpy testing that we are doing. And in some sense, Corrfunc is more likely to break if numpy drops backward compatibility or something. How many tests are affected by this whitespace printing issue? Basically, how much less fluffy code would be required if we bump the travis tests to numpy 1.16? Another option would be to disable doctests with the older version of numpy...

Thanks again for reading through - this PR definitely far too big! I am only mildly confident about nothing too crafty of a bug hiding in this PR because of the -fsanitize+ options with INTEGRATION_TESTS passing.

bsipocz commented 5 years ago

@manodeep - Not sure about the bot. It may be just that something went wrong with the trigger. You could try to retrigger, e.g. add/remove a label or milestone, etc.

bsipocz commented 5 years ago

Also, on the longer term I would suggest to change the travis settings as now it's triggered both as a pr and as a push. I basically just needs a couple of lines of extra logic in the config file. If you open a reminder issue and assign it to me I'll try to get back to it once I'm back in the US from travelling.

lgarrison commented 5 years ago

@manodeep I think it's only one test right now, so not too bad if we want to roll the backwards-compatible solution. I'll push that and see how it works.

@bsipocz Thanks for the tip on eliminating the extra Travis build! I've edited the config file to only build PRs and the master branch (following https://stackoverflow.com/a/31882307). So commits on ordinary branches will not get built; we'll have to open a PR to trigger that. But that may be closer to what we want anyway; when a feature or fix is ready for CI, it's probably mature enough for a PR.

The bot still remains stubborn even after editing the milestones/labels.

manodeep commented 5 years ago

@pllim said that the astropy-changelog bot could be getting tricked because of the multiple entries to the same PR. The full logic is here

Which means we would have to merge the descriptions of the new features into one change (at least what I understood was that reference to one specific PR is one entry in the changes)

manodeep commented 5 years ago

Sadly, my update to the changelog did not fix the bot issue

manodeep commented 5 years ago

@lgarrison If we can't figure out the changelog issue, may be we can skip formally having that check for this v2.3 release. Clearly there are entries in the changelog showing what has been done. What do you think?

lgarrison commented 5 years ago

Success! I used astropy-changelog to directly parse our changelog, and it spit out a few warnings that I tracked down to a few small formatting issues:

I think the last item may have been the killer.

The RST parser wouldn't spit out line numbers for the warnings, which made it much harder to interpret. If anyone knows how to enable this, it could be helpful for the future!

lgarrison commented 5 years ago

@manodeep Travis has passed on the last non-changelog commit, so I think we're ready to merge!

https://travis-ci.org/manodeep/Corrfunc/builds/535043527

lgarrison commented 5 years ago

Just noting that this PR has passed the integration tests on my AVX-512 machine (with gcc's AddressSanitizer).