Closed: adematti closed this issue 6 years ago.
I can answer the second question now. Yes, you are. The bunch size is small, but the process
function will be called multiple times -- once we have seen enough pairs to fill the buffer for the bunch, the process
function is called once. This reduces the Python function call overhead and allows us to write vectorized numpy operations over many pairs.
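The bunching pattern described above can be sketched as follows. This is a toy illustration, not the actual kdcount/nbodykit API; `enum_bunched`, `pairs`, and `process` are hypothetical names:

```python
import numpy as np

def enum_bunched(pairs, bunch, process):
    """Toy sketch: accumulate pair indices into a fixed-size buffer and
    call `process` once per full bunch, so the callback can operate on
    many pairs at once with vectorized numpy operations."""
    buf_i = np.empty(bunch, dtype=np.intp)
    buf_j = np.empty(bunch, dtype=np.intp)
    n = 0
    for i, j in pairs:
        buf_i[n], buf_j[n] = i, j
        n += 1
        if n == bunch:  # buffer full: one Python call covers `bunch` pairs
            process(buf_i[:n], buf_j[:n])
            n = 0
    if n > 0:           # flush whatever remains in the last, partial bunch
        process(buf_i[:n], buf_j[:n])
```

With 10 pairs and `bunch=4`, `process` is called three times (4 + 4 + 2 pairs) instead of ten, which is where the Python-overhead saving comes from.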
The first question I will have to look into.
I agree with you that the code looks buggy; `l` is not the correct index. Also, the ordering of the return value of `__call__`
of YlmCache is undefined, so it may return results in inconsistent sequences on different ranks and give wrong results when run in parallel. It may also cause problems when decoding the results back to `l` values.
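One way to remove the ordering ambiguity is to iterate the cache keys in a fixed sorted order rather than relying on dict/set iteration order. A toy sketch (hypothetical class and names, not the actual YlmCache implementation):

```python
class YlmCacheSketch:
    """Toy cache keyed by (l, m). __call__ evaluates `compute` for every
    key in sorted (l, m) order, so the output sequence is identical on
    every MPI rank and each entry can be decoded back to its l value."""

    def __init__(self, lmax):
        # explicit key list; never iterate an unordered container here
        self.keys = [(l, m) for l in range(lmax + 1)
                            for m in range(-l, l + 1)]

    def __call__(self, compute):
        # sorted() guarantees a deterministic, rank-independent sequence
        return [(l, m, compute(l, m)) for (l, m) in sorted(self.keys)]
```

Because each returned tuple carries its own `(l, m)`, downstream code no longer has to infer `l` from the position in the sequence.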
Do you want to file a PR to fix these? Adding @nickhand to this too.
Hi,
Sorry I was away last week (and will not be very available this week either).
2) I get that the process function can be called several times for each primary (until all secondaries are treated); that is a clever way to proceed. However, I argue that the process function cannot be vectorized along the secondaries here, because the product of the alms is taken at l. 99 of threeptcf.py (just as (x1+x2+...)(y1+y2+...) != x1*y1 + x2*y2 + ... in general). If I set e.g. tree_sec.enum(tree_prim, rmax, process=callback, iprim=iprim, bunch=100) (instead of bunch=100000) at l. 111, then the test test_threeptcf.py::test_sim_threeptcf fails. To deal with this, one should probably sum all the alms obtained for the different chunks of secondaries and take the alm product at the very end.
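The distributive-law point above can be checked with plain numbers. This is a deliberately minimal illustration (toy arrays standing in for alm contributions, not the actual threeptcf.py code):

```python
import numpy as np

# toy "alm" contributions from 4 secondaries, split into 2 chunks
alm1 = np.array([1.0, 2.0, 3.0, 4.0])
alm2 = np.array([5.0, 6.0, 7.0, 8.0])
chunks = [slice(0, 2), slice(2, 4)]

# wrong: take the alm product within each chunk, then sum over chunks
wrong = sum(alm1[c].sum() * alm2[c].sum() for c in chunks)  # -> 138.0

# right: sum the alms over all chunks first, multiply once at the end
right = alm1.sum() * alm2.sum()                             # -> 260.0
```

The two disagree whenever the secondaries are split across more than one chunk, which is exactly why a small `bunch` breaks the test while `bunch=100000` (everything in one chunk) happens to pass.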
Concerning the PR: thanks for offering! Though I believe it would be faster for you to change the code yourself.
Cheers
I see what you mean; line 99 indeed doesn't look right. I will file a PR with these fixes.
@nickhand could you comment on how the current threeptcf_sim_result file is obtained? I only see that you updated it in d5dcbcccb955. After implementing adematti's suggested fix, I am getting a very different result.
I agree that both of these issues are bugs, but shouldn't the result of the test case be unaffected? I had thought the fix should produce the same result with bunchsize=100 as the current code produces with bunchsize=100000.
The test result is produced from Daniel Eisenstein's C++ version of the algorithm, which should be correct. I am sending that code to @rainwoodman right now.
Thanks! I added a few notes about the source of the data files in the test case. It turns out the disagreement was because I hadn't coded it up correctly.
Fixed in #516.
Hi,
Wanting to implement the anisotropic 3pt correlation function of Slepian et al. 2017 (arXiv:1709.10150v1), I started from your code. I noticed two things:
Thank you for your code,
Cheers