Unexplainable NFW sampling discrepancy

astropy / halotools

Python package for studying large scale structure, cosmology, and galaxy evolution using N-body simulations and halo models

http://halotools.rtfd.org


Unexplainable NFW sampling discrepancy #1023

Open EiffL opened 3 years ago

EiffL commented 3 years ago

Hello hello, while writing our own HOD code with @bhorowitz and others we were trying to compare our results to halotools as God's truth, but were finding some residual differences in the 2pt functions that I can't explain.

I have isolated at least one potential thing that could explain our discrepancies, related to NFW sampling of the satellites. As far as I understand, I seem to be getting inconsistent results within halotools itself, depending on whether I sample positions manually with mc_generate_nfw_radial_positions or sample satellites for the catalog normally.

I made a minimal demo notebook here, and I'm getting this sort of discrepancy in the radial distribution of satellites:

And this amount of difference actually can explain away the differences I'm seeing downstream in the 2pt function.

EiffL commented 3 years ago

If someone could look at the demo notebook and explain to me why the two histograms don't match, I would be sooooo happy.

aphearin commented 3 years ago

we were trying to compare our results to halotools as God's truth

I just checked the Contributor List, but I don't see anybody with this name on it. If I run a quick "git blame" check on the relevant section of source code, it looks like the populate_mock function was just written by some guy ;-)

Based on a quick inspection of your notebook (which is very clear, thanks!), it looks like the populate_mock function might not be properly calling the Monte Carlo generator of radial positions. This is an important bug to fix before the next release, so I'll get to this as soon as I can.

EiffL commented 3 years ago

Thanks so much!

aphearin commented 2 years ago

I'm still not sure, but it looks like this might be related to the use of a lookup table for the concentration. For some values of the manually-overridden concentration, I get agreement with the NFW reference, and for others I don't. Here are two examples where the left-hand panel is the same diagnostic plot you wrote, and the right-hand panel is just the fractional difference, with sign convention defined by (mock - reference) / reference

[Figures: conc_bug_conc_2 0, conc_bug_conc_10 0, conc_bug_conc_20 0]
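For anyone wanting to reproduce this kind of diagnostic without the notebook, here is a minimal NumPy-only sketch of the fractional difference (mock - reference) / reference per radial bin. It is not the notebook's code: the helper names (nfw_cumulative_mass, nfw_cdf, fractional_difference) are made up for illustration, and the "mock" radii are a stand-in drawn by interpolating the NFW CDF rather than taken from populate_mock.

```python
import numpy as np


def nfw_cumulative_mass(x):
    """Unnormalized NFW enclosed mass, m(x) = ln(1 + x) - x / (1 + x)."""
    return np.log1p(x) - x / (1.0 + x)


def nfw_cdf(s, conc):
    """CDF of the scaled radius s = r / R_halo for concentration conc."""
    return nfw_cumulative_mass(conc * s) / nfw_cumulative_mass(conc)


def fractional_difference(mock_radii, conc, bin_edges):
    """Return (mock - reference) / reference in each radial bin."""
    counts, _ = np.histogram(mock_radii, bins=bin_edges)
    mock = counts / mock_radii.size
    # Reference: exact NFW probability of landing in each bin.
    reference = np.diff(nfw_cdf(bin_edges, conc))
    return (mock - reference) / reference


rng = np.random.default_rng(0)
conc = 10.0

# Stand-in for mock satellite radii (assumption: in practice these would
# come from the populated catalog, scaled by the host halo radius).
s_grid = np.linspace(0.0, 1.0, 2001)
mock_radii = np.interp(rng.random(200_000), nfw_cdf(s_grid, conc), s_grid)

bin_edges = np.linspace(0.0, 1.0, 11)

# Same concentration in mock and reference: only sampling noise remains.
frac_diff = fractional_difference(mock_radii, conc, bin_edges)

# Reference evaluated at a slightly different concentration: the residual
# becomes a systematic trend with radius instead of noise.
frac_diff_off = fractional_difference(mock_radii, 9.0, bin_edges)
```

With matching concentrations the residual scatters around zero, whereas a mismatch of one unit in concentration already gives an excess of several percent in the innermost bin.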

aphearin commented 2 years ago

If I change the default spacing in the concentration lookup table from dc=1 to dc=0.25, then the discrepancy gets quite a bit smaller, so I think this is the likely culprit

[Figure: conc_bug_conc_10 0]
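To illustrate why the table spacing matters, here is a rough sketch that assumes each halo gets the profile of the nearest tabulated concentration. That snapping rule, the grid starting at conc_min=1, and the helper names (snap_to_table, worst_cdf_error) are assumptions for illustration, not a description of how halotools digitizes concentrations internally.

```python
import numpy as np


def nfw_cumulative_mass(x):
    """Unnormalized NFW enclosed mass, m(x) = ln(1 + x) - x / (1 + x)."""
    return np.log1p(x) - x / (1.0 + x)


def nfw_cdf(s, conc):
    """CDF of the scaled radius s = r / R_halo for concentration conc."""
    return nfw_cumulative_mass(conc * s) / nfw_cumulative_mass(conc)


def snap_to_table(conc, dc, conc_min=1.0):
    """Hypothetical lookup rule: round to the nearest tabulated value."""
    return conc_min + dc * np.round((conc - conc_min) / dc)


def worst_cdf_error(conc, dc):
    """Largest |CDF(snapped conc) - CDF(true conc)| over scaled radius."""
    s = np.linspace(0.01, 1.0, 100)
    return np.max(np.abs(nfw_cdf(s, snap_to_table(conc, dc)) - nfw_cdf(s, conc)))


true_conc = 10.4
err_coarse = worst_cdf_error(true_conc, dc=1.0)   # table value used: 10.0
err_fine = worst_cdf_error(true_conc, dc=0.25)    # table value used: 10.5
```

Under this assumption the worst-case concentration error is dc / 2, so a finer table shrinks the error in the sampled radial profile, at the cost of a table with more entries to build and search.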

aphearin commented 2 years ago

@EiffL here's my branch where I tried the more finely spaced concentrations: https://github.com/astropy/halotools/tree/mockpop_bugfix. Could you try running your tpcf test to see whether this improves the discrepancy you have been finding?

aphearin commented 2 years ago

Actually, you don't need to use a different branch. In the current master branch, you can test this hypothesis by passing in the concentration_bins argument to the PrebuiltHodModelFactory.

aphearin commented 2 years ago

[Figures: conc_bug_conc_10 0_fine, conc_bug_conc_10 0_coarse]

EiffL commented 2 years ago

Thanks a lot @aphearin, I tried your fix and it does indeed seem to improve the NFW profile quite a bit. I'm trying to check what happens at the level of the 2pt functions, and will post plots when I get some convincing results.

EiffL commented 2 years ago

Looking good :-) with dc=0.1

with the default dc (I guess 1.0)

so yeah, looks like this is solving the problem! And I understand better now why my tests at particular concentrations worked well, while I was getting some weird results when using the concentrations from the catalog.

But looks like changing dc makes the sampling code way slower for some reason

aphearin commented 2 years ago

so yeah, looks like this is solving the problem!

Great, thanks for independently confirming that this is the issue. I'm glad this turned out to have a simple resolution.

But looks like changing dc makes the sampling code way slower for some reason

Right, yes, there's a trade-off between the size of the lookup table and the performance. I could try to optimize this a bit and see how it goes, though I think the real solution here is for the NFW profile to be reimplemented to be bin-free using the analytical solution for the CDF inverse. Originally, I had thought that it was a good idea to develop this lookup table machinery since it would be more general, covering cases that had no closed-form analytical solution for the inverse CDF, and thinking that people would want to try out all manner of different profiles. But over time, it became clear that people just wanted to use NFW, and so this feature no longer seems so important.
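A bin-free sampler of the kind described above could look roughly like the following sketch. It is not halotools code: it uses the known closed-form inverse of the NFW CDF in terms of the Lambert W function, evaluated with scipy.special.lambertw, and the function names (nfw_inverse_cdf, sample_nfw_scaled_radius) are made up for illustration. With m(x) = ln(1 + x) - x / (1 + x) and p = u m(c), solving m(x) = p gives x = -1 / W_0(-exp(-p - 1)) - 1, where W_0 is the principal branch.

```python
import numpy as np
from scipy.special import lambertw


def nfw_cumulative_mass(x):
    """Unnormalized NFW enclosed mass, m(x) = ln(1 + x) - x / (1 + x)."""
    return np.log1p(x) - x / (1.0 + x)


def nfw_inverse_cdf(u, conc):
    """Scaled radius s = r / R_halo at which the NFW CDF equals u.

    Both u and conc may be scalars or arrays that broadcast together.
    """
    p = u * nfw_cumulative_mass(conc)
    # Principal branch; the result is real for arguments in [-1/e, 0),
    # so only the real part is kept.
    w = lambertw(-np.exp(-p - 1.0), k=0).real
    return (-1.0 / w - 1.0) / conc


def sample_nfw_scaled_radius(conc, rng):
    """Draw one scaled radius per entry of conc, with no lookup table."""
    conc = np.asarray(conc, dtype=float)
    return nfw_inverse_cdf(rng.random(conc.shape), conc)


rng = np.random.default_rng(42)
# Each satellite uses its own host concentration directly, so there is
# no concentration binning and no dc to tune.
host_conc = rng.uniform(2.0, 20.0, size=100_000)
scaled_radii = sample_nfw_scaled_radius(host_conc, rng)
```

Because every draw uses the exact concentration of its host, the accuracy no longer depends on a table spacing, and the cost is one Lambert W evaluation per satellite.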