2pt correlation (e.g. ellipticity-direction) for IA model testing

yymao commented 6 years ago

@EiffL, @patricialarsen, and @jablazek have been working on getting 2pt correlation (e.g. ellipticity-direction) for IA model testing.

Can one of you list the items that we should test? Then we can see if we need a separate issue for each of them.

Some techniques used here are related to #10 #35 but the purpose and validation datasets would be different.

P.S. @patricialarsen I cannot assign you. Please register your GitHub account to the DESC roster.

[ ] code to reduce mock data
[ ] code that works within DESCQA framework
[x] validation data
[ ] validation criteria

yymao commented 6 years ago

So far we only have a theoretical prediction on w_g+ from @jablazek (plot taken from this run)

@EiffL @patricialarsen any updates on implementing w_g+ (or other 2pt statistics) from mock data?

patricialarsen commented 6 years ago

@yymao I have the shear-shear tests implemented (in real space) in my version of this (theory and data). This seems to work fine, and I am now expanding this to kappa-kappa tests etc. and adding in criteria for passing the test.

yymao commented 6 years ago

@patricialarsen that sounds great! Can you share some plots with us in this issue? If you can do a PR of whatever you have coded up that would be even better! (We do encourage early PR!)

patricialarsen commented 6 years ago

@yymao, yes I'll sort that all out once I have more reliable internet. I'm in Mexico at the moment and will spend the weekend on airplanes so this week isn't ideal! By the way this 2-pt testing is as much for validation of the weak lensing as for IAs.

rmandelb commented 6 years ago

A note about that theory prediction from @jablazek - that must be for the 2-halo term only. If we're comparing the mocks against that, then we'd need to be careful what scales we compare on.

rmandelb commented 6 years ago

@patricialarsen - I wanted to touch base with you about this validation test. A few questions:

You mentioned that this test is for lensing as well as for IAs. If so, I think it would be worth splitting into two tests because they are testing rather different things for which our validation criteria will differ quite a bit. Do you agree, and if so, could you think about the proper split to make and open a new issue/issues as appropriate?
I know that @chihway had done some tests of shear-shear correlations already, and I wanted to know if you have seen that work and might consider incorporating some of it? (or whether there was some obstacle to doing that)

An update on the status of this test / these tests would be useful.

Thanks!

EiffL commented 6 years ago

@rmandelb I'm working on this test, with an ETA by the end of the week. My understanding was that the shear-shear correlations for lensing were already at an advanced stage in https://github.com/LSSTDESC/descqa/pull/54 but I can't seem to find the associated issue...

rmandelb commented 6 years ago

Thanks Francois. What about #35 ? That one is kind of confusing: the issue title says it's shear-shear, and the plots are the lensing correlations, but the first comment says it's an IA test. Does #35 or this issue have an incorrect title and/or a mixture of topics? We should try to label these clearly as to whether they are lensing or IA tests.

rmandelb commented 6 years ago

(please read above comment in github, I fixed an important mistake)

EiffL commented 6 years ago

Yes you're right, I think #35 is the correct issue for the lensing test that @patricialarsen has coded up, it's not using wlpipe though at this stage, that's what got me confused, sorry. I agree, we can update the names of these issues to be more explicit

yymao commented 6 years ago

@EiffL @rmandelb I believe you both have the permission to edit issue titles. Please do when you see fit.

patricialarsen commented 6 years ago

@rmandelb yes, I can confirm that #35 is the right issue for the lensing test, and this is the IA test. It was an early confusion between the two when the issue titles were made. These are definitely two separate tests. I'll respond to your other question on the correct issue to try to avoid confusion.

yymao commented 6 years ago

@EiffL, you mentioned above that you are working on this test. Can you give us an update?

@patricialarsen, are we happy with the theoretical prediction that @jablazek made as the validation dataset?

EiffL commented 6 years ago

@yymao I have the code (from @duncandc) and data (from MB2) for the 3D ellipticity-direction correlation function and I'm implementing wg+ for comparison with @jablazek's validation test.

The question I'm facing now is more about how to incorporate that shape/orientation information to the protoDC2 catalog. I'm thinking of adding an add-on catalog for galaxy orientation, it will contain information about a 3D ellipsoid model of the galaxy, as well as the projection of this ellipsoid on the sky (thus defining an ellipticity and position angle which may replace the random ones defined in the catalog). Given that we want to be able to support several alignment models, each model can have its own add-on catalog. Let me know if that sounds good to you @yymao .
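A minimal sketch of the projection step described above, under stated assumptions: the galaxy is a triaxial ellipsoid with semi-axes `(a, b, c)`, the line of sight is the z axis, and the ellipticity uses the distortion convention |e| = (1 - q^2)/(1 + q^2). The function name `project_ellipsoid` and its arguments are hypothetical and not part of any existing catalog or DESCQA API.

```python
import numpy as np

def project_ellipsoid(axes, rotation):
    """Project a 3D ellipsoid along the z axis (assumed line of sight).

    axes     : (a, b, c) semi-axis lengths
    rotation : 3x3 matrix whose columns are the unit vectors of the
               principal axes, expressed in the observer frame
    Returns (e1, e2, position_angle), position angle in radians.
    """
    axes = np.asarray(axes, dtype=float)
    rotation = np.asarray(rotation, dtype=float)
    # The ellipsoid is the set x^T S^-1 x = 1 with S = R diag(a^2, b^2, c^2) R^T.
    shape_3d = rotation @ np.diag(axes**2) @ rotation.T
    # Its outline on the sky is the ellipse described by the upper-left
    # 2x2 block of S (orthographic projection along z).
    q = shape_3d[:2, :2]
    trace = q[0, 0] + q[1, 1]
    e1 = (q[0, 0] - q[1, 1]) / trace
    e2 = 2.0 * q[0, 1] / trace
    position_angle = 0.5 * np.arctan2(e2, e1)
    return e1, e2, position_angle
```

With one such function per alignment model, each add-on catalog would only need to store the three axis lengths and the rotation (or the three axis vectors) per galaxy, and the projected quantities could be recomputed for any line of sight.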

yymao commented 6 years ago

@EiffL thanks for the update. This sounds like a good plan. I do have one question though --- it seems that the main purpose of this validation test is to test the add-on catalog rather than the underlying catalog, is that right? It is totally ok to test the add-on catalogs; we just need to be clear about that.

EiffL commented 6 years ago

yep, that's exactly what it means. The underlying catalog really doesn't have any meaningful shape/orientation information (they are randomly drawn). This being said, the test will still be impacted by the underlying catalog in non trivial ways (clustering, mass or luminosity cuts, etc...)

yymao commented 6 years ago

OK, that's totally fine.

Now, the remaining question is if there's anything in the underlying catalog that we should test more directly/specifically for the purpose of implementing IA models? @EiffL @patricialarsen

patricialarsen commented 6 years ago

The more direct test would have to be on the halo shape orientations and the tidal fields.

For the halo shapes we checked the distribution of axis ratios for the halo shape information against other dark matter simulations and they look roughly consistent. That could be implemented as a test, although since we're only providing it for the large halos (and it's independent of the rest of the catalog production) I don't think anyone's actually using this information currently so it's not high priority.

For the tidal fields, I guess we could implement a test, although the fields are dependent only on the dark matter simulation itself for which the matter power spectrum has been well tested. We could check for a bug in the implementation of the tidal fields for each galaxy (which could exist from say incorrect rotations), although it would probably be easiest to just align the galaxies with the tidal field and check that with the IA test.

jablazek commented 6 years ago

@yymao @patricialarsen @EiffL @elisachisari : To answer the earlier question, I think the model we implemented earlier (the NLA model) is a good validation test for the 2pt correlation functions (both shape-shape and density-shape) for scales above ~8 Mpc/h. The test could do a 1-parameter fit to the model (i.e. the overall amplitude) and return a chi2/dof and the best-fit amplitude. We could decide on a "pass" criterion based on these numbers, for instance chi2/dof below some threshold and an amplitude within a range deemed plausible.
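As a rough illustration of the 1-parameter fit described above, the sketch below scales a fixed template (e.g. the NLA prediction evaluated at unit amplitude) to the measured correlation function using only bins above a minimum scale. The function names, the default thresholds in `passes`, and the amplitude range are all placeholders chosen for illustration; none of them were agreed in this thread.

```python
import numpy as np

def fit_amplitude(r, data, cov, template, r_min=8.0):
    """Linear least-squares fit of data = A * template for r > r_min.

    r, data, template : 1D arrays on the same radial bins
    cov               : covariance matrix of `data`
    r_min             : smallest scale used (same units as r)
    Returns (amplitude, amplitude_error, chi2_per_dof).
    """
    r = np.asarray(r, dtype=float)
    data = np.asarray(data, dtype=float)
    template = np.asarray(template, dtype=float)
    cov = np.asarray(cov, dtype=float)

    keep = r > r_min
    d, t = data[keep], template[keep]
    c = cov[np.ix_(keep, keep)]

    # Closed-form generalized least-squares solution for one linear parameter.
    cinv_t = np.linalg.solve(c, t)
    amplitude = (d @ cinv_t) / (t @ cinv_t)
    amplitude_error = 1.0 / np.sqrt(t @ cinv_t)

    residual = d - amplitude * t
    chi2 = residual @ np.linalg.solve(c, residual)
    dof = int(keep.sum()) - 1
    return amplitude, amplitude_error, chi2 / dof

def passes(amplitude, chi2_per_dof, amp_range=(0.0, 10.0), max_chi2_per_dof=2.0):
    """Placeholder pass criterion; the numbers are illustrative only."""
    return amp_range[0] <= amplitude <= amp_range[1] and chi2_per_dof <= max_chi2_per_dof
```

Because the model is linear in the amplitude, no iterative minimizer is needed, and the same function would apply to either wg+ or w++ given the matching template and covariance.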

For smaller scales, we probably need to think more about tests at the 2pt level. For now, it might be enough to ensure that they pass the 1pt tests and the large-scale 2pt test.

As for testing the underlying tidal field and halo shapes, isn't that more a question of validating the simulations and post-processing (e.g. no bug in the halo finder or tidal field code)?

yymao commented 6 years ago

Thank you both, @patricialarsen and @jablazek. So for 2pt tests, I'm hearing we need shape-shape and density-shape correlations. @EiffL are you working on both of these already, or do you need help? And you are also implementing ellipticity-direction correlation, right? Do we need different issues to track the progress of these 2pt tests?

We have 1pt tests on position angle and ellipticity. Are those sufficient @jablazek?

Validating the simulations and post-processing is also part of this effort of validating extragalactic catalogs, and that's why I wonder if we need more direct tests. Sounds like we might need to test the tidal field and halo shapes. What would be the specifications of these tests, @patricialarsen?

EiffL commented 6 years ago

@yymao Yes, I have both ED and wg+ codes, but nothing to test them with at the moment, I'm working on it. I first have to add orientation information to protoDC2; once that's available, I can run tests (remember that E-D requires the 3D orientation of the major axis of the 3D ellipsoid of the galaxy, which is not a quantity currently defined in protoDC2).
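For reference, a brute-force sketch of the ellipticity-direction statistic, using the common definition ω(r) = <|ê · r̂|²> - 1/3, where ê is the unit major-axis vector and r̂ is the unit separation to a tracer. This is not the code from @duncandc mentioned above; the function name is hypothetical, periodic boundaries are ignored, and the O(N²) pair loop is only suitable for small samples.

```python
import numpy as np

def ed_correlation(orient_pos, orientations, tracer_pos, r_bins):
    """ED correlation omega(r) = <|e . r_hat|^2> - 1/3 in radial bins.

    orient_pos   : (N, 3) positions of objects with a 3D orientation
    orientations : (N, 3) major-axis vectors (need not be normalized)
    tracer_pos   : (M, 3) positions of the density tracers
    r_bins       : bin edges in 3D separation
    """
    orient_pos = np.asarray(orient_pos, dtype=float)
    orientations = np.asarray(orientations, dtype=float)
    tracer_pos = np.asarray(tracer_pos, dtype=float)
    r_bins = np.asarray(r_bins, dtype=float)

    unit = orientations / np.linalg.norm(orientations, axis=1, keepdims=True)
    sep = tracer_pos[None, :, :] - orient_pos[:, None, :]   # (N, M, 3)
    dist = np.linalg.norm(sep, axis=2)                       # (N, M)
    with np.errstate(invalid="ignore", divide="ignore"):
        cos2 = (np.einsum("ijk,ik->ij", sep, unit) / dist) ** 2

    which = np.digitize(dist, r_bins) - 1
    n_bins = len(r_bins) - 1
    omega = np.full(n_bins, np.nan)
    for b in range(n_bins):
        sel = (which == b) & (dist > 0)   # skip self-pairs at zero separation
        if sel.any():
            omega[b] = cos2[sel].mean() - 1.0 / 3.0
    return omega
```

Randomly oriented axes give ω consistent with zero, perfect alignment with the separation vector gives 2/3, and perpendicular alignment gives -1/3, so the randomly drawn orientations currently in the catalog would be expected to return a null signal.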

I'm working with @duncandc on this, we should be able to produce and test 2 different alignment mocks.

jablazek commented 6 years ago

@EiffL : do you also have w++ code? That will be lower S/N than wg+, but may be interesting as well. On large scales, both wg+ and w++ should have the same scale dependence (different amplitude).

patricialarsen commented 6 years ago

@EiffL - when you say you have nothing to test this with, do you mean the tidal fields? I believe these were added to the catalog for the February meeting

patricialarsen commented 6 years ago

@yymao these 1pt tests are on the randomly assigned ellipticities, not the correlated ellipticities so this might not be the right venue for that question. I think those are sufficient, but the WL group would know more.

EiffL commented 6 years ago

@yymao @duncandc @jablazek @patricialarsen To bring this discussion back on github (sorry, my bad that I stopped responding on this issue), the only 3 things that will impact the downstream IA mock generation (beyond what I think is already otherwise tested) are the computation of the tidal field, the distribution of satellites in halos (potentially anisotropic), and the halo shape catalog itself I guess (for duncan's method).

The correlation functions that we have been talking about here would test the output IA mock, but only indirectly the cosmoDC2 catalog itself.

@yymao So, should I try to validate these inputs first ? Is the halo shape catalog already validated at some level ?

yymao commented 6 years ago

@EiffL yes we should now validate the three things you mentioned.

EiffL commented 6 years ago

Ok, so, @aphearin I tried to look through your cosmoDC2 repo but couldn't really pinpoint how the satellite galaxies are spatially distributed within a halo in DC2, are they following the host halo shape to some extent ?

EiffL commented 6 years ago

So @yymao I propose the following tests:

Compare the tidal eigen vecs X halo positions in DC2 vs MB2
Compare the halos orientations X halo positions in DC2 vs MB2
Additionally, also 1 pt distributions of halo shapes and tidal eigenvals

It may not be the best things to test but it should catch obvious failure modes. Does that sound good?

aphearin commented 6 years ago

@EiffL - no, in the current version of production, satellites are anisotropically distributed, but there is no correlation with LSS.

EiffL commented 6 years ago

Oh , ok, so they are not following the halo shape (assuming the halo shape has correlation with the LSS)

yymao commented 6 years ago

Sounds good @EiffL. Do you think you can implement these soon or do you need help?

BTW, do we have halo shapes in the catalog?

aphearin commented 6 years ago

@EiffL - yes, that's correct. This is not a necessary feature of DC2, it's just that nobody asked for this feature and so we did not prioritize it. Alternate versions of the extragalactic catalog with such correlations could be made available as needed.

EiffL commented 6 years ago

@aphearin Ok great, thanks for clarifying. Sorry, we have only started thinking about this recently as @duncandc made further progress on his IA model. He'll probably be able to report on the impact of the satellite distribution in more details. If it's not too much trouble it would certainly be a useful feature to have in DC2 (satellite distribution correlated with host halo and LSS), and I know that Duncan has halotools components for that so we should be able to help in the implementation/validation . Also I know we won't have halo shapes below a given mass, but that's ok.

@yymao I'm doing that now, I'll let you know if I run into a problem or need assistance, thanks! And last time I checked, yes I believe there was halo shape information (but that was maybe prior to v3)

evevkovacs commented 6 years ago

I just checked with @patricialarsen and the plan is to have halo shapes and tidal fields in a later version of cosmoDC2 (probably the one coming after the initial version used for image sims)

EiffL commented 6 years ago

Right, ok thanks, that is fine, we don't expect the image simulations to have IA

yymao commented 6 years ago

@EiffL probably should have satellite galaxy angular distribution test too?

EiffL commented 6 years ago

Yep, although at the moment this would just show a flat distribution I guess. So I don't think it would validate anything yet.

yymao commented 6 years ago

yes, I understand. But it's good to prepare for future catalogs. Obviously this test can have lower priority for now.

aphearin commented 6 years ago

@EiffL - as far as the anisotropy correlations go, I think this is probably best handled in a post-processing phase of what will be the baseline DC2 catalog. The techniques @duncandc has developed are tailor made to create an alternate version with all manner of variety of such correlations excluded/included with varying strength. So this feature will be readily available. Also, since the tidal tensor in the ~Mpc environment of every halo can be computed and diagonalized in post-processing, including this satellite correlation feature need not be strictly tied to halo shape information.
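To make the post-processing step concrete, here is a sketch of computing and diagonalizing a tidal tensor from a gridded overdensity field with FFTs, using T_ij(k) = k_i k_j / k² δ(k). It assumes a periodic cubic box and a density field already painted on a grid; smoothing to the ~Mpc scale and interpolation to halo positions are left out, and the function name is hypothetical. Some conventions subtract δ_ij δ/3 to make the tensor traceless, which is not done here.

```python
import numpy as np

def tidal_tensor(delta, box_size):
    """Tidal tensor T_ij = d_i d_j (inverse Laplacian of delta) on a periodic grid.

    delta    : (n, n, n) overdensity field
    box_size : side length of the periodic box
    Returns an array of shape (n, n, n, 3, 3).
    """
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k = (kx, ky, kz)
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0   # avoid 0/0; the k=0 mode contributes nothing anyway

    delta_k = np.fft.fftn(delta)
    tidal = np.empty(delta.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            t_ij = np.fft.ifftn(k[i] * k[j] / k2 * delta_k).real
            tidal[..., i, j] = t_ij
            tidal[..., j, i] = t_ij
    return tidal

# Diagonalize every cell at once; eigh returns eigenvalues in ascending order
# and the matching eigenvectors as columns:
# eigenvalues, eigenvectors = np.linalg.eigh(tidal_tensor(delta, box_size))
```

A single plane wave along x is a convenient sanity check for rotation bugs: only T_xx should be non-zero, and the eigenvector of the largest eigenvalue should lie along x.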

EiffL commented 6 years ago

humm, I'm a bit confused by some of the plots I'm getting. I'm looking only at galaxies with step=247, and just looking at the histogram of halo eigen vector orientation, it looks like they are pointing in a preferred direction: I was naively expecting something flat (this is for ~3000 central galaxies with halo masses > 10^12). Am I missing something and it's not so surprising (or maybe not statistically significant....)? @patricialarsen Also, from the orientation-direction correlation I've computed, it looks like hostHaloEigenVector3 may be the major axis, is that correct ? (looks like it's the vector that aligns the most with the LSS). Also, sorry, is the tidal field included in v3, I don't think I see it ?
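One way to quantify whether such a histogram is statistically consistent with flat: for isotropic axes, |cos θ| relative to any fixed direction is uniform on [0, 1] (the absolute value accounts for eigenvectors having no preferred sign). The sketch below computes a one-sample Kolmogorov-Smirnov statistic against that uniform distribution with plain NumPy. The function name is hypothetical, and the test treats the halos as independent, which clustering may violate.

```python
import numpy as np

def isotropy_ks(vectors, axis=(0.0, 0.0, 1.0)):
    """KS statistic of |cos(theta)| against a uniform distribution on [0, 1].

    vectors : (N, 3) axis vectors (sign and normalization are ignored)
    axis    : reference direction used to define theta
    Returns (D, sqrt(N) * D).
    """
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)

    mu = np.sort(np.abs(v @ a))
    n = mu.size
    # Largest gap between the empirical CDF and the uniform CDF F(x) = x.
    upper = np.arange(1, n + 1) / n - mu
    lower = mu - np.arange(0, n) / n
    d = max(upper.max(), lower.max())
    return d, d * np.sqrt(n)
```

For large N, values of sqrt(N) * D above roughly 1.36 would reject isotropy at the 5% level. Repeating the check for a few different reference axes helps distinguish a genuine preferred direction from a fluctuation in a sample of a few thousand halos.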

duncandc commented 6 years ago

Just to follow up on @aphearin's comment, I do think we could re-assign satellite positions in an add-on catalog if this is necessary for my or @EiffL's IA methods.

patricialarsen commented 6 years ago

@EiffL - no, the tidal field is not included in v3. Please continue to use the v2 catalogue for this.

EiffL commented 6 years ago

@patricialarsen Ok thanks :-) no problem

EiffL commented 6 years ago

@duncandc I've plotted the halo orientation-direction correlation function in a test in branch u/EiffL/ia_validation could you take a look at it and tell me what you think: https://portal.nersc.gov/project/lsst/descqa/v2/?run=2018-04-06&test=alignment_test&catalog=protoDC2 That's all at z=1 and host halo shapes in DC2 vs central dark matter shape in MBII. I've also quickly plotted the shape distributions, they are different but that may be due to the fact that we are not 100% sure what shapes we have in MB2:

patricialarsen commented 6 years ago

@EiffL what do you mean by central dark matter shape? If this is a sort of reduced inertia tensor the axis ratios look sensible. These shapes are strongly dependent on the method used to measure them.

Look at for example figure 10 of https://arxiv.org/pdf/1702.03913.pdf - the dashed lines vs the solid lines are the reduced vs. simple inertia tensor. The reduced inertia tensor (which up-weights the dark matter closer to the central point) has significantly higher axis ratios, I have code to compute both although I think I only put the simple inertia tensor results into the catalog - if reduced inertia tensor is more useful I'm happy to go with that instead or as well.
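To illustrate the difference between the two definitions, here is a sketch that measures axis ratios from a particle distribution with either the simple tensor, I_ij = Σ x_i x_j / N, or the reduced tensor, I_ij = Σ x_i x_j / r² / N. This is a single-pass, non-iterative version written for illustration and is not the code used to produce the catalog values; the function name and the choice to centre on the mean particle position are assumptions.

```python
import numpy as np

def axis_ratios(positions, reduced=False):
    """Axis ratios (b/a, c/a) from the simple or reduced inertia tensor.

    positions : (N, 3) particle positions
    reduced   : if True, weight each particle by 1/r^2 from the centre
    """
    x = np.asarray(positions, dtype=float)
    x = x - x.mean(axis=0)            # assumption: centre on the mean position
    if reduced:
        r2 = np.sum(x**2, axis=1)
        keep = r2 > 0                 # drop a particle sitting exactly at the centre
        x, weights = x[keep], 1.0 / r2[keep]
    else:
        weights = np.ones(len(x))

    tensor = np.einsum("n,ni,nj->ij", weights, x, x) / len(x)
    eigenvalues = np.linalg.eigvalsh(tensor)   # ascending order
    c, b, a = np.sqrt(eigenvalues)
    return b / a, c / a
```

On the same particles, this one-pass reduced tensor returns rounder shapes (higher axis ratios) than the simple tensor, so shape distributions from two simulations are only comparable when both use the same definition.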

patricialarsen commented 6 years ago

Also, quick suggestion: the direction-position calculation might be interesting to plot. It would validate that the orientations are correctly linked to the density and I think the axis directions are typically more stable to measurement differences than the ellipticities are.

EiffL commented 6 years ago

Thanks for the feedback @patricialarsen . By central dark matter shape I mean shape of the dark matter subfind subhalo around central galaxies in MB2. There are many caveats to that, in particular which inertia tensor we are using (we are not sure :-/ we are in the process of recomputing them, probably with simple inertia tensor, but I'll let you know).

Yes, agreed, the directions are more stable, that's the point of my first plot up there that computes the direction-position correlation function.

LSSTDESC / descqa

2pt correlation (e.g. ellipticity-direction) for IA model testing #42