LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

Rescaling protoDC2 stellar mass #70

Closed aphearin closed 6 years ago

aphearin commented 6 years ago

I have finished a complete first draft of the code that rescales the stellar masses in protoDC2. I am raising this Issue so that people can understand the techniques I am using to solve the problem of improving the realism of the DC2 catalog.

Baseline catalogs

@dkorytov has produced protoDC2 "snapshot catalogs" for me to work with instead of the lightcone catalog recently made available to the collaboration. There is one such catalog for each of the 29 snapshots used to generate the lightcone catalog. I will refer to these catalogs as "protoDC2", since the actual lightcone protoDC2 derives entirely from these snapshots.

I am basing the retuning of protoDC2 on mock catalogs built from UniverseMachine, which I have used to populate 100 snapshots of the Bolshoi-Planck simulation. People unfamiliar with the UniverseMachine model can simply think of it as a semi-analytic model inspired by abundance matching. The UniverseMachine model has been calibrated to a very diverse data compilation across a wide range of redshift; the model recovers a wide range of statistical trends of large-scale structure with reasonably high fidelity (stellar mass functions, quenched fractions, M*- and SFR-dependent clustering). For our purposes, UniverseMachine is serving as a truth table for the galaxy--halo connection that I am using to tune DC2.

Core techniques

To improve DC2, there are two basic methods I am using in concert, both of which are based on the Conditional Stellar Mass Function of UniverseMachine:

  1. Rescale the M* label appearing in protoDC2 using the Halotools implementation of conditional abundance matching (see Section 4.3 of https://arxiv.org/abs/1310.6747 for a mathematical description of CAM). This technique tunes the unit-normalized CSMF in protoDC2 to be more like UniverseMachine.
  2. Resample the galaxy populations so that the CSMF normalization of protoDC2 agrees with UniverseMachine.

In both cases, I make an effort to preserve as much of the raw Galacticus predictions as possible, only making changes that are warranted by the need for realism. For brevity, all results shown here pertain to z=0 snapshot catalogs only, but the same methods seem to work just fine for snapshot catalogs matched at higher redshift.

Rescaling stellar mass

For each snapshot catalog, I bin galaxies according to the halo mass of the distinct parent halo (as opposed to the immediate subhalo). In each host mass bin, I use UniverseMachine to define the CSMF of centrals, and separately the CSMF of satellites, via direct numerical tabulation. If there are Nsat protoDC2 satellites in the host mass bin, I then Monte Carlo draw Nsat times from the satellite CSMF. The protoDC2 satellites with the largest stellar masses in the bin receive the largest draws from the CSMF; conversely, those with the smallest stellar masses receive the smallest draws. I do the same separately for centrals.

The end result of this process is a new M* label for DC2 whose unit-normalized CSMF agrees with UniverseMachine by construction. Moreover, in each host halo mass bin, I have preserved the rank-ordering of stellar mass predicted by Galacticus.
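As a rough sketch of the per-bin remapping (function and variable names here are my own, not from the production code, which uses the Halotools conditional abundance matching machinery), the rank-preserving Monte Carlo step for one host-mass bin might look like:

```python
import numpy as np

def rescale_stellar_mass(mstar_protodc2, mstar_um_csmf, rng=None):
    """Remap protoDC2 stellar masses onto Monte Carlo draws from the
    UniverseMachine CSMF of the same host-mass bin, preserving rank order.

    mstar_protodc2 : stellar masses of the protoDC2 galaxies in the bin
                     (satellites or centrals, treated separately)
    mstar_um_csmf  : tabulated UniverseMachine stellar masses in the bin,
                     treated as an empirical CSMF to draw from
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(mstar_protodc2)
    # Monte Carlo draw n times from the empirical CSMF, sorted ascending
    draws = np.sort(rng.choice(mstar_um_csmf, size=n, replace=True))
    # Rank of each protoDC2 galaxy within the bin: the largest original
    # mass receives the largest draw, and so on down the ranking
    ranks = np.argsort(np.argsort(mstar_protodc2))
    return draws[ranks]
```

Applying this separately to centrals and to satellites in every host-mass bin preserves the Galacticus rank-ordering within each population at fixed host mass, while forcing the unit-normalized CSMF to match UniverseMachine.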

Resampling satellite abundances

In order to get the CSMF normalization correct, I iterate once more over each bin of host halo mass, again using UniverseMachine as a truth table for the correct average number of centrals and satellites. In each bin, I randomly sample (with replacement) the previously-rescaled protoDC2 centrals and satellites. This allows me to set the normalization of the CSMF while preserving the PDF. Each time I select a protoDC2 galaxy, I carry along all of its associated properties.
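The resampling step admits a similarly short sketch (again, hypothetical names; the real code operates on full structured catalogs so that every column travels with each selected galaxy):

```python
import numpy as np

def resample_to_target_abundance(galaxy_indices, n_target, rng=None):
    """Randomly sample (with replacement) the rescaled protoDC2 galaxies in a
    host-mass bin so that their abundance matches the UniverseMachine target,
    leaving the shape of the already-rescaled CSMF unchanged.

    galaxy_indices : indices into the catalog of the galaxies in this bin
    n_target       : target number of galaxies, e.g. n_hosts * <Nsat>_UM
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.choice(galaxy_indices, size=n_target, replace=True)
```

Selecting catalog row indices, rather than individual property columns, is what lets all properties ride along with each resampled galaxy.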

Results

Stellar Mass Function

The first plot shows that the SMF of the rescaled and resampled DC2 is in good agreement with UniverseMachine, which is in turn in good agreement with SDSS.

[figure: fixed_smf]

Satellite Fraction

The next plot compares the satellite fraction as a function of stellar mass. This is one of the most important summary statistics required to correctly capture the observed two-point clustering signal. The green curve gives a clear demonstration of the need for both rescaling and resampling.

[figure: fixed_fsat]

Three-dimensional clustering

Because I have snapshot catalogs to work with, I can use Halotools to compute the full 3d clustering signal \xi(r). Since projected clustering w_p and angular clustering w(\theta) are projection integrals of this underlying signal, \xi(r) is a substantially more sensitive summary statistic: mocks with accurate 3d clustering will also have accurate projected and angular clustering.
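For intuition about what is being measured, here is a toy brute-force natural estimator of \xi(r) = DD/RR - 1 for a periodic box, in plain numpy (the actual analysis presumably uses halotools.mock_observables.tpcf, which employs proper pair-counting and estimator choices; this sketch is mine, not the production code):

```python
import numpy as np

def xi_natural(pos, rbins, lbox):
    """Toy natural estimator xi(r) = DD/RR - 1 for points in a periodic box.

    Brute force and O(N^2) in memory, so only suitable for small samples.
    pos   : (N, 3) array of positions in the box
    rbins : radial bin edges (max edge should be < lbox/2)
    lbox  : side length of the periodic box
    """
    n = len(pos)
    # Minimum-image pairwise separation vectors
    d = pos[:, None, :] - pos[None, :, :]
    d -= lbox * np.round(d / lbox)
    r = np.sqrt((d ** 2).sum(axis=-1))
    # Count each unique pair once
    iu = np.triu_indices(n, k=1)
    dd, _ = np.histogram(r[iu], bins=rbins)
    # Expected pair counts for a uniform random field of the same density
    vol_shell = 4.0 / 3.0 * np.pi * (rbins[1:] ** 3 - rbins[:-1] ** 3)
    rr = 0.5 * n * (n - 1) * vol_shell / lbox ** 3
    return dd / rr - 1.0
```

For a uniform random point set the estimator fluctuates around zero; clustered mock galaxies yield \xi(r) > 0 with the small-scale 1-halo steepening discussed below.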

The four-panel plot below shows the 3d clustering for four models: the original protoDC2, UniverseMachine, protoDC2 after rescaling, and protoDC2 after rescaling and resampling. In each panel, there are four curves, one for each of the following stellar mass thresholds: logM*/Msun > 9.75, 10.25, 10.75, 11.

[figure: fixed_sm_clustering]

In the top-right panel, we see the correct answer we are trying to obtain with our rescaling/resampling procedure. There are two especially important qualitative features of the M*-dependence of clustering: large-scale clustering increases continuously as stellar mass increases, and the small-scale clustering in the 1-halo term steepens considerably as M* increases.

In the top-left panel, we see that protoDC2 fails to capture either of these trends. There is almost no M*-dependence in large-scale clustering except for the highest stellar mass bin, and the 1-halo term has qualitatively incorrect scaling with M*.

In the bottom-left panel, we see that rescaling alone improves the results on large scales, but the 1-halo term is too steep at all stellar masses and has virtually no M*-dependence. These failures are remedied by the resampling, as shown in the bottom-right panel.

The final four-panel plot below shows the fractional difference between the 3d clustering of UniverseMachine and (proto)DC2. A vertical-axis value of 0.25 corresponds to (proto)DC2 being 25% more strongly clustered than UniverseMachine. At all stellar masses and spatial scales, the clustering of rescaled/resampled DC2 agrees with UniverseMachine at the ten-ish percent level.

[figure: fixed_3d_clustering_fracdiff]

Next Steps

Using similar techniques, I have made comparable progress on rescaling the star-formation rates of protoDC2. I will try to make and post plots soon.

Once I have locked down M* and SFR, my current plan is to use existing Galacticus outputs to remap SEDs onto the rescaled/resampled galaxies. The basic problem is the following. Using the techniques described above, wherein I carry over all Galacticus properties for each resampled galaxy, I now have SEDs that are appropriate for galaxies with the original {M*, SFR} values, not the rescaled ones. This needs to be corrected. So what I will do is generate 2-d bins of {M*, SFR} from the original catalog. For each rescaled galaxy, I will randomly draw a Galacticus-predicted SED from the appropriate 2-d bin, and paint that SED onto the galaxy so that its SED is now statistically drawn from the correct (rescaled) bin. This will give me a restframe SED for every galaxy, from which flux in any band and at any redshift can be computed.
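The 2-d bin remapping described above can be sketched as follows (hypothetical names and a numpy-only implementation; the production catalog machinery will differ in detail):

```python
import numpy as np

def remap_seds(logsm_rescaled, logsfr_rescaled, logsm_orig, logsfr_orig,
               sm_bins, sfr_bins, rng=None):
    """For each rescaled galaxy, draw the index of a random original
    (Galacticus) galaxy occupying the same 2-d bin of {M*, SFR}; the donor's
    full SED can then be painted onto the rescaled galaxy.

    Returns an index into the original catalog for every rescaled galaxy,
    or -1 where a bin contains no original galaxies.
    """
    rng = np.random.default_rng() if rng is None else rng

    def cell(logsm, logsfr):
        # Flattened 2-d bin label for every galaxy
        i = np.digitize(logsm, sm_bins)
        j = np.digitize(logsfr, sfr_bins)
        return i * (len(sfr_bins) + 1) + j

    cell_orig = cell(logsm_orig, logsfr_orig)
    cell_new = cell(logsm_rescaled, logsfr_rescaled)
    out = np.full(len(logsm_rescaled), -1)
    for c in np.unique(cell_new):
        donors = np.flatnonzero(cell_orig == c)
        targets = np.flatnonzero(cell_new == c)
        if len(donors) > 0:
            out[targets] = rng.choice(donors, size=len(targets), replace=True)
    return out
```

Returning donor indices rather than SED arrays means the disk SED, bulge SED, metallicity, or any other column can be grabbed from the same donor, in the spirit of the discussion below.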

Time permitting, I have similar plans to improve the morphologies and sizes of protoDC2 galaxies, but I have prioritized improving SED-derived properties since that seems to be the priority of most working groups.

Tagging people who have been actively participating in discussion on the closely related DESCQA Issue https://github.com/LSSTDESC/descqa/issues/10, and/or on closely related DC2_Repo Issue #20, #30 - @rmandelb, @yymao, @slosar, @vvinuv, @evevkovacs, @j-dr, @katrinheitmann, @salmanhabib, @danielsf, @egawiser, @dkorytov, @duncandc

abensonca commented 6 years ago

The approach to generating viable SEDs seems reasonable. A couple questions:

1) In your resampling, I assume that right now each galaxy is treated as a single component (i.e. you do not treat disk and spheroid components separately)? In that case, I suspect we should give some thought to how to assign SEDs/magnitudes to each component, as that information will likely be required by some WGs.

2) The current SED remapping uses M*/SFR. Do you think it is/will be necessary to rescale the metallicities of the galaxies also, e.g. to get more realistic colors? If so, it may be necessary to do this SED remapping in M*/SFR/Z.

3) Do the raw (i.e. prior to rescaling) galaxies span a sufficient region of M*/SFR to allow a reasonable match to be found for all rescaled galaxies?

aphearin commented 6 years ago
  1. I have not thought through the disk/spheroid SED decomposition yet. I figured I would cross this bridge when I (soon) come to it. Ideas welcome.

  2. The reason I chose M*/SFR as my basis for rescaling is that I have a well-tuned "truth table" to compare to. For Z that is not the case. I did not figure on rescaling Z, but it would be straightforward to use the carried-over Z and then proceed to remap SEDs using M*/SFR/Z 3-d cells.

  3. Yes, prior to rescaling, the galaxies span the full range of properties, no problem there. For both M* and SFR, the dynamic range gets restricted when applying the rescaling. Good question though; we'd be sort of sunk if that weren't true.

aphearin commented 6 years ago

The simplest way to get disk/bulge SEDs, and also metallicity, would just be this: when I randomly select an SED from a 2-d bin of {M*, SFR}, whatever Galacticus galaxy is selected, I also grab its disk SED, its bulge SED, and its metallicity. Probably I should just grab everything. Sizes and B/T will likely need to be rescaled later, though hopefully not resampled.

Doing things this way we at least preserve the relative star-formation history of disk vs bulge as predicted by Galacticus. Also important: it’s easy to implement.

janewman-pitt-edu commented 6 years ago

Can you avoid pulling the SEDs from bins in (M*, SFR)? E.g., instead draw a random object in a small neighborhood around the generated value, rather than from the bin it falls into. The discretization that the bins would impose seems avoidable; furthermore, I can imagine biases resulting from the binning, since the (M*, SFR) distribution within a bin will not match the (M*, SFR) distribution resulting from remapping.
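One concrete realization of this suggestion, offered as a hedged sketch rather than anything from the production code: draw the donor from the rescaled galaxy's nearest neighbors in (log M*, log SFR) space instead of from a hard-edged bin (function names are hypothetical; only scipy/numpy are assumed):

```python
import numpy as np
from scipy.spatial import cKDTree

def remap_seds_knn(points_rescaled, points_orig, k=20, rng=None):
    """For each rescaled galaxy, pick a random donor among its k nearest
    original galaxies in (log M*, log SFR) space, avoiding the discretization
    and in-bin distribution mismatch that hard-edged 2-d binning can introduce.

    points_* : (N, 2) arrays of (log M*, log SFR)
    Returns an index into the original catalog for every rescaled galaxy.
    """
    rng = np.random.default_rng() if rng is None else rng
    tree = cKDTree(points_orig)
    # (N, k) array of donor indices, nearest first
    _, neighbors = tree.query(points_rescaled, k=k)
    pick = rng.integers(0, k, size=len(points_rescaled))
    return neighbors[np.arange(len(points_rescaled)), pick]
```

A Gaussian kernel or distance-weighted choice among the k neighbors would smooth things further, at the cost of more tuning knobs.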

rmandelb commented 6 years ago

@aphearin - this sounds really promising. The improvement in the 3D clustering statistics is especially good to see.

Sorry if this is a really naive question, but do we have to worry about cosmological parameters in some of this? That is, the Bolshoi Planck sim that you are using for UniverseMachine has Planck-like cosmological params, while I thought proto-DC2 has WMAP7-like parameters. So their halo mass functions and halo mass vs. bias relations should differ. If you are trying to match the clustering, then presumably the stellar vs. halo mass relations and/or satellite fractions must differ as well? Sorry, I haven't thought this through fully so maybe there is something about this that I am missing.

Thank you for the explanation of your procedure. I believe you have now explained the question that was bothering me when we were discussing this earlier: how can we go from proto-DC2 having very flat clustering with stellar mass to the rescaled+resampled version having more sensible clustering as a function of stellar mass, if you are preserving the rank-ordering of galaxy stellar masses? It's because of

  • the modification of the satellite fraction, and
  • the fact that you separately rescale the stellar masses for satellites and for centrals

(so correct me if I'm wrong, but I think the statement about preserved rank-ordering can only apply within those populations, but not overall?)

> Time permitting, I have similar plans to improve the morphologies and sizes of protoDC2 galaxies, but I have prioritized improving SED-derived properties since that seems to be the priority of most working groups.

With my analysis coordinator hat on: You are correct that groups like LSS and photo-z care about SED-derived properties far more than they care about morphology and size, which only matter at a very basic level (e.g., if the morphologies and sizes are so very wrong that star/galaxy separation starts to fail, or the galaxies are so puffy that the SNR of flux measurements is off by a substantial factor). Groups like WL and CL are doing shear estimation tests, so they care about galaxy sizes and morphologies at roughly a similar level as they care about SED-derived quantities. I expect validation tests to be defined shortly after break.

aphearin commented 6 years ago

@janewman-pitt-edu - Thanks for the comments/suggestions. I agree these are important details to polish, and you are right that there are many things we can do to eliminate edge effects and biases due to hard-edged binning: Gaussian kernel selection, probabilistic weights, nearest-neighbor searches, etc. I use these kinds of tricks all the time when the problem calls for them. They virtually always result in more complicated, less performant code, so it's always easiest to start with hard bins and then go back and smooth things out once a pipeline and its unit-testing suite are in place. I will try to get these smoothing refinements implemented as soon as I can, but I think I should complete the basic pipeline first (e.g., finalizing the rescaling of SFR and morphology, which, as @rmandelb points out, is also important to the WGs).

This reminds me: all my python code for doing these rescalings is in a public repository on GitHub, in case anyone wants to try out their own variation, or just see how I am doing things in detail.

aphearin commented 6 years ago

@rmandelb - thanks for weighing in. I didn't intend to be cryptic in my earlier reply to you and @yymao in #10, it's just that at the time of your previous question, I was still doing a lot of experimenting and I thought it better to plow ahead rather than explain the morass of different results based on alternative variations of the M*-rescaling. So I'm glad things are making better sense now.

> do we have to worry about cosmological parameters in some of this? That is, the Bolshoi Planck sim that you are using for UniverseMachine has Planck-like cosmological params, while I thought proto-DC2 has WMAP7-like parameters. So their halo mass functions and halo mass vs. bias relations should differ.

You are correct about this, and I don't think you're missing anything. It's just that cosmology is only a few-percent effect within contemporary uncertainties. Have a look at this plot.

[figure: halo_bias_wmap7_vs_planck15]

While we may eventually care about percent-level precision for these mocks, we're not there yet. If you look back up at my clustering error plot, even when I put in the CSMF by hand, the clustering only agrees with UniverseMachine at the ten-ish percent level. This is due to assembly bias, which is present at significant levels in UniverseMachine. When it comes to the accuracy of predicting galaxy clustering, systematic uncertainty in the true level of assembly bias dominates present-day uncertainty in cosmology. That is unfortunate, of course, but for DC2 purposes it means we don't need to worry so much about cosmology, since pinning down assembly bias is not feasible on this timescale.

Ok, back to remaining comments.

> It's because of
>
>   • the modification of the satellite fraction, and
>   • the fact that you separately rescale the stellar masses for satellites and for centrals

Yes, that is indeed why this works.

> (so correct me if I'm wrong, but I think the statement about preserved rank-ordering can only apply within those populations, but not overall?)

You are correct. In fact, the rescaling is done at fixed host halo mass as well, again separately for cens and sats. That is, at fixed host halo mass, the most massive satellites (centrals) in protoDC2 will be the most massive satellites (centrals) in rescaled-DC2.

> Groups like WL and CL are doing shear estimation tests, so they care about galaxy sizes and morphologies at roughly a similar level as they care about SED-derived quantities.

Comprende. I've done a lot of empirical modeling of morphology and sizes over the last year. So long as I don't run out of time, I think I can ensure reasonable behavior of the conditional one-point and two-point functions for these properties as well. I'm pretty sure I know how I will go about solving this part of the problem, but I admit I have not attempted this yet. I aim to have analogous demonstrations of those results sometime this week and when I post them I will tag you since you mentioned this.

slosar commented 6 years ago

I have a more philosophical question here: is the point of these rescalings to make the output more realistic by hook or by crook, or is there some physical reason to believe these rescalings are in effect equivalent to rerunning Galacticus with better-tuned parameters? Because I think DCx is essentially a code- and instrument-testing exercise -- can we get out what we put in, i.e. we're testing the effects of the full system rather than investigating actual small-scale physics. From that perspective, rescalings might be good enough even if non-physical. By the same argument, few-percent accuracy is sufficient.

rmandelb commented 6 years ago

@aphearin -

> While we may eventually care about percent-level precision for these mocks, we're not there yet.

Yes, I agree.

@slosar -

> is the point of these rescalings to make the output more realistic by hook or by crook, or is there some physical reason to believe these rescalings are in effect equivalent to rerunning Galacticus with better-tuned parameters?

Personally, I don't care whether these rescalings are in effect equivalent to rerunning Galacticus with better-tuned parameters. What I do care about is that the sims have the basic features we expect from real data, so that we are testing our code with something vaguely sensible. For this reason, we force the sims to have some of these known basic features of the galaxy distribution. But there are some uncertainties in current data, so we also shouldn't drive ourselves crazy over it (i.e., I am going to discourage analysis WGs from setting overly stringent validation criteria).

With that said, we don't know everything about galaxy properties and the galaxy-halo connection, so we do eventually have to show that we're insensitive to some of the detailed choices made in how the sims are populated with galaxies. I don't think the image sims are the right place to do this because of the expense, but there are plenty of good projects along these lines that could be done with post-processing of the catalog-level simulations. For example, it would be interesting for the LSS group to consider a few different methods of populating the DC2 sims with galaxies that reproduce the basic features of real galaxy clustering but show some variation within the allowed range (or even push the boundaries of what we already know), and then check that their linear and NL bias models can cope with all of those options.

slosar commented 6 years ago

@rmandelb Ok, we agree here. My comment was more in the direction of Andrew, because I don't think we should be wasting too much time on these once we reach reasonable agreement.

aphearin commented 6 years ago

@slosar - I share your perspective. I'm not sure what comment I made that indicated otherwise, but for the purpose of DC2 I see minimal value in fine-tuning the accuracy to levels below ten percent, and minimal value in worrying about how my rescalings impact the physical assumptions that influence small-scale clustering. The current version of protoDC2 exhibits many scaling relations that are qualitatively incorrect, and fixing the behavior of those relations is important for pipeline testing. But, yes, once the conditional PDFs and two-point clustering show qualitatively reasonable behavior, then I agree that we are done.

cwwalter commented 6 years ago

@aphearin Can I ask a very basic question? This is just me trying to understand, not question what you are doing:

It's a little bit related to Andre's question but, instead of doing re-scaling, why not either use UniverseMachine directly as the baseline, or retune Galacticus itself to produce the distributions you want?

Is it that UniverseMachine doesn't produce as much information as Galacticus, and/or that with the parameters available it is too hard to make Galacticus match the distributions you want?

abensonca commented 6 years ago

I'm working on making Galacticus match the data more closely. But it's a difficult and slow process. Difficult because there are a lot of different datasets we'd like to match, and getting some of them to match will require changes to the physics implementations in the model rather than just parameter tuning. And slow because parameter tuning involves evaluating the model at many different points in parameter space, and each evaluation is computationally expensive.

cwwalter commented 6 years ago

Thanks for the explanation Andrew.

aphearin commented 6 years ago

@cwwalter - re: UniverseMachine. This model makes a much more restricted set of predictions - M* and SFR for every subhalo at every redshift, whereas Galacticus predicts a full SED and also morphology histories. So while UniverseMachine appears to be well-tuned to those observations that it does predict, it's not possible to use it as the baseline DC2 model in its current form.

katrinheitmann commented 6 years ago

@aphearin Given the release of protoDC2_v3 today, can we close this issue?

aphearin commented 6 years ago

@katrinheitmann - yes, this issue is no longer relevant. Closing.