Store templates in Alm instead of maps?

zonca commented 3 years ago

Problem

PySM needs to store the templates at very different resolutions, we want people to be able to do maps at low nside without the need of downloading and processing very high resolution maps.

Current implementation

Currently PySM 3 implements the same method of PySM 2. Input templates are stored at a fixed N_side (512 for PySM 2 models), higher resolution models implemented in so_pysm_models have both 512 and 4096. Any other N_side is computed via ud_grade from the higher resolution map. ud_grade is quick but heavily smoothes the spectrum, in fact both SO and S4 run simulations at 512 and 4096, so effectively never suffering this.

See for example impact of ud_grade on a power law signal with a slope of -2:

"Provide all the maps"

The simplest option is to pre-compute maps using alm2map after clipping the Alms to 3 * N_side - 1:

It is a lot of maps but the N_side 8192 is dominating the storage space anyway, the sum of all other maps is less than half of it. We should provide maps at all N_sides from 8192 down to 128 (Lower resolution for SO)

In this case, we download, cache and read in memory the maps at the requested N_side, then do bandpass integration, finally do beam smoothing. So if we are simulating 1 channel at a time we do 1 map2alm and 1 alm2map. Just as a reference, at 2048 on Jupyter@NERSC map2alm on a IQU map takes 80s, alm2map takes 40s, so about 120s.

"Provide the Alms"

Alternatively we could provide the Alms instead of maps, that would be at variable resolution, for example from 8192 (quoting N_side, I assume ell_max is 3*N_side-1) down to 512. Then we first apply the beam smoothing to all templates with almxfl, then transform to maps with alm2map at the target N_side. So the Sky object now would also have a specific beam instead of having only a specific N_side, but I guess most people run PySM for 1 component and 1 channel at a time. Most components need to multiply maps in pixel space, so we need to transform the templates to maps before running the model.

Then we do bandpass integration on those templates, the output map is already smoothed. For a model like GNILC dust, using N_side 2048 on Jupyter@NERSC, it would be 40s to apply alm2map to the template, 10s and 10s for spectral index and dust temperature.

File sizes

Size for each component in single precision, multiply by 3 for IQU, multiply by 2 for double precision.

Nside	map (float32)	Alm (complex64) 2 Nside	Alm (complex64) 3 Nside - 1
32	49.2 kB	34.3 kB	74.5 kB
64	196.6 kB	134.2 kB	296.4 kB
128	786.4 kB	530.4 kB	1.2 MB
256	3.1 MB	2.1 MB	4.7 MB
512	12.6 MB	8.4 MB	18.9 MB
1024	50.3 MB	33.6 MB	75.5 MB
2048	201.3 MB	134.3 MB	302.0 MB
4096	805.3 MB	537.1 MB	1.2 GB
8192	3.2 GB	2.1 GB	4.8 GB
16384	12.9 GB	8.6 GB	19.3 GB

keskitalo commented 3 years ago

Providing the Alms seems like the best choice, since you always avoid the map2alm calculation. You could theoretically get away with providing only one, high resolution, expansion and only read up to the ell_max necessary for your target N_side. However, since the high resolution expansions are large and unwieldy, it makes sense to provide several files or at least low and high resolution versions.

zonca commented 3 years ago

Discussion on the panexp galactic science telecon

@brandonshensley @bthorne93 @giuspugl @seclark @delabrou

Smoothing before or after evaluating the model is non-commutative. Best do the evaluation of the model first, smooth after, so starting from Alms is not useful
Will go ahead like in #82 to generate templates in Alms with lmax 16384, then clip at 3*N_side-1 and transform to maps at all resolutions down to N_side 128. Lower N_side are just for testing purposes and are ud_graded to.
We need to provide guidelines on how to properly use the code, for example if people want maps at N_side 512, they should run the models at N_side 1024 and transform to target N_side. As we are smoothing at the end, we can easily provide a target_nside parameter, so people can smooth a 1024 maps and get a 512 output without transforming twice.

brandonshensley commented 3 years ago

How would you feel about doing all of this for the user under the hood but having a Boolean keyword where it can be turned off (e.g., to save on computational resources)? I.e., if the requested Nside <= 1024 (or maybe 2048), the 1024 templates are still used but a map of the requested Nside is returned unless the appropriate Boolean argument is passed forcing use of lower Nside templates.

zonca commented 3 years ago

having a target_nside is more configurable. For example someone might want to run at 4096 and save at 2048.
I think printing a warning (via logging) and explain that running under 2048 is not recommended due to spherical transforms introducting artifacts is the cleaner way. I think is better to explain to the user and have them take a choice.
however, we need to be more precise than that, we want to give the users precise guidelines based on nside and beam. Then run some example realizations to check our guidelines are reasonable.

brandonshensley commented 3 years ago

That makes sense, thanks @zonca.

seclark commented 3 years ago

I agree with this, thanks both. (Is target_nside the desired output N_side or the N_side to run at? If you mean the latter I think we should rename.) Would make sense to me to let people configure this (with guidelines) but default to running at 2048 for output maps with N_side <= 1024, and default to run at 2*N_side for output maps with 1024 <= N_side<= 4096. i.e., default to our recommended choices.

zonca commented 3 years ago

thanks @seclark, yes, target_nside is the output N_side, let's call it output_nside then.

We should choose a good name for the N_side we run at, modelling_nside, model_nside, input_nside?

I agree with implementing your recipe about default N_side, I'll think about the interface and make a proposal in #91.

delabrou commented 3 years ago

Perhaps indeed two keywords, nside_run and nside_out…

As an alternative, here is what I do in the PSM. I have a keyword for the simulations (called HEALPIX_NSIDE) and an instrument-dependent pixellization scheme for each instrument, i.e. each observing instrument can have its own nside, ordering, etc. So the sky is generated at some nside and lmax independently of the instruments observing it, and then when the observations are produced, the sky model is band-integrated and beamed for each channel, and can also be degraded (or even upgraded, at the user's risk), rotated to other coordinates, etc. in practice I seldom use that facility, but in some cases it can be handy, e.g. if you want to observe the same sky at 8192 in 7 channels with CMB-S4 LATs, and at nside=256 with 400 PIXIE channels. Also, sometimes it might prove difficult, for an instrument, to generate noise maps when the sky is "observed" if this is not done at non-native nside and pixellisation.

On 20 Oct 2021, at 19:30, Andrea Zonca @.***> wrote:

thanks @seclark https://github.com/seclark, yes, target_nside is the output N_side, let's call it output_nside then.

We should choose a good name for the N_side we run at, modelling_nside, model_nside, input_nside?

I agree with implementing your recipe about default N_side, I'll think about the interface and make a proposal in #91 https://github.com/galsci/pysm/issues/91.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/galsci/pysm/issues/90#issuecomment-948199461, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACXLM5PWDXJRM6NBJVMZOR3UH53ODANCNFSM5GKPZCMQ. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

seclark commented 3 years ago

I like output_nside and model_nside.

@delabrou I think the proposed scheme would allow someone to write a pretty easy wrapper that would "observe" the sky with instrument-specific specs as you propose, right? Maybe that use case can be an example in our documentation.

mreineck commented 8 months ago

Given that the LiteBIRD simulation pipeline would actually like to get the simulated sky emission as a_lm and not as maps, I'd like to come back to this ... As far as I understand, many of the components can be completely processed in the spherical harmonic domain (basically everything that performs linear combinations of one or more template maps according to some functions of frequency). For those, it will be

more economic (a_lm use less memory than a Healpix map which can represent them, and only one set of input a_lm has to be stored)
much faster (no SHTs are required for rotating, smoothing, ...)
more accurate (again, no SHTs are required for rotating, smoothing, ...)

to compute the integrated emission in the desired coordinate system at the desired band limit in a_lm space, and only at the end convert to a map (if the user doesn't want a_lm anyway). I'd be happy to get involved in this if it should go forward.

I don't fully understand @zonca's comment that "smoothing before or after evaluating the model is non-commutative". Is that meant in a numerical sense (I totally agree with this), or in a more fundamental one?

zonca commented 7 months ago

@mreineck unfortunately a model like Modified Black Body for dust multiplies 2 maps, see:

https://github.com/galsci/pysm/blob/b15425310e8c3ffae4a94b08ea8c8aa7c61b8a09/pysm3/models/dust.py#L142-L154

so it needs to be evaluated in pixel domain. We prefer to evaluate the model at a higher Nside compared to what we want the output map to be. For example we evaluate at 8192 and then transform to Alm to 1.5 Nside, smooth/rotate coordinates and finally antitransform to 4096. See https://pysm3.readthedocs.io/en/latest/#best-practices-for-model-execution

This is implemented as the modeling_nside parameter in mapsims https://github.com/galsci/mapsims, see an example execution of this in these map based simulations I produced for litebird: https://github.com/litebird/litebirdXS4/tree/master/20230810_foregrounds_cmb#model-execution.

mreineck commented 7 months ago

Thank you, @zonca!

I am aware that some foreground models need to operate in pixel space; my suggestion would be to switch to pixel space whenever needed and immediately back again afterwards. This also allows a free choice of grid when doing pixel-space operations (a grid with well-defined map2alm transform may be desirable in this context, even though it still loses information due to the nonlinear operations being carried out). Things like mbb_index could be stored as spherical harmonics as well; they presumably have a pretty low band limit and would be quite small objects. Yes, they have to be alm2mapped on the fly, but this (especially with a low band limit) will always be faster than the (potentially iterative) map2alm operation at higher band limits that is required anyway. I think switching to sperical harmonics where possible would greatly reduce memory consumption of the code, as well as improving accuracy (by doing at most one imperfect alm->map->alm cycle in each foreground "pipeline"). Performance might benefit as well, but that is hard to predict. BTW, this is nothing teribly urgent from my point of view ... I'm simply trying to see if this could be an option for the future.

zonca commented 7 months ago

yes, this is already done in the "on-the-fly" models like d11 and s5, see: https://github.com/galsci/pysm/blob/main/pysm3/data/presets.cfg#L158-L175 https://github.com/galsci/pysm/blob/b15425310e8c3ffae4a94b08ea8c8aa7c61b8a09/pysm3/models/dust_realization.py#L10

galsci / pysm