galsci / pysm

PySM 3: Sky emission simulations for Cosmic Microwave Background experiments
https://pysm3.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
36 stars 23 forks source link

Produce maps from a catalog of sources #143

Open zonca opened 1 year ago

zonca commented 1 year ago

We should implement a generic component that can take a list of coordinates and flux from an input file and generate a map already smoothed with the instrument beam. This can for example be used for radio galaxies.

Looking for an existing implementation that we can port into PySM.

seclark commented 1 year ago

This is a great idea!

giuspugl commented 1 year ago

I have a tool handy that does that , can post a notebook and we can discuss about it.

zonca commented 1 year ago

@giuspugl yes, please, that would be very helpful, no hurry

zonca commented 1 year ago

@giuspugl @xzackli could you please share the code that you have to generate maps from a catalog of point sources? @keskitalo do you have a pointer to the documentation of the new TOAST operator that handles point sources from a catalog?

I would like to gather all the code we already have, then come out with a plan on what we would like to port into PySM.

giuspugl commented 1 year ago

@zonca thanks for refreshing this issue, can surely point you to my code that does that, i can surely help with porting it into pysm3. The code assumes that all sources are unresolved and projected as point-sources with a gaussian profile.

keskitalo commented 1 year ago

The Simons Observatory operator for simulating TOD from a catalog of sources is tersely documented here: https://sotodlib.readthedocs.io/en/latest/toast.html#reference. There is no anchor so you need to search for SimCatalog. The sources can be static, variable or even transient but their position must be held fixed to the Celestial sphere.

zonca commented 1 year ago

@xzackli 's code to generate maps from a catalog is https://github.com/WebSky-CITA/XGPaint.jl/blob/main/src/radio.jl#L254-L329, this puts all the flux for each source into a HEALPix pixel.

zonca commented 9 months ago

see also : https://github.com/simonsobs/pixell/blob/fa8e884d90127282209f6446c2233c319fb73c40/pixell/pointsrcs.py#L35 and https://github.com/amaurea/tenki/blob/master/sim_websky.py

zonca commented 6 months ago

I think we can use pixell for this, see https://www.zonca.dev/posts/2024-05-02-pysm-point-source-pixell for a demo. The low level functionality is implemented in C, it can simulate 10000 sources in half a second on Jupyter@NERSC.

zonca commented 6 months ago

next we need to decide the catalog format.

TOAST uses a TOML file which also includes frequency scaling for sources:

https://github.com/simonsobs/sotodlib/blob/071f2efc2dcd6cf33898a09e235fb004ac4e97b8/sotodlib/toast/ops/sim_catalog.py#L68-L82

@giuspugl I see websky has catalogs as a function of frequency: https://portal.nersc.gov/project/sobs/users/Radio_WebSky/matched_catalogs_2/, do you also have or it is possible to produce a single catalog that assumes a model (or some fit) for frequency dependence?

currently the catalogs have the columns: ['flux', 'phi', 'polarized flux', 'theta']

I think PySM should include a single Websky catalog for Radio Galaxies, already with a pre-defined flux cut to reduce size and with parameters for PySM to estimate on the fly the flux at each frequency. Then we can use pixell to produce beam-smoothed maps.

xzackli commented 6 months ago

Yes, it's possible to generate that kind of catalog for the bright sources. I think the easiest thing to do is actually to fit the existing sources with a log polynomial, since they have some moderately complicated spectral behavior, and we didn't save the original parameters which generated those spectra.

zonca commented 6 months ago

thanks @xzackli, that would be appropriate, would you be able to create that? For testing purposes even having a catalog with just 10 sources would be enough.

giuspugl commented 6 months ago

the plan sounds fine to me. i think that 3rd or 4th order log-polynomial should encompass most of the fluxes we've sampled with websky.

zonca commented 6 months ago

I think I can give it a try and you can review it, should I use the catalogs in https://portal.nersc.gov/project/sobs/users/Radio_WebSky/matched_catalogs_2/?

zonca commented 6 months ago

@msyriac as you have experience with other simulations, do you think "coordinates + coefficients of 4th order polynomial in log(freq) for flux" is ok as a standard representation of a catalog? Or do you have a better suggestion?

msyriac commented 6 months ago

What exactly do you mean by standard here? Existing radio catalogs from e.g. the Sehgal+ and Agora simulations are likely not in that format, but could be post-processed to be.

zonca commented 6 months ago

@msyriac we need to make a decision on the format of this catalog. do you agree with "coordinates + coefficients of 4th order polynomial in log(freq) for flux" or do you have something better in mind? I see that both Sehgal and Agora just give fluxes at some reference frequencies.

zonca commented 5 months ago

@msyriac @giuspugl @xzackli please review https://app.reviewnb.com/galsci/pysm-notebooks/pull/1/ and add comments on that pull request.

For now I took the union of the 20 brightest sources at each frequency for a total of 43 sources, most have a simple shape and fit well, but some of them have a more complicated frequency dependence. See at the end of the notebook the plots in linear and semilog axes.

Do you have suggestions on how to improve the analysis? or would you like to see more sources? Btw, I fell in love with xarray.

@brandonshensley @seclark

giuspugl commented 5 months ago

@zonca thanks a lot for drafting this notebook! I think the fits you're performing are pretty much good overall.

When the SED are almost constant, i.e. there ain't much excursion range , the fit suffer but i would say it is working as well .

image

would be great to make an histogram of the fitted polynomial coefficients, i thin with xarray it is something easy to do.

zonca commented 5 months ago

thanks @giuspugl, see https://gist.github.com/zonca/0138734c3d2aade31b0f836dfb441b3c it looks like the brightest source is a big outlier, so I've created histograms with and without it.

zonca commented 5 months ago

@giuspugl what do you think about the histograms?

giuspugl commented 5 months ago

@zonca thanks for doing this test! i don't see anything wrong with the histograms. logpolynomial coefficients seem pretty reasonable , they span few orders of magnitude depending on the SED.

I've run your notebook making some edit by normalizing the flux wrt the reference frequency employed for the websky catalogs (100 GHz) and it seemed to improve the range of fitted logpoly coefficients, accounting also for the outlier (see the dashed step histograms below).

image

zonca commented 5 months ago

@giuspugl can you please share your notebook?

also, we have 280 million sources, they are too many to simulate all of them. How do we want to implement this? A background pre-smoothed with a tiny beam (to avoid sharp boundaries) linearly interpolated and point sources simulations on top?

msyriac commented 5 months ago

I recommend pre-saving a map containing dim sources (e.g. < 1 mJy or some other threshold) convolved to the smallest beam (e.g. 280 GHz). When loading a map, you can re-convolve (with almxfl) to the target beam, and you can add in >1 mJy sources with your cat2map operation up to the threshold requested. (Some pipelines might not want sources above some threshold, e.g. 10 mJy.. so this flexibility is very nice to have)

zonca commented 5 months ago

@msyriac so for the dim sources, should we keep linearly interpolating between frequencies? or can we use 1 single fiducial frequency dependence? Like some average of the sources?

msyriac commented 5 months ago

I think it would be good to keep linearly interpolating between frequencies; a single fiducial frequency dependence would be too far from realistic.

giuspugl commented 5 months ago

@giuspugl can you please share your notebook?

https://gist.github.com/giuspugl/079cd9d644c3ba37b225e596adfd2428

zonca commented 5 months ago

@giuspugl the units of flux in the catalogs are Jy, right?

zonca commented 5 months ago

ok, for testing purposes I created a catalog with a cut at .1 Jy at 100 Ghz, https://gist.github.com/zonca/8bca58b20c6312fc2f6384f5d6f6027e

it has 2.5k sources

websky_high_flux_catalog.h5.pdf

giuspugl commented 5 months ago

@giuspugl the units of flux in the catalogs are Jy, right? yes

xzackli commented 5 months ago

Your comment makes a good point, we should have added metadata to the .h5 files! Sorry, I haven't been receiving email from this thread but your plan sounds very good.

zonca commented 2 months ago

complete implementation of the catalog feature is merged in #187 next I want to run websky using the catalog and I'm sure I'll find more bugs.