CMB-S4 / s4mapbasedsims

CMB-S4 map based simulations

Sources too bright in DC-0 run Radio Galaxies component #23

Open zonca opened 1 year ago

zonca commented 1 year ago

My processing mask is already built using the raw point source map, so residuals like this are unexpected. I looked at the input maps and that source is very hot: 5.9 K_CMB. The brightest radio source in the Planck 30 GHz map is 0.018 K_CMB. Some of this is simply due to the difference in beam size, but not 300X. Based on the FWHM, I would expect the source amplitude to scale by 20X.
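
The 20X estimate follows from flux conservation: for an unresolved source, the peak amplitude in a beam-smoothed map scales inversely with the beam solid angle, so roughly as 1/FWHM² for a Gaussian beam. Below is a minimal sketch of that arithmetic. It assumes circular Gaussian beams; the Planck 30 GHz FWHM of about 32.3 arcmin is an assumed value that is not stated in this thread, and the 7.4 arcmin FWHM is the one quoted later in this thread, assumed here to apply to the same channel.

```python
import math


def gaussian_beam_solid_angle(fwhm_arcmin: float) -> float:
    """Solid angle in steradians of a circular Gaussian beam."""
    fwhm_rad = math.radians(fwhm_arcmin / 60.0)
    # Omega = 2 * pi * sigma^2, with sigma = FWHM / sqrt(8 ln 2)
    return math.pi * fwhm_rad**2 / (4.0 * math.log(2.0))


def peak_amplitude_ratio(fwhm_wide_arcmin: float, fwhm_narrow_arcmin: float) -> float:
    """Factor by which a point-source peak grows when going from the
    wide beam to the narrow beam, at fixed source flux."""
    return (gaussian_beam_solid_angle(fwhm_wide_arcmin)
            / gaussian_beam_solid_angle(fwhm_narrow_arcmin))


PLANCK_30GHZ_FWHM_ARCMIN = 32.3  # assumption, not taken from this thread
S4_FWHM_ARCMIN = 7.4             # quoted later in this thread

expected_ratio = peak_amplitude_ratio(PLANCK_30GHZ_FWHM_ARCMIN, S4_FWHM_ARCMIN)
observed_ratio = 5.9 / 0.018     # the two peak amplitudes quoted above

print(f"expected from beams: {expected_ratio:.1f}x")
print(f"observed in maps:    {observed_ratio:.0f}x")
```

The gap between the two numbers is what the beam difference alone cannot account for.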

Sources unexpectedly bright:

image (6)
zonca commented 1 year ago

My tests comparing PySM rg1 to Planck were fine, however: https://www.zonca.dev/posts/2022-12-01-radio-galaxies-websky-planck

Investigating

zonca commented 1 year ago

@keskitalo it seems to me that the brightest point source in Planck is 0.2 K_CMB, while in DC-0 it is 5 K_CMB. That is a factor of 25, which is not unreasonable given the different beam width.

zonca commented 1 year ago

Notebook I used: https://gist.github.com/zonca/3e337715d9fc6d479ea12a4151860822

zonca commented 1 year ago

Sorry, I was comparing a Planck Galactic source with a Websky extragalactic source.

If I check the brightest source in Websky at 30 GHz, it has a flux of 455 Jy; in the Planck map, Cygnus A has a flux of 64 Jy.

That is about a factor of 7, which seems reasonable to me, but I would like a confirmation from @giuspugl or @xzackli!

In particular, if I look at the catalog at 24.5 GHz (the closest to LFL1), I find that the brightest source is more than 600 Jy. If I then compare with the DC-0 input map, I get approximately the same flux, so I think the processing of the maps is correct. Please double-check my notebook here:

https://gist.github.com/zonca/3d65698d73c175754895ee517892cad1
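
For readers who want to cross-check a catalog flux against a map amplitude without opening the notebook, here is a minimal sketch of the conversion between flux density in Jy and peak K_CMB in a beam-smoothed map. This is not the notebook's code. It assumes a monochromatic channel (no bandpass integration or color correction), a circular Gaussian beam, and T_CMB = 2.7255 K; all function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K
C = 299792458.0      # speed of light, m / s
T_CMB = 2.7255       # CMB temperature in K (assumed value)


def db_dt_jy_per_sr_per_k(freq_hz: float, t_cmb: float = T_CMB) -> float:
    """Derivative of the Planck function with respect to temperature at
    T_CMB, i.e. the K_CMB to Jy/sr conversion factor at one frequency."""
    x = H * freq_hz / (K_B * t_cmb)
    rayleigh_jeans = 2.0 * K_B * freq_hz**2 / C**2        # W / m^2 / Hz / sr / K
    correction = x**2 * math.exp(x) / math.expm1(x) ** 2  # dimensionless
    return rayleigh_jeans * correction * 1e26              # 1 Jy = 1e-26 W / m^2 / Hz


def gaussian_beam_solid_angle(fwhm_arcmin: float) -> float:
    """Solid angle in steradians of a circular Gaussian beam."""
    fwhm_rad = math.radians(fwhm_arcmin / 60.0)
    return math.pi * fwhm_rad**2 / (4.0 * math.log(2.0))


def peak_k_cmb(flux_jy: float, freq_hz: float, fwhm_arcmin: float) -> float:
    """Peak amplitude of an unresolved source after Gaussian smoothing."""
    omega = gaussian_beam_solid_angle(fwhm_arcmin)
    return flux_jy / (db_dt_jy_per_sr_per_k(freq_hz) * omega)


def flux_jy_from_peak(peak_k: float, freq_hz: float, fwhm_arcmin: float) -> float:
    """Inverse of peak_k_cmb: recover the flux from a measured peak."""
    omega = gaussian_beam_solid_angle(fwhm_arcmin)
    return peak_k * db_dt_jy_per_sr_per_k(freq_hz) * omega
```

For an unsmoothed map with all the flux in a single pixel, the same relation holds with the pixel solid angle in place of the beam solid angle.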

zonca commented 1 year ago

@keskitalo forgot to tag you above

giuspugl commented 1 year ago

I think what we're observing is kind of expected from the Websky models. There are no constraints in the Websky catalogs that require the brightest sources to be consistent with what Planck has observed. This is mainly because we constructed the Websky radio galaxies as totally unconstrained realizations that statistically match the properties of radio galaxies (e.g. the number counts and the spectral index of the sources). As expected, there are few sources in the high flux bin, and that might result in sources that are too bright, brighter than what we have anthropically observed (credit to @xzackli for this explanation). We could work out a hybrid rg model employing constrained realizations, e.g. take the brightest sources from those observed by Planck and combine them with the Websky sources at lower fluxes.
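
The hybrid idea can be sketched as a simple flux-cut merge of two catalogs. This is only an illustration of the selection logic, not an existing PySM or Websky function; the function name, the flux cut, and the toy values are all hypothetical.

```python
import numpy as np


def hybrid_fluxes(simulated_jy, observed_jy, flux_cut_jy):
    """Keep simulated sources below the cut and observed sources at or
    above it, and return the merged flux list (hypothetical helper)."""
    simulated = np.asarray(simulated_jy, dtype=float)
    observed = np.asarray(observed_jy, dtype=float)
    faint = simulated[simulated < flux_cut_jy]    # statistical realization
    bright = observed[observed >= flux_cut_jy]    # constrained by real data
    return np.concatenate([faint, bright])


# Toy values for illustration only, not taken from any real catalog.
toy_simulated = [0.01, 0.2, 3.0, 455.0]
toy_observed = [1.5, 20.0, 64.0]
merged = hybrid_fluxes(toy_simulated, toy_observed, flux_cut_jy=1.0)
```

A real implementation would also need to carry positions and spectral indices, and to check that the merged number counts stay continuous across the cut.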

keskitalo commented 1 year ago

Thank you all. I would guess that a source that is 7 times brighter than any seen by Planck is a 10-sigma outlier. As long as you think there isn't anything obviously wrong with the radio source component, I'll just push ahead with the simulations.

jdborrill commented 1 year ago

I'm a bit confused ... 7 times brighter than anything seen by Planck is obviously wrong, no?

Julian

keskitalo commented 1 year ago

I mostly meant to check if the extreme sources were expected and it sounds like Giuseppe and Zack are not surprised to see them. I should not try to comment on the physics of blazing radio sources like this, beyond pointing out that they must be extreme statistical outliers since we haven't seen anything like them in the Planck maps.

zonca commented 1 year ago

Once @keskitalo masks the bright sources in the mapmaker and the output maps have no artifacts, the downstream pipelines need to mask them and will not be affected much, right? If this is correct, I would continue using the current model.

The Websky radio galaxy catalog and maps are already published, so following @giuspugl's idea would mean implementing in PySM a model that differs from the Websky catalog. We currently have rg1, which is Websky; we would need to implement rg2, which is Websky with a constrained realization or a high flux cut.

giuspugl commented 1 year ago

> we need to implement rg2 which is websky with a constrained realization or a high flux cut.

Sounds good! The only con that I see is the lack of correlation between the bright blazars from real data and the sources obtained from the Websky matter distribution (employed in CIB, SZ, etc.). In any case, we have to compromise on something...

jdborrill commented 1 year ago

If I understand correctly, the masking is only in applying the filters, and the resulting maps will still have the sources in them.

Julian

zonca commented 1 year ago

yes, @jdborrill, but I think the downstream pipelines also apply masks

zonca commented 1 year ago

@keskitalo what do you think? Would you prefer that we work on an rg2 model that is better behaved at high flux?

@giuspugl do you have an estimate of how long it will take to implement rg2?

keskitalo commented 1 year ago

I suspect the current catalog is unphysical and if we proceed with it, we will be stuck with it for a good while. I don't really mind either way, just as long as we come up with a decision soon.

jdborrill commented 1 year ago

Has anyone heard anything from Zack?

I'd lean towards staying with the original catalog and increasing the mask over an ad hoc solution. We can revisit this for DC1 in 6 months anyway.

Julian

zonca commented 1 year ago

Yes, @xzackli replied via email:

The behavior of the brightest sources is a plausible place where the websky radio galaxies would be incorrect; most of my validation against data was with a flux cut, so the brightest sources were masked.

These maps and catalogs were made from a resampling process, so I wonder if the brightest bin is just poorly measured due to being such an extreme value. Giuseppe, do you know of a test I could perform to check this?

This is almost certainly not the explanation, but as an aside: I could see an anthropic reason for a biased selection function on our sky so that there isn't a nearby radio galaxy with its jet pointed directly at us.

jdborrill commented 1 year ago

Since

  1. we'll be referencing the standard WebSky paper/documentation, and
  2. we'll be redoing this in ~9 months anyway

I'd suggest we live with (and document) the overbright source(s)

Julian

keskitalo commented 1 year ago

Here is a quick test of masking in the vicinity of the offending source. Instead of just finding the brightest 1% of the pixels in the radio source map, I set an S/N threshold of 30 (image: https://user-images.githubusercontent.com/596250/235257008-b927c73c-150f-4d7f-9924-359cc3d1f38e.png). The old threshold corresponded to something like S/N > 100. Tons of new point sources get masked but, surprisingly, only a small number of new pixels around the bright source are masked.
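
The two masking criteria mentioned here can be sketched as follows. This is not the code in the actual setup script; the function names are hypothetical, and the noise level used to define S/N is an assumed input, since the thread does not say how it was obtained.

```python
import numpy as np


def mask_brightest_fraction(source_map, fraction=0.01):
    """Flag the brightest `fraction` of pixels (True = masked)."""
    amplitude = np.abs(np.asarray(source_map, dtype=float))
    threshold = np.quantile(amplitude, 1.0 - fraction)
    return amplitude > threshold


def mask_above_snr(source_map, noise_rms, snr_threshold=30.0):
    """Flag pixels whose amplitude exceeds snr_threshold * noise_rms
    (True = masked). `noise_rms` is an assumed per-pixel noise level."""
    amplitude = np.abs(np.asarray(source_map, dtype=float))
    return amplitude > snr_threshold * noise_rms


# Toy map in units of the noise rms: ten moderate sources, one extreme one.
toy_map = np.zeros(1000)
toy_map[:10] = 50.0
toy_map[10] = 500.0
n_masked_old = int(mask_above_snr(toy_map, noise_rms=1.0, snr_threshold=100.0).sum())
n_masked_new = int(mask_above_snr(toy_map, noise_rms=1.0, snr_threshold=30.0).sum())
```

A percentile cut always masks a fixed fraction of the sky, whereas an S/N cut masks however many pixels exceed the amplitude limit, which is why lowering the threshold picks up many additional faint sources.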

jdborrill commented 1 year ago

That seems surprising.

J

keskitalo commented 1 year ago

The source mask in the center has a diameter of 30', whereas the FWHM at this frequency is 7.4'. We are already masking the source at 4.8σ.
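
The 4.8σ figure can be reproduced with a few lines of arithmetic. The sketch below assumes a circular Gaussian beam and also estimates how much of the peak survives at the mask edge. The 5.9 K_CMB peak is the value quoted at the top of this thread, assumed here to apply to the same channel.

```python
import math


def mask_radius_in_sigma(mask_diameter_arcmin: float, fwhm_arcmin: float) -> float:
    """Mask radius expressed in units of the Gaussian beam sigma."""
    sigma_arcmin = fwhm_arcmin / math.sqrt(8.0 * math.log(2.0))
    return 0.5 * mask_diameter_arcmin / sigma_arcmin


def gaussian_beam_response(n_sigma: float) -> float:
    """Beam amplitude relative to the peak at a distance of n_sigma."""
    return math.exp(-0.5 * n_sigma**2)


n_sigma = mask_radius_in_sigma(mask_diameter_arcmin=30.0, fwhm_arcmin=7.4)
edge_fraction = gaussian_beam_response(n_sigma)
edge_amplitude_k = 5.9 * edge_fraction  # assumes the 5.9 K_CMB peak quoted earlier

print(f"mask radius: {n_sigma:.1f} sigma")
print(f"beam response at mask edge: {edge_fraction:.1e}")
print(f"residual amplitude at mask edge: {edge_amplitude_k * 1e6:.0f} uK_CMB")
```

Under these assumptions the beam response at the mask edge is of order 1e-5, which for a source at the kelvin level still leaves tens of µK just outside the mask.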

jdborrill commented 1 year ago

But that residue was still generating artefacts in the resulting maps?

J

keskitalo commented 1 year ago

That is why I characterize it as an extreme source

jdborrill commented 1 year ago

But the residue is apparently much less significant than the previously unregistered sources, so why weren't they introducing ringing too?

J

keskitalo commented 1 year ago

Not necessarily true. I'm guessing that the sources were added inside a fixed radius and lowering the threshold even a little captured the last signs of the intense source.

jdborrill commented 1 year ago

Andrea, can you confirm how the maps were constructed from the catalogs?

Thanks,

Julian

zonca commented 1 year ago

It is produced by Websky using a pixel-domain tool that loops through all the sources in the catalog and adds the flux of each source to the right pixel. Zack told me that in the future he would like to pre-smooth the sources with some beam, but for now each source is a single bright pixel. However, after smoothing with map2alm_lsq, the maps show no artifacts.

I think this is the code: https://github.com/WebSky-CITA/XGPaint.jl/blob/main/src/radio.jl#L254-L256

I skimmed the paper but cannot find more details: https://arxiv.org/pdf/2110.15357.pdf
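
The pixel-domain accumulation described above can be sketched in a few lines. This is not the XGPaint.jl code linked above, only a NumPy illustration of the same idea with a hypothetical function name; it assumes the pixel index of each source has already been computed.

```python
import numpy as np


def paint_point_sources(npix, pixel_index, flux):
    """Sum the flux of every source into the pixel that contains it.
    Sources that fall in the same pixel are added together."""
    pixel_index = np.asarray(pixel_index, dtype=np.int64)
    flux = np.asarray(flux, dtype=float)
    return np.bincount(pixel_index, weights=flux, minlength=npix)


# Toy example: three sources on a 12-pixel map, two sharing pixel 3.
toy_map = paint_point_sources(npix=12, pixel_index=[3, 3, 7], flux=[1.0, 2.0, 5.0])
```

With healpy, the pixel indices could come from `healpy.ang2pix(nside, theta, phi)`, and dividing the result by `healpy.nside2pixarea(nside)` would turn flux per pixel into surface brightness. Because each source lands in one pixel, the unsmoothed map is not band-limited, which is consistent with needing a careful harmonic transform before smoothing.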

zonca commented 1 year ago

If I look at the pixels within a couple of arcmin of the brightest source, it looks like all the flux is in one pixel, with other tiny sources around it: image

giuspugl commented 1 year ago

Thanks a lot @keskitalo and @zonca for looking into this. I think we need to prioritize this. I will work on rg2 in the coming days, and I am planning to use the LFT -5GHz joint catalog of radio sources, which employs a constrained realization of sources down to a limit of about 50 mJy; for fluxes S < 50 mJy we could employ the Websky sources. How does that sound?

However, I foresee that there might be an inconsistency in how the Websky maps have been created. I have a projection pipeline that works with healpy, but it is much slower than the one @xzackli produced in Julia. @xzackli, could you please check this?

jdborrill commented 1 year ago

Hi Giuseppe,

I think we will need to proceed with rg1 for DC0, to ensure consistency with websky documentation and other products, but address this in time for DC1 within the year.

Best,

Julian

zonca commented 1 year ago

@keskitalo can you please share the script you use to create the mask? I would like to have the mask be part of the map based simulation release.

keskitalo commented 1 year ago

Here it is: https://github.com/CMB-S4/s4sim/blob/master/dc0/foreground_sim/setup_sim.py

zonca commented 1 year ago

Thanks @keskitalo, where are you sourcing the measurement requirements from? https://github.com/CMB-S4/s4sim/blob/master/dc0/foreground_sim/setup_sim.py#L19

I would like to do the same processing for SAT and SPLAT

keskitalo commented 1 year ago

From the requirements document (private repository): https://github.com/CMB-S4/Requirements2020

zonca commented 1 year ago

note to myself: the file is being updated in https://github.com/CMB-S4/s4sim/pull/29