icecube / flarestack

Unbinned likelihood analysis code for astroparticle physics datasets
https://flarestack.readthedocs.io/en/latest/?badge=latest
MIT License
Issues with NorthernTracks #245

Closed: mlincett closed this issue 1 year ago

mlincett commented 1 year ago

It seems the performance of flarestack with NorthernTracks (NT) v5.1 is poor, at least when using "custom_source_box" for the time PDF.

A strong n_s bias results in a significant underestimation of the injected flux, so the sensitivity estimate is off (a factor of two lower compared to PSTracks).

Federica's AGN cores analysis used the 8yr NT dataset and did not reveal such a problem. Further investigation is needed.

mlincett commented 1 year ago

Some more findings:

mlincett commented 1 year ago

It seems that the standard_matrix LLH is not affected by the issue, at least for steady sources.

robertdstein commented 1 year ago

One possible reason: are the sources you are testing overlapping or nearby ones? If I recall correctly, there are scenarios where the underlying assumptions of the standard LLH are not valid, so it should not be used in all situations.

mlincett commented 1 year ago

> One possible reason: are the sources you are testing overlapping or nearby ones? If I recall correctly, there are scenarios where the underlying assumptions of the standard LLH are not valid, so it should not be used in all situations.

@JannisNe also suggested this. I have a single pair of sources separated by ~1.5 deg. Shall we think about a safety threshold below which flarestack emits a warning about this?

robertdstein commented 1 year ago

I would be happy to do that; the only problem is that there's not really a clear threshold where things break down. I guess it is when a neutrino is likely to contribute to two sources, but that is a grey area depending on the localisation of the neutrino. It will perhaps also be hemisphere-dependent, since southern events are all well-localised.
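For illustration only, a warning of this kind could be sketched as a pairwise angular-separation check on the catalogue. The function names and the 2 deg default below are hypothetical and not part of flarestack; as noted above, there is no clear threshold, so the value would have to be tuned (and might need to depend on declination).

```python
import warnings

import numpy as np


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays numerically stable for small separations.
    a = (
        np.sin((dec2 - dec1) / 2.0) ** 2
        + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2.0) ** 2
    )
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))


def close_source_pairs(ra_deg, dec_deg, threshold_deg=2.0):
    """Return (i, j, separation) for every source pair closer than the threshold.

    threshold_deg is an assumed placeholder, not a validated value.
    """
    ra = np.asarray(ra_deg, dtype=float)
    dec = np.asarray(dec_deg, dtype=float)
    pairs = []
    for i in range(len(ra)):
        for j in range(i + 1, len(ra)):
            sep = float(angular_separation_deg(ra[i], dec[i], ra[j], dec[j]))
            if sep < threshold_deg:
                pairs.append((i, j, sep))
    if pairs:
        warnings.warn(
            f"{len(pairs)} source pair(s) closer than {threshold_deg} deg; "
            "the standard LLH assumptions may not hold."
        )
    return pairs
```

A pair like the one mentioned above (~1.5 deg apart) would be flagged with the default threshold, while well-separated sources would pass silently.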

mlincett commented 1 year ago

I have run some more benchmarks.

It seems that NT5.1 performs very well with a catalogue of about 65 steady sources but severely underestimates n_s when using custom_source_box as the time PDF.

Cross-checking this with NT2.5 is not straightforward, since this catalogue extends beyond the end of the NT2.5 data. I can try shifting the time windows back in time, although older NT2.5 analyses also used a steady time PDF.

In the meantime, please shout if you are aware of anything that may affect time-box fits.

robertdstein commented 1 year ago

I guess there might be a problem in time-box fits? I tested them with PS Tracks, where things worked well, but that's in the "use data as background" regime. This is the "MC as background" regime, so I can imagine there might be some weird/unanticipated quirks there. I believe nobody has ever tried the configuration you are describing, so it has not been tested before.

My suggestion would be to disregard the stacking and test the behaviour for a single source and season using a window that starts out as long as the full season and then gets progressively shorter. If that looks fine, then stacking is your problem. On the other hand, if things break already, it's to do with the time PDF stuff.
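As a rough sketch of that scan, the test windows could be generated like this. The helper name, the halving factor and the choice to centre every window on the season midpoint are all assumptions made for illustration; none of this is flarestack API.

```python
import numpy as np


def shrinking_windows(season_start_mjd, season_end_mjd, n_steps=5, factor=0.5):
    """Return (start, end) MJD pairs for progressively shorter time boxes.

    The first window spans the full season; each following window is
    `factor` times the length of the previous one, centred on the
    season midpoint (an arbitrary choice for this sketch).
    """
    if not season_end_mjd > season_start_mjd:
        raise ValueError("season_end_mjd must be later than season_start_mjd")
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must lie strictly between 0 and 1")
    mid = 0.5 * (season_start_mjd + season_end_mjd)
    full_length = season_end_mjd - season_start_mjd
    lengths = full_length * factor ** np.arange(n_steps)
    return [(float(mid - 0.5 * l), float(mid + 0.5 * l)) for l in lengths]


# Example with made-up season boundaries (MJD).
for start, end in shrinking_windows(58000.0, 58400.0, n_steps=4):
    print(f"window: {start:.1f} to {end:.1f} ({end - start:.1f} d)")
```

Each window would then be used as the start and end time of the single test source, comparing the fitted n_s with the injected value at every step to see where the bias sets in.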

robertdstein commented 1 year ago

(And also, as a sanity check, make sure that a custom source box of length=1 season matches the steady sources case. They should be identical, but maybe there is a mistake.)
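To spell out what "identical" means here, a toy version of the check is sketched below. These are simplified stand-in PDFs written for this comment, not flarestack's time PDF classes, and they assume continuous livetime over the season (no detector downtime).

```python
import numpy as np


def box_time_pdf(t_mjd, start_mjd, end_mjd):
    """Toy box PDF: uniform inside [start, end], zero outside."""
    t = np.asarray(t_mjd, dtype=float)
    inside = (t >= start_mjd) & (t <= end_mjd)
    return np.where(inside, 1.0 / (end_mjd - start_mjd), 0.0)


def steady_time_pdf(t_mjd, season_start_mjd, season_end_mjd):
    """Toy steady PDF: uniform over the season for event times within it."""
    t = np.asarray(t_mjd, dtype=float)
    return np.full_like(t, 1.0 / (season_end_mjd - season_start_mjd))


# Made-up season boundaries and event times (MJD), all inside the season.
season_start, season_end = 58000.0, 58400.0
event_times = np.linspace(season_start, season_end, 9)

# A box spanning the whole season should reproduce the steady case exactly.
assert np.allclose(
    box_time_pdf(event_times, season_start, season_end),
    steady_time_pdf(event_times, season_start, season_end),
)
```

The same comparison on the real analysis would be made on the fitted n_s and sensitivity rather than on the PDF values alone.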

mlincett commented 1 year ago

Thanks to @JannisNe we identified the problem as the missing simulation of event times in Monte Carlo.

I am working on fixing this. Will move the discussion to a PR or a new issue if necessary.
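To make the idea concrete, one possible way to assign times to MC events is to draw them uniformly over the detector livetime. The sketch below is only that: the function name and the run start/stop arrays are hypothetical, and the eventual fix in flarestack may look different.

```python
import numpy as np


def simulate_mc_times(n_events, run_starts_mjd, run_stops_mjd, rng=None):
    """Draw one event time (MJD) per MC event, uniform over the livetime.

    run_starts_mjd / run_stops_mjd are hypothetical arrays describing the
    good-run intervals of a season. `rng` may be None, an integer seed or
    a numpy Generator.
    """
    rng = np.random.default_rng(rng)
    starts = np.asarray(run_starts_mjd, dtype=float)
    stops = np.asarray(run_stops_mjd, dtype=float)
    durations = stops - starts
    if np.any(durations <= 0):
        raise ValueError("every run must stop after it starts")
    # Choose a run for each event with probability proportional to its length,
    # then place the event uniformly inside that run.
    run_idx = rng.choice(len(starts), size=n_events, p=durations / durations.sum())
    return starts[run_idx] + rng.uniform(0.0, 1.0, size=n_events) * durations[run_idx]


# Example with two made-up runs separated by a gap.
times = simulate_mc_times(5, [58000.0, 58010.0], [58005.0, 58012.0], rng=0)
print(times)
```

With times assigned this way, MC events fall inside a source's time box in proportion to the livetime overlap, instead of having no usable time information.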