cta-observatory / pyirf

Python IRF builder
https://pyirf.readthedocs.io/en/stable/
MIT License
15 stars 25 forks source link

optimize_gh_cut theta cut input isn't really used #260

Open kosack opened 12 months ago

kosack commented 12 months ago

Just a question: optimize_gh_cut asks for the theta cuts to be passed in and then applies them internally to get a cut mask, which is simply anded with the rest of the cuts, but the function doesn't otherwise doesn't need or optimize them. Wouldn't it be better to have the user apply them before calling the function like other cuts, which is more flexible?

I would expect this function to optimize hg cuts after any previously applied cuts, which could include theta. A future function could of course do both gh, multiplicity, and theta optimization, but that would have a different name.

maxnoe commented 12 months ago

The reason it works like this is the limited statistics of background events.

You don't estimate the background by just applying the theta cut around an assumed source position, since that would remove too many background events.

Instead, you count all background events and scale them to the size of the on region given by the theta cuts.

So the theta_cuts are needed by optimize_gh_cuts, they are not only applied to the signal mc events but also used for that scaling in the call to estimated_background here:

https://github.com/cta-observatory/pyirf/blob/c005ea06074c58d388ef4181d5d08634d308742a/pyirf/cut_optimization.py#L119-L126

maxnoe commented 12 months ago

If you have a better idea for how to do this, I'd be very open. This is how event display and I think also CTA-Mars handled this.

kosack commented 12 months ago

Yes, using a larger OFF region than ON is standard to keep statistics, though I think using all background events may not be a good idea, as you might get some biases from the ones near the edge of the FOV - usually you should still apply some radial cut, just larger. Of course, it's not so critical here since the real background IRF will come from data, not simulations.

But the main issue I'm mentioning is more about being explicit rather than implicit. I would expect to apply (or not) the theta cuts to signal and background before running the optimization, that way I have control over what is happening, and it's clear where cuts are happening. The only reason I would see to pass in the theta cuts separately to an optimization function would be if they are being optimized (but then they are an output not an input). [edit: like in #204 ]

maxnoe commented 12 months ago

though I think using all background events may not be a good idea,

It's not all background events, it's the events inside fox_offset_min, fov_offset_max.

usually you should still apply some radial cut, just larger.

That's, at least as far as I understood, not what EventDisplay did. They took cone sections in the FoV, not some larger radius around an assumed (or multiple assumed) source positions.

Of course what you propose would be possible, but this function was really based on the EventDisplay approach.

But the main issue I'm mentioning is more about being explicit rather than implicit. I would expect to apply (or not) the theta cuts to signal and background before running the optimization

How do you do that though? You need to count the events surviving the gamma / hadron cut your are optimizing and then scale that number to the on region size. That needs the on region size.