cta-observatory / protopipe

Prototype data analysis pipeline for the Cherenkov Telescope Array Observatory
https://protopipe.readthedocs.io/en/latest/

Comparison with EventDisplay : Modeling #89

Open HealthyPear opened 3 years ago

HealthyPear commented 3 years ago

This is a sub-summary issue, part of #85.

This part of the pipeline is more complex and structured than the others because it takes into account both energy and classification. It will be subdivided later on.

Requirements

Reference documents

Known sources of possible divergence and current status:

HealthyPear commented 3 years ago

@GernotMaier: regarding the multiplicity cut, there seems to be a contradiction between what you said here (at the bottom of the discussion) and what is written in the IRF report (point 1, page 22).

In the report, you say that you keep multiplicity >= 2 throughout the analysis, but in the pyirf issue you said that all model training is done after the multiplicity cuts are applied, which in the context of pyirf (aka your cut optimization) means 4. Do you also use 4 (3 for the 100 s and 30 min exposure cases) for the training?

GernotMaier commented 3 years ago

The IRF report is from 2017; the cuts and several analysis steps have changed slightly since then. Multiplicity cuts are applied at the training stage.

Whether it is 2, 3, or 4 really depends on the files you are looking at. I usually prepare IRFs for all three cuts, but for most applications I then use the 3-tel cut (which provides the best balance between sensitivity and resolution).

HealthyPear commented 3 years ago

I see.

So you keep N_tel_reco_direction = 2 up to the model training; then, depending on the analysis/IRF you want to produce, you choose, let's say, N_tel_reco_training = 3 (or 4), and from there on you keep N_tel_reco_cuts = N_tel_reco_training.

Am I correct?

GernotMaier commented 3 years ago

Multiplicity doesn't influence any of the steps for the BDT training, so assumption is applied before this step on multiplicity. From the BDT training onwards, I keep the multiplicity the same through all analysis steps.
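
For readers following the thread, the multiplicity handling discussed above can be sketched roughly as below. This is only an illustration of the reading proposed earlier in the discussion (minimum of 2 telescopes for direction reconstruction, then one fixed multiplicity from the model training onwards). It is not code from protopipe, EventDisplay, or pyirf: the function `select_by_multiplicity`, the key `n_tel_reco`, the constant `N_TEL_ANALYSIS`, and the event values are all hypothetical.

```python
def select_by_multiplicity(events, min_tels):
    """Keep only events reconstructed with at least `min_tels` telescopes."""
    return [ev for ev in events if ev["n_tel_reco"] >= min_tels]


# Hypothetical events: only the telescope multiplicity matters here.
multiplicities = [2, 3, 4, 2, 5, 3, 1]
events = [{"event_id": i, "n_tel_reco": n} for i, n in enumerate(multiplicities)]

# Direction reconstruction needs at least 2 telescopes (stereo).
reconstructed = select_by_multiplicity(events, 2)

# Assumption: one multiplicity is chosen per analysis (2, 3, or 4)
# and reused unchanged from the model training onwards.
N_TEL_ANALYSIS = 3
training_sample = select_by_multiplicity(reconstructed, N_TEL_ANALYSIS)
irf_sample = select_by_multiplicity(reconstructed, N_TEL_ANALYSIS)
```

Using a single constant for both the training and the later selection makes it harder for the two stages to drift apart, which is the consistency being asked about in this thread.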