Reconstructed energy is written in logscale while true energy in linear scale

HealthyPear commented 3 years ago

This happens because we write to file the direct estimate from the energy regressor, and its scale is set by the model's target variable, which is log10_true_energy by default.

This creates 2 problems:

1. Classifier features related to reconstructed energy then need to be written as

log10_reco_energy: reco_energy # Averaged-estimated energy of the shower

which is horrible, because the column called reco_energy actually holds log10 values.

2. The benchmarking code is not flexible enough, so results can look wrong simply because cuts are applied in the wrong scale.
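To make the cut problem concrete, here is a minimal sketch with made-up numbers (the energies, the 1 TeV threshold and all variable names are assumptions for illustration, not protopipe code). It applies the same "E > 1 TeV" cut once to the log10 values as written to file and once after converting back to linear scale:

```python
import math

# Hypothetical reconstructed energies in TeV (linear scale), made up for illustration
reco_energy_tev = [0.05, 0.5, 2.0, 20.0, 150.0]

# What ends up in the file when the regressor target is log10_true_energy
written_values = [math.log10(e) for e in reco_energy_tev]

threshold_tev = 1.0  # cut meant as "E > 1 TeV"

# Cut applied directly to the written column, wrongly assuming it is linear
wrong_selection = [v for v in written_values if v > threshold_tev]

# Same cut applied after converting the written values back to linear scale
right_selection = [10**v for v in written_values if 10**v > threshold_tev]

print(len(wrong_selection), "events pass the cut in the wrong scale")
print(len(right_selection), "events pass the cut in the right scale")
```

The two selections differ even though the data and the threshold are identical, which is how a benchmark can look wrong without any problem in the model itself.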

kosack commented 3 years ago

I think the solution is to allow a transformation to normalize/re-scale the predicted variable. E.g. the predicted value should always be "energy" (not log10_energy), but you should have an option like:

transform: np.log10
inverse_transform: lambda p: 10**p

Then during training you call the transform so that all computations are done in log10_energy, and after predict you call the inverse transform to go back to energy. You could also include the scaling to TeV there, unless it is simply assumed that energies are in TeV.

So the sequence of steps is:

input training data → transform → train
input testing data → predict → inverse_transform → prediction
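The two sequences above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TransformedTarget` and `MeanRegressor` are hypothetical names invented here, the stand-in model just predicts the mean of its training targets, and the energies are made up.

```python
import math


class MeanRegressor:
    """Stand-in model (hypothetical): always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class TransformedTarget:
    """Hypothetical wrapper: train on transform(y), return inverse_transform(prediction)."""

    def __init__(self, model, transform, inverse_transform):
        self.model = model
        self.transform = transform
        self.inverse_transform = inverse_transform

    def fit(self, X, y):
        # input training data -> transform -> train
        self.model.fit(X, [self.transform(v) for v in y])
        return self

    def predict(self, X):
        # input testing data -> predict -> inverse_transform -> prediction
        return [self.inverse_transform(p) for p in self.model.predict(X)]


# Made-up true energies in TeV and dummy features
true_energy_tev = [0.1, 1.0, 10.0]
X = [[0.0], [1.0], [2.0]]

estimator = TransformedTarget(MeanRegressor(), math.log10, lambda p: 10**p)
estimator.fit(X, true_energy_tev)

# The model works internally in log10(energy); the caller only ever sees TeV
reco_energy_tev = estimator.predict(X)
```

With this layout the value written to file is always in linear scale, whatever the model uses internally. scikit-learn's `sklearn.compose.TransformedTargetRegressor` (with its `func` and `inverse_func` arguments) follows the same pattern; whether it fits this pipeline is not something this thread establishes.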

The same could even be used for the input data to the training, if you really want to be general, i.e. you could allow a column name + transform + inverse_transform for all variables (e.g. intensity → log10(intensity) → training). However, I guess the user-defined features already solve that problem, so it's probably only needed for the input/output parameter.
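For the more general variant, a per-column specification might look like the sketch below. The `column_transforms` layout, the `apply_transforms` helper and the event values are all assumptions made for illustration; nothing here is an existing protopipe interface.

```python
import math

# Hypothetical per-column spec: column name -> (transform, inverse_transform)
column_transforms = {
    "intensity": (math.log10, lambda p: 10**p),
}


def apply_transforms(table, spec):
    """Return a copy of `table` (dict of column lists) with the listed columns transformed."""
    out = dict(table)
    for name, (forward, _inverse) in spec.items():
        out[name] = [forward(v) for v in table[name]]
    return out


# Made-up events; only "intensity" is listed in the spec
events = {"intensity": [100.0, 1000.0], "width": [0.1, 0.2]}

# intensity -> log10(intensity) -> training; other columns pass through unchanged
training_input = apply_transforms(events, column_transforms)
```

Columns without an entry in the spec are passed through untouched, so the default behaviour stays the same as having no transform at all.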

HealthyPear commented 3 years ago

Just a small clarification: I only found this problem now because in the previous AdaBoost config the target was true_energy, not log10_true_energy, so the estimated value was always in linear scale. (I'm not sure whether this was also one of the reasons the resolution was bad before.)

cta-observatory / protopipe

Reconstructed energy is written in logscale while true energy in linear scale #139