icecube / event-generator

IceCube event generator
GNU General Public License v3.0
1 stars 2 forks source link

Draft: Enabling training on MCPE pulses #29

Open fschlueter opened 3 months ago

fschlueter commented 3 months ago

With this PR one can use event-generator to train models on MCPE pulses and hence use the generative models for simulations.

In addition to the new feature, this PR introduces a small number of small improvements. For example:

fschlueter commented 3 months ago

@mhuen I would be interested to know what is missing for potentially merging this. There are probably a few shortcoming in the current implementation. For example: I allow to read MCPE pulses from hdf5 files, the function which read data from i3 files (i.e. frames) are not compatible yet.

mhuen commented 3 months ago

I am also in the process of some larger changes in event-generator, including some necessary breaking changes. This is currently being collected in the branch CollectBreakingChanges. The reason I mention this here is because some of these features are also added there, such as the variable label key name. The MCPE stuff may have some further reaching implications, as the pulse type may differ when applying the model. So one would have to think that one through and see if it would work down the line and/or if there are additional places in the code that need adjustment. I do see that you added it as a mutable setting, which is good, but there may be other necessary modifications.

What I have done so far when training on MCPEs, was simply to save the MCPE as reco pulses in hdf5 format. Which is also the workaround I chose for other datatypes that didn't have an icetray hdf converter setup yet. In any case, I am in the process of implementing a number of updates. And since these will necessarily break compatibility to older models, I am also cleaning up the code a little bit, and removing old parts that were added for compatibility. Perhaps it's best to merge these changes to that branch? Then one can also clean up some of the stuff, when not having to worry about backwards compatibility

fschlueter commented 3 months ago

Just a note, we do successful train models on MCPEs. If you meant with changes "down the line" might be necessary to train them, this is working already (the question if the training can be as good as possible might be a different one).

fschlueter commented 3 months ago

In the context of larger changes. So far it seems that the asymmetric gaussians are hardcoded in many places. I would like to test a triple pandel function instead. I think the time pdf model could be made flexible but it is a more inversive change I think. Were you ever planing of doing this?

mhuen commented 3 months ago

It's more about once the model is exported and applied later on. Say you want to test running it on reco pulses or so. Does that work?

mhuen commented 3 months ago

In the context of larger changes. So far it seems that the asymmetric gaussians are hardcoded in many places. I would like to test a triple pandel function instead. I think the time pdf model could be made flexible but it is a more inversive change I think. Were you ever planing of doing this?

I was thinking about testing this. But yeah, this would involve a bit of restructuring of the models. I think the rest of the code should be modular enough that it doesn't matter for them. But the current base source model and derived classes assume the asymmetric Gaussians

mhuen commented 3 months ago

I think the base class is actually fine. It's just the utility functions (pdf, cdf) that assume it. But these are only needed for debugging purposes. Training and application of the model itself uses the get_tensors method and all this requires is a result tensor named pulse_pdf to which the evaluated PDF for each pulse/pe is written to. So one should be able to test this fairly easily, by simply adding a derived class that utilizes a different mixture basis

fschlueter commented 3 months ago

It's more about once the model is exported and applied later on. Say you want to test running it on reco pulses or so. Does that work?

What do you mean with running it on reco pulses? We have not done much with the models yet. Only started evaluating the prediction of the total charge against MC and photonics.

fschlueter commented 3 months ago

I think the base class is actually fine. It's just the utility functions (pdf, cdf) that assume it. But these are only needed for debugging purposes. Training and application of the model itself uses the get_tensors method and all this requires is a result tensor named pulse_pdf to which the evaluated PDF for each pulse/pe is written to. So one should be able to test this fairly easily, by simply adding a derived class that utilizes a different mixture basis

I will leave in a week to Greenland. I probably wont be able to do something in this direction before but I can have a look afterwards.

mhuen commented 3 months ago

Do you remember if the triple pandel function had an analytic CDF? Or at least if it's easy to evaluate with existing tensorflow functions? (We need to have the gradients)