cta-observatory / cta-lstchain

LST prototype testbench chain
https://cta-observatory.github.io/cta-lstchain/
BSD 3-Clause "New" or "Revised" License

`add_disp_parameters_to_table` copies whole dl1 parameters data 7 times in python for loop #670

Open maxnoe opened 3 years ago

maxnoe commented 3 years ago

https://github.com/cta-observatory/cta-lstchain/blob/2d2d93e4d9c49f6f5d6f48e9de8d0316191cea89/lstchain/reco/r0_to_dl1.py#L601

Because this function adds columns to the existing table, it copies the whole DL1 parameters data. That probably also results in larger HDF5 files if HDF5 does not reclaim the space.

It even does this in a Python for-loop over all rows in the table:

https://github.com/cta-observatory/cta-lstchain/blob/c741c9d89ffe20f3a47d5c3f7addf582f26b95b2/lstchain/io/io.py#L808

This has to be very slow.

maxnoe commented 3 years ago

This is much slower than actually computing these parameters in the event loop and letting the HDFWriter write them directly.

maxnoe commented 3 years ago

I understood that these values are computed outside the event loop for efficiency.

But then, when storing them, the code loops over every single event 7 more times, constructing Python objects from the NumPy arrays!
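To make the difference concrete, here is a rough sketch in plain NumPy. It is not the lstchain or PyTables code; the helper names and the columns `intensity`, `disp_dx` and `disp_dy` are made up for the illustration. It only contrasts a per-event Python loop with assigning an already computed array as a whole column.

```python
import numpy as np


def fill_row_by_row(table, name, values):
    """Slow pattern: one Python-level assignment per event."""
    for i in range(len(table)):
        # Each iteration pulls a single element out of the array
        # and writes it back through the Python interpreter.
        table[name][i] = values[i]


def fill_whole_column(table, name, values):
    """Fast pattern: hand the full array to NumPy in one assignment."""
    table[name] = values


# Hypothetical stand-in for the DL1 parameters table.
n_events = 1000
rng = np.random.default_rng(0)
dtype = [("intensity", "f8"), ("disp_dx", "f8"), ("disp_dy", "f8")]
slow = np.zeros(n_events, dtype=dtype)
fast = np.zeros(n_events, dtype=dtype)

# Values that were already computed outside the event loop.
computed = {
    "disp_dx": rng.normal(size=n_events),
    "disp_dy": rng.normal(size=n_events),
}

for name, values in computed.items():
    fill_row_by_row(slow, name, values)
    fill_whole_column(fast, name, values)
```

Both tables end up with the same content; the second variant just avoids touching every event from Python once per column.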

moralejo commented 3 years ago

Can a single table be made with the 7 columns and then added as a whole to the existing table?
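A minimal sketch of that idea, using NumPy structured arrays as a stand-in for the HDF5 table. The function `append_columns_at_once` and the seven column names are placeholders for this example and may not match what lstchain uses; how the result would be written back to the file is left out.

```python
import numpy as np


def append_columns_at_once(table, new_columns):
    """Return a copy of `table` with all `new_columns` added.

    The existing data is copied exactly once, no matter how many
    columns are added, and there is no loop over rows.
    """
    # Extend the existing dtype with one field per new column.
    new_dtype = table.dtype.descr + [
        (name, np.asarray(col).dtype.str) for name, col in new_columns.items()
    ]
    out = np.empty(len(table), dtype=new_dtype)

    # Copy the old columns, then the new ones, column by column.
    for name in table.dtype.names:
        out[name] = table[name]
    for name, col in new_columns.items():
        out[name] = col
    return out


# Hypothetical existing parameters table.
n_events = 500
rng = np.random.default_rng(1)
params = np.zeros(n_events, dtype=[("intensity", "f8"), ("width", "f8")])
params["intensity"] = rng.uniform(50, 5000, size=n_events)

# Seven placeholder columns, all computed outside the event loop.
names = [
    "disp_dx", "disp_dy", "disp_norm", "disp_angle",
    "disp_sign", "src_x", "src_y",
]
new_columns = {name: rng.normal(size=n_events) for name in names}

merged = append_columns_at_once(params, new_columns)
```

Compared with adding the columns one at a time, this rewrites the existing data once instead of seven times.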