project8 / katydid

Project 8 data analysis package

Sharing root files #71

Closed guiguem closed 7 years ago

guiguem commented 8 years ago

It seems that the root-tree-writer and the basic-root-writer cannot share ROOT files (for example, creating one with the basic writer and then updating it with the tree writer). These two classes should be merged so that there is only one way to interact with a ROOT file...

nsoblath commented 8 years ago

In principle these writers should not be merged, because they operate in different ways. The basic-root-writer's philosophy is to dump a bunch of whatever type of object (e.g. TH1D's for power spectra) to a ROOT file. The root-tree-writer's philosophy is to create a tree for each data type, and to fill it.

My initial thought for adding this feature is to create a base class for ROOT-based writers that has a common file pointer. I haven't thought through all the potential complications, but it seems like the best path forward so far.
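To make the base-class idea concrete, here is a minimal sketch of the pattern. It is written in Python purely for illustration and is not the katydid implementation (katydid is C++). The names `SharedFileRegistry`, `RootWriterBase`, `BasicWriter`, and `TreeWriter` are hypothetical, and a plain dict stands in for a real ROOT file pointer.

```python
import threading


class SharedFileRegistry:
    """Hands out one shared handle per filename (stand-in for a ROOT file pointer)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._handles = {}

    def get(self, filename):
        # Create the handle on first request; later requests reuse the same object.
        with self._lock:
            if filename not in self._handles:
                self._handles[filename] = {"name": filename, "objects": []}
            return self._handles[filename]


class RootWriterBase:
    """Hypothetical common base: every writer looks up its file through one registry."""

    _registry = SharedFileRegistry()
    _write_lock = threading.Lock()

    def __init__(self, filename):
        self._file = self._registry.get(filename)

    def _store(self, obj):
        # Serialize writes so writers running in parallel do not interleave.
        with self._write_lock:
            self._file["objects"].append(obj)


class BasicWriter(RootWriterBase):
    """Dumps individual objects (e.g. histograms) into the file."""

    def write_histogram(self, name):
        self._store(("hist", name))


class TreeWriter(RootWriterBase):
    """Fills a named tree, one entry at a time."""

    def fill_tree(self, tree_name, entry):
        self._store(("tree", tree_name, entry))


basic = BasicWriter("output.root")
tree = TreeWriter("output.root")
basic.write_histogram("power_spectrum")
tree.fill_tree("tracks", {"start_time": 0.1})
shared = RootWriterBase._registry.get("output.root")
```

The two writers keep their different ways of writing, and only the file lookup is common, so giving both the same filename yields the same underlying handle.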

guiguem commented 8 years ago

A processor that merges ROOT files at the end of the execution of a config file might also do the job. ROOT already knows how to do that: hadd compiled.root file1.root file2.root will do the job, but this might become tedious when a lot of files are generated.

laroque commented 8 years ago

I thought about a related issue some time ago. When a run produces data in many files, there's no need to keep those files distinct other than convenience, so combining the outputs is helpful.

I don't see why it should become tedious; it is exactly the sort of repeated task that should be easy to make into a one-liner or wrap in a script.

It seems like ultimately the Dirac run automation should include a cleanup step that does an hadd of this sort. Ideally it would apply wherever the automation generated a list of files itself. For example: I ask for runs 6-10, it finds 50 concatenated mat files for each run, katydid processes each concatenated file, hadd combines the 50 outputs for each run, and I end up with root_outputrun[6-10].root.
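As a rough sketch of what such a cleanup step could look like, the script below groups output files by run number and builds one hadd command per run. The input naming scheme (`run<N>_part<M>.root`) and the helper names `group_by_run` and `hadd_commands` are assumptions made for illustration; only the `hadd target source1 source2 ...` form and the `root_outputrun<N>.root` output name come from the discussion above.

```python
import re
from collections import defaultdict


def group_by_run(paths):
    """Group file paths by run number, assuming names like run6_part0.root."""
    pattern = re.compile(r"run(\d+)_")
    groups = defaultdict(list)
    for path in paths:
        match = pattern.search(path)
        if match:  # files that do not follow the assumed scheme are skipped
            groups[int(match.group(1))].append(path)
    return dict(groups)


def hadd_commands(paths):
    """Build one 'hadd <target> <sources...>' argument list per run."""
    commands = []
    for run, files in sorted(group_by_run(paths).items()):
        commands.append(["hadd", f"root_outputrun{run}.root"] + sorted(files))
    return commands


example = ["run6_part1.root", "run6_part0.root", "run7_part0.root"]
for cmd in hadd_commands(example):
    # To actually merge, pass cmd to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Building the commands as argument lists keeps the grouping logic testable without ROOT installed, and avoids shell quoting problems when the list is later handed to `subprocess.run`.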

This is very similar to what you're trying (in that it reduces the number of output files) but is also orthogonal (since it combines outputs of the same type from different input data, vs combining the outputs of different types for the same data).

guiguem commented 8 years ago

Indeed! Is there a Dirac issues tracker where you could post that for the Dirac part? :)

nsoblath commented 8 years ago

We do not yet have a good place to track this kind of issue.

laroque commented 8 years ago

It appears that ladybug is not being used. That seems like it would be the right place if there is a reasonable way to tie it to work actually being done by Vikas/Malachi.

nsoblath commented 7 years ago

ROOT files can now be shared between ROOT-based writers. This feature is in the develop branch as of 0543857974e8942d9212d79ff9a82d7a8c1cf673.

To use this feature, simply supply the same filename to the relevant writers. This feature is thread-safe, so it can even be used between writers in parallel processor paths.