Open mmaneyro opened 1 year ago
Dear @mmaneyro ,
Thank you for your report. I take it that Event.NParticles is a sub-branch of the Event branch. What you describe is not really surprising, as Redefine is meant to substitute the values of the full column of the RDataFrame (column==branch). The difference in behaviour between non-jitted and jitted code is more surprising though. As a fast workaround, you could be more explicit about the columns you want to save in your output TTree by adding the list of column names to the Snapshot call:
auto snap = df2.Snapshot("LHEF", "out_snapshot.root", {"Event.NParticles"});
In order for me to better reproduce your problem though, I believe I would also need some instructions on how to generate the dictionaries for the classes in your file. Meanwhile, I can try to come up with a simpler reproducer, but having also your scenario would help.
Cheers, Vincenzo
Dear Vincenzo,
I have already managed to work with the redefined trees I need, just with a number of workarounds.
The tree files in this case are generated from Les Houches event files using the ExRootLHEFConverter from ExRootAnalysis. As such, the branches are custom classes, which can be found in the ExRootAnalysis source files. I can't actually snapshot individual columns without getting an error, as there are TClonesArray column headers which specify the structure of the branches. The obvious fix would be to snapshot the column plus the header, but that also gives me an error.
I understand that Redefine is ideally used for columns, however I need to be able to apply different redefinitions to different leaves within a branch. Do RDataFrames just not support rewriting leaves/nested columns? The columns seem to actually be doing what I'd like before snapshotting.
It seems like there's not a simple solution where I get to benefit from using RDataFrame and keep the tree structure untouched. I need to be able to add rows of data to each entry within a leaf (I'm actually concatenating multiple trees), and TTrees don't allow this as far as I can see. I guess I could define a new TTree by hand, set up the branches and fill new arrays from my original trees with the redefinitions I need (just by iterating over every entry and data value). But then I'm still changing my TTree structure, as with Snapshot. Maybe next time I'll just start by rewriting ExRootLHEFConverter to take the data from two .lhe files, or just stick to TTrees, but to be fair this project has been my first attempt at using ROOT/C++. You code, you learn!
What I am trying to do may be a bit of a niche use case, but I hope some of what I wrote is useful to you.
Regards, Marina
Description
Behavior: Snapshot warns that an illegally named column will be renamed when writing to file. The column then appears twice in the output, once under the new name and once under the original. Renamed leaves now appear outside of their original branch.
Expected behavior: Only the renamed column appears in the saved tree, respecting the original tree structure.
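To make the renaming in the warning concrete, here is a minimal standalone sketch. It assumes that the rename replaces each '.' in the column name with '_' (so Event.NParticles would become Event_NParticles), which is my understanding of what the warning describes. The helper names SanitizeColumnName and ColumnsThatWouldBeRenamed are hypothetical and are not part of the ROOT API; this is not ROOT's implementation, only an illustration of which columns would be affected.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not a ROOT function): returns the name a column
// would get if every '.' in it were replaced by '_'.
// Assumption: this mirrors the renaming that the Snapshot warning refers to.
std::string SanitizeColumnName(std::string name)
{
   std::replace(name.begin(), name.end(), '.', '_');
   return name;
}

// Hypothetical helper: returns the subset of columns whose name would
// change under the rule above, i.e. the ones expected to trigger the warning.
std::vector<std::string> ColumnsThatWouldBeRenamed(const std::vector<std::string> &columns)
{
   std::vector<std::string> renamed;
   for (const auto &c : columns) {
      if (SanitizeColumnName(c) != c)
         renamed.push_back(c);
   }
   return renamed;
}
```

Under that assumption, a nested name such as Event.NParticles is flagged for renaming while a top-level name such as Event is left alone, which matches the report that the renamed leaf ends up outside its original branch.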
Reproducer
pp_3j_LO_H_T_2_35GeV.root.tar.gz
ROOT version
ROOT 6.28/00
Installation method
built from source
Operating system
Linux Mint 21.1 Cinnamon