HEPData / hepdata_lib

Library for getting your data into HEPData
https://hepdata-lib.readthedocs.io
MIT License

Output with strings instead of floats #264

Closed. kleney closed this issue 2 weeks ago

kleney commented 2 weeks ago

I want to include some Sankey diagrams (see attached example) from the recent ATLAS HH→multilepton analysis as HEPData, but I’m running into problems preparing this using the tools available, so I’m wondering if someone here can offer some words of wisdom or pointers in the right direction.

I’ve put all of the code + data in this CERNBox folder (https://cernbox.cern.ch/s/GxT1OreoeD0Uuif) in case it’s helpful to clarify what exactly I’ve done.

I made the original plots with R, so the easiest workaround seemed to be to convert the arrays used in my R dataframe into a TH2F using a small ROOT macro (Alluvial_to_TH2F.C). This makes some very ugly but functional ROOT plots (see other attachment) and writes the 2D histograms to a ROOT file (HEPDataAlluvials.root). I then followed the instructions in examples/reading_histograms.ipynb to prepare TH2F_to_HEPData.ipynb.

This runs and I get an output (e.g. alluvial_outputs/figure_4.yaml), but “HH decay mode” and “Analysis channels” have been converted to floats rather than the strings for each category that I need. How can I get these to be strings (e.g. "HH decay mode" with value = 0.5 should be "4W", "Analysis channel" with value = 4.5 should be "SS2l", etc.), other than by brute-force editing of the output file?

[Attachments: channel_flow_ml, ml_channels]

matthewfeickert commented 2 weeks ago

@clelange If you have time to look at this this week, feel free to tag me for questions in addition to @kleney.

GraemeWatt commented 2 weeks ago

In TH2F_to_HEPData.ipynb I think you just need to replace x with x_labels in the line:

hh_decay.values = ml_channels["x"]

and similarly replace y with y_labels in the line:

analysis_channel.values = ml_channels["y"]

I checked this works for me.
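
For anyone reading later, here is a minimal sketch of the idea in plain Python. It assumes the dictionary read from the TH2F carries "x_labels" and "y_labels" entries alongside the numeric "x" and "y" bin centres, as the replacement above implies. The helper name pick_axis_values and the sample values are hypothetical, and only "4W" and "SS2l" come from the question.

```python
def pick_axis_values(points, axis):
    """Return the bin labels for `axis` if present, else the numeric bin centres.

    `points` is assumed to look like the dictionary read from the 2D histogram:
    keys "x", "y", and optionally "x_labels", "y_labels".
    """
    labels = points.get(f"{axis}_labels")
    return list(labels) if labels else list(points[axis])


# Hypothetical stand-in for ml_channels; the real one comes from the ROOT file.
ml_channels = {
    "x": [0.5, 0.5],                       # numeric bin centres (floats)
    "y": [4.5, 5.5],
    "x_labels": ["4W", "4W"],              # bin labels (strings)
    "y_labels": ["SS2l", "placeholder"],   # "placeholder" is made up
    "z": [1.0, 2.0],
}

# These lists are what would be assigned to hh_decay.values and
# analysis_channel.values in the notebook.
hh_decay_values = pick_axis_values(ml_channels, "x")
analysis_channel_values = pick_axis_values(ml_channels, "y")
print(hh_decay_values, analysis_channel_values)
```

The fallback to the numeric centres only matters for histograms whose axes have no bin labels; for the labelled TH2F here, assigning the "x_labels" and "y_labels" entries directly is enough.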

kleney commented 2 weeks ago

Perfect! I can confirm it has solved my problem.

Thanks so much for your help!
Katharine