scikit-hep / uproot5

ROOT I/O in pure Python and NumPy.
https://uproot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

writing std::vector in TTree output #257

Open dnadeau-lanl opened 3 years ago

dnadeau-lanl commented 3 years ago

We are converting an HDF5 file to a ROOT file using uproot3 with Python. We need to convert some data types that uproot3 does not support yet.

At this point we are thinking of using PyROOT or writing a C++ program to handle "unsigned integers" and "vector".

Do you plan to add functionality to uproot4 to write "unsigned integer" and "vector" to a ROOT file? Do you think this would be available in this spring's release?

jpivarski commented 3 years ago

Writing ROOT files (at all) is a goal for this spring. Unsigned integers will definitely be included, as will jagged arrays, meaning a variable number of integers per entry. These are also available in Uproot 3: documentation on writing basic TTrees and writing jagged arrays, unfortunately documented only by PR #477.
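
To make "a variable number of integers per entry" concrete, here is a minimal sketch of one common way to hold jagged data: a list of per-entry counts plus one flat list of values. The helper names `flatten_jagged` and `rebuild_jagged` are hypothetical and are not part of Uproot; this only illustrates the data structure, not how any library writes it.

```python
from itertools import accumulate


def flatten_jagged(entries):
    """Split a list of variable-length entries into (counts, flat content)."""
    counts = [len(entry) for entry in entries]
    content = [value for entry in entries for value in entry]
    return counts, content


def rebuild_jagged(counts, content):
    """Invert flatten_jagged using cumulative offsets into the flat content."""
    offsets = [0, *accumulate(counts)]
    return [content[start:stop] for start, stop in zip(offsets, offsets[1:])]


# Three entries holding 3, 0, and 2 unsigned integers respectively.
entries = [[1, 2, 3], [], [4, 5]]
counts, content = flatten_jagged(entries)
restored = rebuild_jagged(counts, content)
```

The counts play the role of the counter that tells a reader where each entry's values start and stop in the flat data.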

However, if you really need the data type to be std::vector<unsigned int>, rather than unsigned int[] (which is variable-length), that isn't planned. They differ in that std::vector<unsigned int> has an extra 10 bytes per entry that would have to be understood and written correctly (4 of those bytes are the length of the vector, but the other 6 are a kind of header). If your goal is to have accessible jagged array data in a ROOT file, then unsigned int[] would be sufficient. If your project actually needs std::vector<unsigned int>, then we'll have to look at that.
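
The 10-byte difference can be sketched with Python's `struct` module. This is an illustration under stated assumptions, not Uproot's writer: it assumes the 6-byte header is a 4-byte byte count (with a flag bit set) followed by a 2-byte version, which is how ROOT usually frames serialized objects, and it assumes big-endian values. The version number and the function names are hypothetical placeholders.

```python
import struct

# Assumption: flag bit that marks a 4-byte field as a byte count.
K_BYTE_COUNT_MASK = 0x40000000
# Hypothetical placeholder; the real version value is not given in this thread.
ASSUMED_VERSION = 9


def encode_plain_array(values):
    """One entry of unsigned int[]: just the big-endian 32-bit values."""
    return struct.pack(f">{len(values)}I", *values)


def encode_vector_entry(values):
    """One entry of std::vector<unsigned int>: header + length + values."""
    payload = struct.pack(">i", len(values)) + encode_plain_array(values)
    # The byte count covers everything after the count field itself:
    # the 2-byte version plus the payload.
    byte_count = 2 + len(payload)
    header = struct.pack(">IH", byte_count | K_BYTE_COUNT_MASK, ASSUMED_VERSION)
    return header + payload


values = [1, 2, 3]
array_bytes = encode_plain_array(values)
vector_bytes = encode_vector_entry(values)
extra_bytes = len(vector_bytes) - len(array_bytes)  # 10 under these assumptions
```

Whatever the exact header contents turn out to be, the point from the comment above holds: every entry carries 6 header bytes plus a 4-byte length on top of the values themselves.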

jpivarski commented 3 years ago

In addition to Uproot 3, another option that exists right now is RDataFrame in PyROOT:

  1. install ROOT with the Python bindings
  2. open Python and import ROOT
  3. create an RNumpyDS to point to NumPy arrays derived from h5py (sorry that I can't give more detailed instructions on that—I haven't done it)
  4. create an RDataFrame (lots of tutorials on this)
  5. its snapshot method writes to ROOT files.

lindsaybolz commented 3 years ago

I am also working on something like this. I am in need of the vector<unsigned int> type. Is it possible to get this on the list of to-do items?

jpivarski commented 3 years ago

Okay, I'll look into the std::vector headers when I work on file-writing.

dnadeau-lanl commented 3 years ago

@jpivarski Do you know if std::vector was added?

jpivarski commented 3 years ago

Development of Uproot 4 writing (in general) is stalled while I get through all the summer conferences/workshops/tutorials. It is still my top-priority programming project. Updates are here:

https://github.com/scikit-hep/uproot4/discussions/321