Open Zehvogel opened 9 months ago
This looks like a ROOT issue, where the minimal reproducer should effectively be to write a branch with a std::array. It could also be that it fails only if the std::array is part of a struct, as that is what it is in our case. (And then the branch is actually a vector and not just single elements.)
Looks like a reproducer is not so easy to produce...
I made a struct
struct test {
std::array<float, 6> arr;
ROOT::VecOps::RVec<float> vec;
ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>> vecvec;
std::array<std::array<float, 6>,1> arrarr;
ROOT::VecOps::RVec<std::array<float,6>> vecarr;
};
and use it like this:
import ROOT
df = ROOT.RDataFrame(10)
code = """
struct test {
std::array<float, 6> arr;
ROOT::VecOps::RVec<float> vec;
ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>> vecvec;
std::array<std::array<float, 6>,1> arrarr;
//ROOT::VecOps::RVec<std::array<float,6>> vecarr;
};
"""
ROOT.gInterpreter.Declare(code)
df = df.Define("test", "test()")
print(df.Describe())
df.Snapshot("test_tree", "test_file.root")
df2 = ROOT.RDataFrame("test_tree", "test_file.root")
print(df2.Describe())
returning:
[---]
Column Type Origin
------ ---- ------
test test Define
[---]
Column Type Origin
------ ---- ------
arr[6] array<float,6> Dataset
arrarr[1][6] array<array<float,6>,1> Dataset
test test Dataset
test.arr[6] ROOT::VecOps::RVec<Float_t> Dataset
test.arrarr[1][6] ROOT::VecOps::RVec<Float_t> Dataset
test.vec ROOT::VecOps::RVec<float> Dataset
test.vecvec ROOT::VecOps::RVec<ROOT::VecOps::RVec<float> > Dataset
vec ROOT::VecOps::RVec<float> Dataset
vecvec ROOT::VecOps::RVec<ROOT::VecOps::RVec<float> > Dataset
[---]
Uncommenting the vecarr member fails when writing to disk, complaining about not having a dictionary:
Error in <TStreamerInfo::Build>: The class "test" is interpreted and for its data member "vecarr" we do not have a dictionary for the collection "ROOT::VecOps::RVec<array<float,6> >". Because of this, we will not be able to read or write this data member.
Ah, sorry. That was a bit ambiguous from my side. What we write is more like this
struct test {
std::array<float, 6> values;
};
std::vector<test> data;
So probably something along the lines of
import ROOT
ROOT.gInterpreter.Declare("""
#include <array>
#include <vector>
struct test {
std::array<float, 6> values{};
};
std::vector<test> makeTestData() {
return std::vector<test>(10);
}
""")
# all the rest
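For reference, a minimal sketch of what "all the rest" could look like, reusing the Define / Snapshot / Describe pattern from the first reproducer. The column name test_col is taken from the error message quoted further down; the helper name run_reproducer, the tree name and the file name are assumptions made for illustration. As reported in the next reply, the Snapshot step is expected to complain about a missing dictionary for vector<test>.

```python
# C++ declarations from the snippet above, kept in one string so they can be
# handed to the interpreter in a single Declare call.
CPP_CODE = """
#include <array>
#include <vector>
struct test {
  std::array<float, 6> values{};
};
std::vector<test> makeTestData() {
  return std::vector<test>(10);
}
"""


def run_reproducer(tree_name="test_tree", file_name="test_file.root"):
    """Write a vector<test> column to disk and describe the result.

    Hypothetical helper: ROOT is imported lazily so this module can be
    loaded on a machine without ROOT installed.
    """
    import ROOT

    ROOT.gInterpreter.Declare(CPP_CODE)
    # One vector<test> per entry, ten entries in total.
    df = ROOT.RDataFrame(10).Define("test_col", "makeTestData()")
    df.Snapshot(tree_name, file_name)
    # Re-open the written file to see how the branch is reported.
    df2 = ROOT.RDataFrame(tree_name, file_name)
    return df2.Describe()
```

Calling print(run_reproducer()) in an environment with ROOT set up runs the whole chain.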
If you want to make it slightly more complicated, this would also be allowed in a podio-generated EDM:
struct ArrayStruct {
std::array<int, 42> arr;
};
struct SomeData {
ArrayStruct a{};
};
Ok, thx. With your construct (the simpler one) it still complains about not having a dictionary but in another way.
Error in <TTree::Branch>: The class requested (vector<test>) for the branch "test_col" is an instance of an stl collection and does not have a compiled CollectionProxy. Please generate the dictionary for this collection (vector<test>) to avoid to write corrupted data.
Ah right, you will probably have to create a dictionary for at least test then. I am not sure if there is an easy way to get that done via the python bindings, or whether it is necessary to bring in some heavier (CMake) machinery.
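One option that might work from the python side, as an untested sketch: put the struct in a header file and ask the interpreter to build the dictionaries with TInterpreter::GenerateDictionary. The helper names write_header and generate_dictionary and the header name test_struct.h are made up for illustration. Whether this is enough to make the std::array member readable afterwards has not been checked here.

```python
import os
import tempfile

# Struct definition from the discussion above, as the content of a header.
HEADER = """\
#pragma once
#include <array>
#include <vector>

struct test {
  std::array<float, 6> values{};
};
"""


def write_header(directory, name="test_struct.h"):
    """Write the struct definition to a header file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(HEADER)
    return path


def generate_dictionary(header_path):
    """Ask ROOT to build dictionaries for test and vector<test>.

    Assumption: GenerateDictionary takes a semicolon-separated class list
    and the header(s) to include. ROOT is imported lazily so that
    write_header stays usable without ROOT installed.
    """
    import ROOT

    return ROOT.gInterpreter.GenerateDictionary("test;vector<test>", header_path)


if __name__ == "__main__":
    header = write_header(tempfile.mkdtemp())
    generate_dictionary(header)
```

The dictionary would have to be generated before the Snapshot call, and the struct should then come from the header rather than from a separate Declare to avoid defining it twice.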
I have no idea either, guess I will have to postpone this for now as there is more important stuff to do...
@Zehvogel I'll file an issue at root-project/root and link here.
Back reference: https://github.com/root-project/root/issues/14790
I've had a look at this and it seems it's only an issue with .Describe(). Working with std::array works fine: accessing different indexes, creating new variables, filtering... So a storage type change is not needed.
This affects more than just .Describe(). It also prevents, for example, df.AsNumpy(['BuildUpVertices.covMatrix']). In fact, I think it prevents any access to values stored in a std::array from Python.
ROOT's RDataFrame .Describe() feature fails with edm4hep files containing vertex collections with the following error message:
It is maybe not our fault but I wanted to submit the issue here first.
source /cvmfs/sw-nightlies.hsf.org/key4hep/releases/2024-02-05/x86_64-almalinux9-gcc11.3.1-opt/key4hep-stack/2024-02-05-osrelo/setup.sh