root-project / root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
https://root.cern
Other
2.7k stars 1.28k forks source link

[ntuple] Support all template parameters of STL containers #16754

Open hahnjo opened 2 weeks ago

hahnjo commented 2 weeks ago

As pointed out by ATLAS, it's currently not possible to use a std::vector with a non-default allocator in RNTuple:

#include <ROOT/RNTupleModel.hxx>

#include <scoped_allocator>
#include <vector>

void ntuple_vector_allocator() {
  auto model = ROOT::Experimental::RNTupleModel::CreateBare();
  model->MakeField<std::vector<int, std::scoped_allocator_adaptor<std::allocator<int>>>>("v");
}

leads to

Processing ntuple_vector_allocator.C...
terminate called after throwing an instance of 'ROOT::Experimental::RException'
  what():  vector<int,scoped_allocator_adaptor<allocator<int> > > is not supported
At:
  ROOT::Experimental::RClassField::RClassField(std::string_view, std::string_view, TClass *) [/home/jhahnfel/ROOT/src/tree/ntuple/v7/src/RField.cxx:1807]

Aborted (core dumped)

This is because RField is only partially specialized for std::vector<ItemT>, so a non-default Allocator falls back to the default RField declaration (which assumes it's a class and checks that it's not in std namespace).

Note that in principle, this affects all STL containers. For many of them, there are other template parameters, for example Hash for std::unordered_set:

#include <ROOT/RNTupleModel.hxx>

#include <unordered_set>

struct IntHash : public std::hash<int> {};

void ntuple_unordered_set_hash() {
  auto model = ROOT::Experimental::RNTupleModel::CreateBare();
  model->MakeField<std::unordered_set<int, IntHash>>("s");
}

leads to

Processing ntuple_unordered_set_hash.C...
terminate called after throwing an instance of 'ROOT::Experimental::RException'
  what():  unordered_set<int,IntHash,equal_to<int>,allocator<int> > is not supported
At:
  ROOT::Experimental::RClassField::RClassField(std::string_view, std::string_view, TClass *) [/home/jhahnfel/ROOT/src/tree/ntuple/v7/src/RField.cxx:1807]

Aborted (core dumped)
hahnjo commented 2 weeks ago

As discussed, for std::vector with a non-default allocator we should likely go through the collection proxy support. For maps and sets, we may already support (via the collection proxies) and it's just a matter of relaxing the partial specializations and maybe the logic in RFieldBase::Create.

dpiparo commented 2 weeks ago

Will the on-disk representation of vectors be the same for all possible allocators? Will it be possible to read back vectors with an allocator which was different than the one used when they were written?

hahnjo commented 2 weeks ago

Will the on-disk representation of vectors be the same for all possible allocators? Will it be possible to read back vectors with an allocator which was different than the one used when they were written?

Very relevant questions; IMHO additional template parameters should not change the on-disk representation of STL containers, and it should be possible to transparently read them back with different parameters than at write time. If nothing is specified and the model is reconstructed, the user gets the default STL container as before. I think this would be compatible with the current state of the specification, otherwise we may need to think this through before declaring 1.0.

jblomer commented 2 weeks ago

To be more precise than just a thumbs up: I also think the on-disk representation should not depend on the allocator.

pcanal commented 2 weeks ago

Will the on-disk representation of vectors be the same for all possible allocators? Will it be possible to read back vectors with an allocator which was different than the one used when they were written?

For reference, this is already the case (for both questions, the answer is yes) for TTree (and thus must be the case for RNTuple)