Composite (multi-column) features

Feature request

Structured data types (graphs etc.) may often be stored most efficiently as multiple columns, which then need to be combined during feature decoding.

Although it is currently possible to nest features as structs, my impression is that for a feature composed of multiple numpy arrays (ArrayXD features), it would be more efficient to store each ArrayXD as a separate column, though I'm not sure by how much.

Perhaps specification / implementation could be supported by something like:

features = Features({("feature0", "feature1"): Features(feature0=Array2D((None, 10), dtype="float32"), feature1=Array2D((None, 10), dtype="float32"))})
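
To make the intended decoding step more concrete, here is a minimal, dependency-free sketch of how flat columns might be combined into one composite value per row. Everything in it is an assumption for illustration: `decode_row`, `make_graph`, and the `spec` layout are hypothetical names and not part of the datasets API, and nested lists stand in for numpy arrays.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical spec: composite name -> (source column names, combine function).
CompositeSpec = Dict[str, Tuple[Tuple[str, ...], Callable[..., Any]]]


def decode_row(row: Dict[str, Any], spec: CompositeSpec) -> Dict[str, Any]:
    """Combine the flat columns of one row into composite features."""
    consumed = set()
    decoded: Dict[str, Any] = {}
    for name, (columns, combine) in spec.items():
        # Pass the source columns to the combine function in declared order.
        decoded[name] = combine(*(row[column] for column in columns))
        consumed.update(columns)
    # Columns not claimed by any composite pass through unchanged.
    for column, value in row.items():
        if column not in consumed:
            decoded[column] = value
    return decoded


def make_graph(node_attrs: Any, edge_attrs: Any) -> Dict[str, Any]:
    """Hypothetical combiner: bundle two stored arrays into one graph object."""
    return {"nodes": node_attrs, "edges": edge_attrs}


# Each array is stored as its own column; "label" is an ordinary column.
row = {
    "feature0": [[0.0] * 10, [1.0] * 10],  # stand-in for a (2, 10) float32 array
    "feature1": [[2.0] * 10],              # stand-in for a (1, 10) float32 array
    "label": 1,
}
spec: CompositeSpec = {"graph": (("feature0", "feature1"), make_graph)}

decoded = decode_row(row, spec)
```

With the library as it stands, a function like this could presumably be applied at access time through `Dataset.with_transform`, which operates on batches rather than single rows, so the sketch would need adapting. The request is for the feature specification itself to carry this mapping.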

Motivation

Defining efficient composite feature types based on numpy arrays for representing data such as graphs with multiple node and edge attributes is currently challenging.

Your contribution

Possibly able to contribute

huggingface / datasets