hdmf-dev / hdmf-schema-language

The specification language for HDMF
https://hdmf-schema-language.readthedocs.io

No support for arrays in compound data types #8

Closed bendichter closed 1 year ago

bendichter commented 3 years ago

HDF5 allows you to have arrays of up to 4 dimensions within compound datatypes (ref), but our DTypeSpec does not accept a shape parameter, so we cannot use this feature. I propose that we extend our schema language so that we can put vectors inside of compound data types.

oruebel commented 3 years ago

You are correct. Currently HDMF supports only flat compound types; that is, neither nested compound types (which h5py does not support) nor compound types with arrays are supported. Is there a particular use case that requires an array in a compound type? I suspect that supporting compound types with arrays may be tricky in MATLAB and other languages (e.g., R and Igor). Also, I believe these arrays must have a fixed shape and are limited to 4D, so additional validation support may be required to make this work.
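To make the validation concern concrete, here is a minimal sketch of the kind of check that might be needed. It is not part of HDMF or the schema language: the `shape` key on a compound member, the `validate_member_shape` helper, and the `MAX_RANK` constant are all hypothetical, and the 4-dimension limit is simply the figure mentioned in this thread.

```python
# Hypothetical sketch only: the schema language currently has no "shape" key
# for members of a compound dtype. MAX_RANK reflects the limit mentioned in
# this thread and is an assumption, not a verified HDF5 constant.
MAX_RANK = 4


def validate_member_shape(shape, max_rank=MAX_RANK):
    """Return the shape as a tuple of positive ints, or raise ValueError.

    None means the member is a plain scalar, which is what compound
    dtypes support today.
    """
    if shape is None:
        return ()
    dims = tuple(shape)
    if not 1 <= len(dims) <= max_rank:
        raise ValueError(f"expected 1 to {max_rank} dimensions, got {len(dims)}")
    for dim in dims:
        # Every dimension must be a fixed, positive integer; a variable-length
        # dimension (None) is rejected.
        if isinstance(dim, bool) or not isinstance(dim, int) or dim <= 0:
            raise ValueError(f"dimension {dim!r} is not a fixed positive integer")
    return dims


# A hypothetical compound dtype with array-valued members.
pose_dtype = [
    {"name": "position", "dtype": "float64", "shape": [3]},
    {"name": "orientation", "dtype": "float64", "shape": [3]},
]

validated = {m["name"]: validate_member_shape(m.get("shape")) for m in pose_dtype}
```

A check like this would presumably have to be mirrored in every API and backend that reads the schema, which is part of the cost being weighed here.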

bendichter commented 3 years ago

The use case came out of a convo with @rly about spatial coordinates here. We want to create a non-dynamic table where one of the elements is x, y[, z] position, and another is roll, pitch[, yaw] orientation. We could make these 6 distinct members of the compound type, but I think a more elegant solution would be to package the position and orientation as two (3,) vectors.

I agree, it makes sense to think about how this would generalize to other programming languages and other backends as we consider this change. I am not aware of any specific constraints on the MATLAB side, but you are right it would be good to check with them before adding this.

oruebel commented 3 years ago

> I think a more elegant solution would be to package the position and orientation as two (3,) vectors.

Independent of how the compound type is defined in the file, I think on the user-facing API side this should be possible to do.

> We want to create a non-dynamic table where one of the elements is x,y[,z] position, and another is roll, pitch[, yaw] orientation.

Makes sense. However, before going down this route, I would suggest investigating how complicated it will be to support compound types with vectors or n-D arrays, to make sure we are not causing more problems than the convenience is worth. One slight advantage of just using a compound type of scalars is that the components have defined names, whereas when you use a vector in a compound type you can only give the vector as a whole a name.
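The naming trade-off can be illustrated with a small sketch that uses only Python's standard-library `ctypes` as a stand-in for a C-style compound layout. This is not HDMF or h5py code, and the `FlatPose` and `VectorPose` types are hypothetical names chosen for illustration.

```python
import ctypes

# A fixed-shape (3,) vector of doubles.
Vec3 = ctypes.c_double * 3


class FlatPose(ctypes.Structure):
    """Six scalar members: every component has its own name."""
    _fields_ = [(name, ctypes.c_double)
                for name in ("x", "y", "z", "roll", "pitch", "yaw")]


class VectorPose(ctypes.Structure):
    """Two array members: only each vector as a whole has a name."""
    _fields_ = [("position", Vec3), ("orientation", Vec3)]


flat = FlatPose(1.0, 2.0, 3.0, 0.1, 0.2, 0.3)
vec = VectorPose(Vec3(1.0, 2.0, 3.0), Vec3(0.1, 0.2, 0.3))

# Both layouts hold six doubles, so they occupy the same number of bytes.
same_size = ctypes.sizeof(FlatPose) == ctypes.sizeof(VectorPose)

flat_names = [name for name, _ in FlatPose._fields_]
vector_names = [name for name, _ in VectorPose._fields_]

# With scalars, yaw is addressed by name; with a vector, by position, so the
# meaning of index 2 has to be documented elsewhere.
yaw_by_name = flat.yaw
yaw_by_index = vec.orientation[2]
```

The stored data is the same in both cases; the difference is whether the meaning of each component lives in the type itself or in separate documentation.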

rly commented 1 year ago

@bendichter on further review, I think this would add a fair amount of complexity and development effort to the schema language, APIs in Python/MATLAB/other, and future backend support (e.g. HDMF-Zarr). I do not think it is worth the change at this time. Feel free to reopen if you disagree or there are more use cases.

bendichter commented 1 year ago

I don't remember opening this nor why I wanted it so yes thanks for closing :-)