G-Node / nix

Neuroscience information exchange format
https://readthedocs.org/projects/nixio/

Unified tagging of DataFrames and DataSets #717

Open gicmo opened 6 years ago

gicmo commented 6 years ago

I will write more about how, eventually.

gicmo commented 5 years ago

Currently we have:

Additionally, we have the new DataFrame, a rectangular data container consisting of n columns (name, unit, DataType) by m rows.

Tagging is currently done by giving the tag one or more position+extent pairs and pointers to the referenced DataArrays, whose dimensionality must match that of the positions and extents.
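To make the position+extent idea concrete, here is a minimal sketch of cutting a tagged region out of n-dimensional data. It is not the NIX API: `shape_of` and `retrieve` are hypothetical helpers, the data is a plain nested list, and positions and extents are assumed to be integer indices rather than values in dimension units.

```python
from typing import Sequence


def shape_of(data) -> tuple:
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def retrieve(data, position: Sequence[int], extent: Sequence[int]):
    """Cut the region [position, position + extent) out of nested-list data.

    Assumption: position and extent are index-based, one entry per dimension.
    """
    rank = len(shape_of(data))
    # The tag only makes sense if it has one position and one extent
    # entry for every dimension of the referenced data.
    if not (len(position) == len(extent) == rank):
        raise ValueError("position/extent must match the data's dimensionality")

    def cut(block, axis):
        start, count = position[axis], extent[axis]
        part = block[start:start + count]
        if axis == rank - 1:
            return part
        return [cut(sub, axis + 1) for sub in part]

    return cut(data, 0)


data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]
region = retrieve(data, position=[1, 1], extent=[2, 2])  # [[6, 7], [10, 11]]
```

A tag whose position has only one entry would be rejected for this 2-D data, which is the dimensionality constraint described above.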

To allow unified tagging, i.e. tagging of both DataArrays and DataFrames, the references must be changed to either:

The latter is the more complicated but more flexible solution, while the former is more straightforward and easier to implement.

The common base object could be the existing DataSet, if it were extended to include Dimensions and units. The DataView would then need to be amended to include those as well. The tricky bit would be the dimensions, which would need to include a view (offset+count) applied to the Dimension of the underlying DataArray. The Tags would need to be changed to work only with DataSets for references and retrieveData.

Another new object would be needed to represent a view of a DataFrame, much like DataView does for DataArray: FrameView (name subject to change), implementing DataSet (i.e. a FrameView is a DataSet). The reference in the file format (attributes in HDF5) would need to be amended to specify everything that is needed to re-create that FrameView.
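A rough sketch of how this could fit together, in Python for brevity. None of this is existing NIX code: the class layout, the `read`/`view`/`read_all` methods, the `ticks` attribute standing in for a Dimension, and the restriction of DataArray to one dimension are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class DataSet(ABC):
    """Hypothetical common base: anything a Tag may reference."""

    @abstractmethod
    def shape(self) -> tuple: ...

    @abstractmethod
    def read(self, offset: tuple, count: tuple) -> list: ...

    @abstractmethod
    def view(self, offset: tuple, count: tuple) -> "DataSet": ...

    def read_all(self) -> list:
        return self.read((0,) * len(self.shape()), self.shape())


@dataclass
class _View(DataSet):
    """Shared offset+count logic; always points at the underlying object."""
    source: DataSet
    offset: tuple
    count: tuple

    def shape(self):
        return tuple(self.count)

    def _shift(self, offset):
        return tuple(a + b for a, b in zip(self.offset, offset))

    def read(self, offset, count):
        return self.source.read(self._shift(offset), count)

    def view(self, offset, count):
        return type(self)(self.source, self._shift(offset), tuple(count))


class DataView(_View):
    @property
    def ticks(self):
        # The same offset+count is applied to the underlying dimension.
        start, n = self.offset[0], self.count[0]
        return self.source.ticks[start:start + n]


class FrameView(_View):
    @property
    def columns(self):
        start, n = self.offset[1], self.count[1]
        return self.source.columns[start:start + n]


@dataclass
class DataArray(DataSet):
    values: list  # 1-D only, for brevity
    ticks: list   # stand-in for a Dimension: one tick per value
    unit: str = ""

    def shape(self):
        return (len(self.values),)

    def read(self, offset, count):
        return self.values[offset[0]:offset[0] + count[0]]

    def view(self, offset, count):
        return DataView(self, tuple(offset), tuple(count))


@dataclass
class DataFrame(DataSet):
    columns: list  # (name, unit, dtype) per column
    rows: list     # one tuple per row

    def shape(self):
        return (len(self.rows), len(self.columns))

    def read(self, offset, count):
        (r0, c0), (nr, nc) = offset, count
        return [row[c0:c0 + nc] for row in self.rows[r0:r0 + nr]]

    def view(self, offset, count):
        return FrameView(self, tuple(offset), tuple(count))


@dataclass
class Tag:
    position: tuple
    extent: tuple
    references: list = field(default_factory=list)  # DataSets only

    def retrieve_data(self, index: int) -> DataSet:
        ref = self.references[index]
        if len(ref.shape()) != len(self.position):
            raise ValueError("reference rank does not match position/extent")
        return ref.view(self.position, self.extent)


array = DataArray([10, 11, 12, 13, 14], ticks=[0.0, 0.1, 0.2, 0.3, 0.4], unit="mV")
frame = DataFrame(columns=[("trial", "", "int"), ("rate", "Hz", "float")],
                  rows=[(1, 5.0), (2, 7.5), (3, 6.1)])
array_view = Tag((1,), (3,), [array]).retrieve_data(0)
frame_view = Tag((1, 0), (2, 2), [frame]).retrieve_data(0)
```

In this sketch the Tag never checks which concrete type it references; it only relies on the DataSet interface, and each view stores just the source plus offset and count, which is roughly the information the file format would have to persist to re-create it.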