hdmf-dev / hdmf-common-schema

Specifications for pre-defined data structures provided by HDMF.

Add Measurements #63

Closed CodyCBakerPhD closed 2 months ago

CodyCBakerPhD commented 2 years ago

Summary of changes

Example: I store my electrical data (in Volts) in a dataset as int16 and attach a conversion factor of 0.195 and an offset of -32768. To convert a data point to scientific units, add the offset to it, then multiply the result by the factor. See the Intan documentation for the actual use case.
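A minimal sketch of that conversion order (add the offset first, then multiply by the factor). The function name and the sample values are hypothetical and are not part of the proposed schema. Plain Python integers are used so the addition cannot overflow; with a fixed-width int16 array the values would need to be upcast before the offset is added.

```python
def to_scientific_units(raw_values, conversion, offset):
    """Add the offset to each raw value, then multiply by the conversion factor."""
    return [(value + offset) * conversion for value in raw_values]


# Hypothetical raw samples, chosen only to make the arithmetic easy to follow.
raw_values = [32768, 32769]
scaled = to_scientific_units(raw_values, conversion=0.195, offset=-32768)
print(scaled)
```

Note that the order matters: adding the offset after multiplying would give a different result for the same factor and offset.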

First time doing this kind of thing, so let me know what all needs to be changed in addition to the to-do list below.

PR checklist for schema changes

Added to the to-do list

oruebel commented 2 years ago

> First time doing this kind of thing, so let me know what all needs to be changed in addition to the to-do list below.

Looks good to me. I added a few suggestions for the schema. I don't think there is anything else in addition to the TODO items you listed that you would need to do here.

CodyCBakerPhD commented 2 years ago

> Update the version string in docs/source/conf.py and common/namespace.yaml to the next version with the suffix "-alpha"

For this, in the namespace.yaml, does it apply to both the HDMF Common and HDMF Experimental, or just Experimental? Likewise, in the conf.py, does it apply to both version and release?
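For context on the conf.py half of the question, here is a sketch of how the two Sphinx variables usually differ: `version` is conventionally the short X.Y string and `release` is the full version string including any pre-release tag. The version number below is a placeholder, not the schema's actual next version, and the sketch does not settle whether both values should carry the suffix in this repository.

```python
# Fragment in the style of docs/source/conf.py; all values are placeholders.
next_version = "9.9.9"  # hypothetical version number, for illustration only

# Sphinx convention: short X.Y version, and the full release string.
version = ".".join(next_version.split(".")[:2])
release = f"{next_version}-alpha"

print(version, release)
```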

rly commented 1 year ago

@CodyCBakerPhD Are you still interested in pushing this forward in HDMF common schema? I think it's a good idea if there is still a clear use case.

CodyCBakerPhD commented 1 year ago

@rly Sorry this has fallen so far by the wayside. If you want to close for bookkeeping I think it's fine as long as we remember it in the extended backlog.

There are definitely clear use cases: we sometimes write TimeSeries-like data to an indexed column of a table (waveforms are what brought this discussion about). The VectorData object does not make the units of the series clear, but the convention so far has been to write float data, already scaled by the offset and conversion, to the column (for waveforms, in Volts). That also adds the inefficiency of upcasting the underlying dtype (say, int16) to float, which, even with compression, makes the file larger than necessary.
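A rough sketch of the upcasting cost, using only the standard-library `array` module to compare uncompressed in-memory sizes. The sample values and the `conversion` and `offset` names are assumptions made for illustration. It does not model HDF5 storage or compression, so it only shows the direction of the effect.

```python
from array import array

# Hypothetical raw waveform samples held as 16-bit signed integers.
raw = array("h", [-120, 45, 310, -7])

# Illustrative scaling metadata (assumed names, not schema fields).
conversion = 0.195
offset = 0

# Current convention: write the already-scaled values as floats.
scaled = array("d", [(value + offset) * conversion for value in raw])

raw_bytes = raw.itemsize * len(raw)
scaled_bytes = scaled.itemsize * len(scaled)
print(f"raw: {raw_bytes} bytes, scaled: {scaled_bytes} bytes")
```

On typical platforms a 16-bit integer takes 2 bytes and a double takes 8. Keeping the raw integers plus two scalar attributes would avoid that growth, at the cost of requiring readers to apply the conversion themselves.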

However, there are, and have been, many other higher priorities, and the existing downsides aren't bad enough to require immediate fixing.

mavaylon1 commented 2 months ago

Closing for "bookkeeping".