There is currently no schema-wide standard for associating measured data with its measurement unit. I like the method used in `TimeSeries.data`, which carries three attributes: `unit` (the measurement unit, e.g. 'meters'), `conversion`, and `resolution`. This seems like the minimum structure needed to be both flexible and specific. In other parts of the schema, however, measurement units are treated with less care, and in several places the unit is stated only in the doc string of the attribute, which is inflexible, not machine-readable, and inconsistent with the convention used in `TimeSeries.data`. I would like to see the `TimeSeries.data` convention adopted across the schema, including for data elements that are not collected across time. This would give us an NWB-wide standard for indicating measurements that could be reused in other places (e.g. wavelength parameters in ophys) and in extensions.
Specifically, I propose adding a new `Measurement` dataset neurodata_type with attributes `unit`, `conversion`, and `resolution`, and with dtype: `numeric` (depends on https://github.com/NeurodataWithoutBorders/pynwb/issues/594). For the `unit` field, it would be ideal if we could enforce a standard for accepted unit strings. @nicain pointed me to the Unidata units database, which seems like a good standard. We could check that each unit is an accepted term or abbreviation so that units are machine-readable.
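To make the proposal concrete, here is a minimal sketch in plain Python of what a `Measurement` container with unit validation might look like. It is not PyNWB code and not a schema definition. The class layout, the `converted()` helper, and the tiny `ACCEPTED_UNITS` set are all hypothetical, and the sketch assumes `conversion` is a scalar multiplier that maps stored values into `unit`, following the `TimeSeries.data` convention.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical allow-list for illustration only. A real check would consult
# a full units database (e.g. the Unidata one) rather than a hard-coded set.
ACCEPTED_UNITS = {"meters", "m", "seconds", "s", "volts", "V", "nanometers", "nm"}


@dataclass(frozen=True)
class Measurement:
    """Numeric data bundled with its unit, conversion, and resolution."""

    data: Sequence[float]
    unit: str
    conversion: float = 1.0  # assumed: multiply stored values by this to get `unit`
    resolution: Optional[float] = None  # smallest meaningful difference, if known

    def __post_init__(self) -> None:
        # Reject unit strings that are not in the accepted vocabulary,
        # so that every stored unit is machine-readable.
        if self.unit not in ACCEPTED_UNITS:
            raise ValueError(f"unrecognized unit string: {self.unit!r}")

    def converted(self) -> List[float]:
        """Return the data expressed in `unit` by applying `conversion`."""
        return [value * self.conversion for value in self.data]


# Usage: raw values stored at half scale, reported in volts.
trace = Measurement(data=[1, 2, 3], unit="volts", conversion=0.5)
print(trace.converted())
```

The same container would work for a non-time-series value such as an excitation wavelength, e.g. `Measurement(data=[488.0], unit="nm")`, which is the kind of reuse the proposal is after.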
It would be ideal if this were used in place of `TimeSeries.data`. This would not necessarily change the user interface for `TimeSeries`, but it would break reading of every `TimeSeries` object written before the change. Since we are now past the deadline for schema-breaking changes, I'll propose a compromise: add `Measurement` now, and incorporate it fully into the schema for the 3.0 release.