NeurodataWithoutBorders / nwb-schema

Data format specification schema for the NWB neurophysiology data format
http://nwb-schema.readthedocs.io

Add dataset to store list of software that produced a file #578

Closed stephprince closed 1 week ago

stephprince commented 3 months ago

Summary of changes

Fix #319.

Here is a draft proposal for storing basic provenance information. Based on the discussion in #319, I think the goal of a first draft is an easily shareable, approachable representation: the NWB file gets an optional field listing the names and versions of the software packages used to generate the data in the file.
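As a rough illustration of what such a list might contain for a Python-based writer, the sketch below gathers (name, version) pairs using only the standard library. The function name `collect_software_versions` and the choice to include the interpreter itself are assumptions made for this example; they are not part of the schema proposal or of pynwb.

```python
import platform
from importlib import metadata
from typing import Iterable, List, Tuple


def collect_software_versions(package_names: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for the interpreter and installed packages.

    Hypothetical helper for illustration only; not part of pynwb or nwb-schema.
    """
    # Start with the interpreter, since it also contributed to producing the file.
    pairs = [("Python", platform.python_version())]
    for name in package_names:
        try:
            pairs.append((name, metadata.version(name)))
        except metadata.PackageNotFoundError:
            # Skip packages that are not installed rather than guessing a version.
            continue
    return pairs
```

Calling `collect_software_versions(["pynwb", "hdmf", "h5py"])` would return one pair for the interpreter plus one pair for each of those packages that is actually installed.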

After following up offline with @rly, we discussed that this information should likely:

  1. be a dataset instead of an attribute, since attributes have size limits and these lists could grow large depending on how much information the user wants to store.
  2. be a string dataset of shape (N, 2) instead of a compound data type, to allow more flexible modification in the future if we want to support saving additional information. The downside of this approach is that the column labels are not stored with the data, so users will not know them without consulting the documentation.
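The trade-off in point 2 can be sketched in plain Python, without any HDF5 library: a table of N rows by 2 string columns carries no labels of its own, so the column meanings have to be supplied from documentation. The names `COLUMN_LABELS`, `to_string_table`, and `with_labels` are hypothetical and exist only for this example; the sample values are taken from the MIES example in this thread.

```python
from typing import Dict, Iterable, List, Sequence

# Hypothetical labels: in an (N, 2) string dataset the meaning of each column
# is positional, so it has to come from documentation such as this constant.
COLUMN_LABELS = ("name", "version")


def to_string_table(entries: Iterable[Sequence[str]]) -> List[List[str]]:
    """Normalize entries into N rows of exactly len(COLUMN_LABELS) strings."""
    table = []
    for entry in entries:
        row = [str(item) for item in entry]
        if len(row) != len(COLUMN_LABELS):
            raise ValueError(
                f"expected {len(COLUMN_LABELS)} columns, got {len(row)}: {row!r}"
            )
        table.append(row)
    return table


def with_labels(table: Iterable[Sequence[str]]) -> List[Dict[str, str]]:
    """Attach the documented labels to each row for readability."""
    return [dict(zip(COLUMN_LABELS, row)) for row in table]


table = to_string_table([("HDF5", "1.10.7"), ("Labnotebook", "23")])
labeled = with_labels(table)
```

A compound data type would store the labels alongside the values, whereas here `with_labels` has to reattach them after reading. On the other hand, supporting a third column later would only mean extending `COLUMN_LABELS` rather than redefining a record type.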

Some further considerations might be:

Related pynwb changes here: https://github.com/NeurodataWithoutBorders/pynwb/pull/1924

t-b commented 3 months ago

Sounds like a good idea.

In MIES we have been adding our own version info for ages, à la

image

but having a built-in definition for that would be much preferred.

stephprince commented 3 months ago

@t-b Great! If I'm interpreting the image you shared correctly, I think all of the MIES version info could be mapped to the proposed (name, version) builtin definition?

e.g. something like:

[('Igor Pro 64bit', '9.0.6.1.56565'),
 ('MIES', 'Release_2.7_20230809-747-g005144'),
 ('Labnotebook', '23'),
 ('HDF5', '1.10.7'),
 ('Sweep Epoch', '9')]

t-b commented 3 months ago

> I think all of the MIES version info could be mapped to the proposed (name, version) builtin definition?

@stephprince Yes, exactly.