NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

[Feature]: Validation should check for uniqueness of object IDs in file #1904

Open rly opened 4 months ago

rly commented 4 months ago

What would you like to see added to PyNWB?

Object IDs are intended to be unique within a file. Spyglass and perhaps other data management systems rely on that. Sometimes object IDs may be the same for two different objects because an object was copied. See https://github.com/NeurodataWithoutBorders/nwb_benchmarks/issues/60 for an example.

I think this should be an error on validation and probably before/on write too. However, we have to be careful, because files with duplicate object IDs already exist.

Is your feature request related to a problem?

No response

What solution would you like?

In PyNWB, if a file contains a duplicate object ID, accessing nwbfile.objects raises an error like: TypeError: Key 'f5bbf768-f39f-4139-b4dc-08f71abb157d' is already in this dict. Cannot reset items in a LabelledDict.

We could also write a check based on nwbfile.all_children: https://github.com/NeurodataWithoutBorders/pynwb/blob/dev/src/pynwb/file.py#L520-L535
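
A rough sketch of what such a check could look like, not a proposed implementation. It walks the object tree itself and groups objects by ID, so every collision is reported rather than stopping at the first. It assumes each container exposes object_id and children attributes, as PyNWB containers do. The function name find_duplicate_object_ids and the FakeContainer class are hypothetical, used only so the example runs without PyNWB.

```python
from collections import defaultdict


def find_duplicate_object_ids(root):
    """Return {object_id: [objects]} for every ID shared by more than one object.

    Hypothetical helper: assumes each node has `object_id` and `children`.
    """
    seen = defaultdict(list)
    stack = [root]
    while stack:
        node = stack.pop()
        object_id = getattr(node, "object_id", None)
        if object_id is not None:
            seen[object_id].append(node)
        # Depth-first walk over the container hierarchy.
        stack.extend(getattr(node, "children", ()))
    return {oid: objs for oid, objs in seen.items() if len(objs) > 1}


class FakeContainer:
    """Stand-in for a container, for illustration only; not a PyNWB class."""

    def __init__(self, name, object_id, children=()):
        self.name = name
        self.object_id = object_id
        self.children = tuple(children)


# A copied object keeps the object ID of the original.
original = FakeContainer("ts", "id-1")
copied = FakeContainer("ts_copy", "id-1")
root = FakeContainer("root", "id-0", [original, copied])

duplicates = find_duplicate_object_ids(root)
for object_id, objs in duplicates.items():
    print(object_id, sorted(o.name for o in objs))
```

Returning the full mapping instead of raising immediately would let a validator report all duplicates at once, and would let the caller decide whether to warn (for existing files) or raise (on write).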

Do you have any interest in helping implement the feature?

Yes.


t-b commented 4 months ago

All for it!