NeurodataWithoutBorders / lindi

Linked Data Interface (LINDI) - cloud-friendly access to NWB data
BSD 3-Clause "New" or "Revised" License

Resolve dataset of references #36

Closed (rly closed this issue 7 months ago)

rly commented 8 months ago

It looks like datasets of references are not being resolved correctly on read. When I run examples/exam.py and add:

    print(nwbfile.electrode_groups["shank0"])
    print(nwbfile.electrodes.group[0]) 

On the second print, I get the error: ValueError: :/general/extracellular_ephys/shank0 has not been built

The RFS JSON looks correct:

"general/extracellular_ephys/electrodes/group/0": "[{\"_REFERENCE\":{\"object_id\":\"57d5b3ab-681a-4ed2-b623-09f21bdb9d99\",\"path\":\"/general/extracellular_ephys/shank0\",\"source\":\".\",\"source_object_id\":\"d1daffd5-055a-4e94-8418-ad57d42f06f0\"}},..."

HDMF manages the building of groups and datasets using the HDF5 object's id: https://github.com/hdmf-dev/hdmf/blob/244d17a28ed436849b1973a3aaac8522d0ea922b/src/hdmf/backends/hdf5/h5tools.py#L568

If a group or dataset has been read into a builder, it is cached under the HDF5 file name (path) and the HDF5 object's id. For some reason, the id stored in the cache differs from the id of the reference in the nwbfile.electrodes.group dataset, so the already-built electrode group object cannot be found.
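To illustrate the failure mode described above, here is a minimal sketch of a builder cache keyed by (file name, object id). It is not HDMF's actual code: `BuilderCache`, `resolve_reference`, and the id strings are hypothetical names made up for this example, and the empty file name is an assumption chosen only to mimic the leading ":" in the error message.

```python
class BuilderCache:
    """Hypothetical stand-in for a cache of built objects, keyed by (file name, object id)."""

    def __init__(self):
        self._built = {}

    def set_built(self, file_name, object_id, builder):
        # Remember the builder under the id seen when the object was read.
        self._built[(file_name, object_id)] = builder

    def get_built(self, file_name, object_id):
        # Returns None when nothing was cached under this exact key.
        return self._built.get((file_name, object_id))


def resolve_reference(cache, file_name, target_id, target_path):
    """Look up the builder for a reference target; fail if it was never cached under this id."""
    builder = cache.get_built(file_name, target_id)
    if builder is None:
        raise ValueError(f"{file_name}:{target_path} has not been built")
    return builder


cache = BuilderCache()
path = "/general/extracellular_ephys/shank0"

# The group is read and cached under one id (placeholder value).
cache.set_built("", "id-seen-when-reading-group", {"name": "shank0"})

# Resolving with the same id finds the cached builder.
resolve_reference(cache, "", "id-seen-when-reading-group", path)

# Resolving with a different id for the same object misses the cache.
try:
    resolve_reference(cache, "", "id-seen-via-reference", path)
except ValueError as err:
    print(err)
```

If the two ids differ for the same underlying object, the lookup misses even though the group was already built, which matches the symptom reported here.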

(warning: the HDMF code around resolving datasets of references is a jumble of abstraction -- I can help debug next week when I have some time)

magland commented 8 months ago

@rly This should be fixed by #39