NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io
Other
176 stars 85 forks source link

[Feature]: Add ElectricalSeries should ensure that ElectrodeTable is added as well #1701

Open oruebel opened 1 year ago

oruebel commented 1 year ago

What would you like to see added to PyNWB?

See https://github.com/hdmf-dev/hdmf-zarr/issues/95#issuecomment-1593513925

from pynwb.testing.mock.file import mock_NWBFile
from pynwb.testing.mock.ecephys import mock_ElectricalSeries
from hdmf_zarr import NWBZarrIO

nwbfile = mock_NWBFile()
nwbfile.add_acquisition(mock_ElectricalSeries())

with NWBZarrIO(path="/home/jovyan/Downloads/test_zarr.nwb", mode="w") as io:
    io.write(nwbfile)

Leading to an error because the electrodes table has not been added to the file.

Is your feature request related to a problem?

See above

What solution would you like?

Possible solutions:

Do you have any interest in helping implement the feature?

Yes.

Code of Conduct

bendichter commented 1 year ago

This is tough to handle in the current way mock interfaces are implemented. mock_ElectricalSeries() creates a stand-alone ElectricalSeries not linked to a specific NWBFile. If no electrodes table is provided than an example "electrodes" DynamicTable and associated DynamicTableRegion are created, but since there is no NWBFile, there is no way to ensure that the electrodes table is associated with it. The problem is repeated for the ElectrodeGroup, Device, and for all other situations with links. Under the current system, if you want a writeable NWBFile object, you would have to do something like this:

device = mock_Device()
group = mock_ElectrodeGroup(device=device)
electrodes_table = mock_ElectrodeTable(group=group)
electrodes = DynamicTableRegion(name="electrodes", data=list(range(5)), table=electrodes_table, description="electrodes")
nwbfile = mock_NWBFile(electrode_groups=[group], electrodes=electrodes_table, devices=[device])
nwbfile.add_acquisition(mock_ElectricalSeries(electrodes=electrodes))
with NWBHDF5IO("testsdfse.nwb", "w") as io:
    io.write(nwbfile)

In this case, the mock functions do little more than provide default arguments for the neurodata type classes, and the example here highlights a weakness of how I implemented this. Indeed, this approach is better for situations like ImagingPlane, where OpticalChannels are not linked to by the object but contained within it. mock data types are also helpful for tools like NWB Inspector and NWB Widgets, where in-memory NWB objects are often sufficient and the test data types never need to be written to disk.

I agree that it would be useful to create a system that allows for the easy creation of entire mock NWBFiles that can be written. Perhaps the best way forward for that would be to create a class MockNWBFileFactory (or something similar - this might be an abuse of the name "factory"), which would have syntax like:

class MockNWBFileFactory():
    def __init__(self, nwbfile=None, **kwargs):
        self.nwbfile = nwbfile or mock_NWBFile(**kwargs)

    def create_Device(**kwargs):
        device = mock_Device(**kwargs)
        self.nwbfile.devices.append(device)
        return device

    def create_ElectrodeGroup(device=None, **kwargs):
        device = device or self.create_Device()
        electrode_group = mock_ElectrodeGroup(device=device, **kwargs)
        self.nwbfile.electrode_groups.append(electrode_group)
        return electrode_group

    def create_ElectrodesTable(electrode_group=None, **kwargs):
        electrode_group = electrode_group or self.create_ElectrodeGroup()
        electrodes_table = create_ElectrodesTable(group=electrode_group, **kwargs)
        self.nwbfile.electrodes = electrodes_table
        return electrodes_table

    def create_ElectricalSeries(electrodes_table=None, **kwargs):
        electrodes_table = electrodes_table or self.create_ElectrodesTable()
        electrodes = DynamicTableRegion(
            name="electrodes",
            data=list(range(5)),
            table=electrodes_table,
            description="electrodes"
        )
        electrical_series = mock_ElectricalSeries(electrodes=electrodes, **kwargs)
        self.nwbfile.add_acquisition(electrical_series)
        return electrical_series

Then the following should work

nwbfile = MockNWBFileFactory().create_ElectricalSeries().nwbfile

with NWBHDF5IO("testsdfse.nwb", "w") as io:
    io.write(nwbfile)

This approach would require us to write a create_* method for each neurodata type.

bendichter commented 1 year ago

Another option as @oruebel mentioned is to modify pynwb:

nwbfile.add_acquisition(electrical_series) or ElectricalSeries should ensure that when it an ElectricalSeries is added to the NWBFile that the electrodes are added as well

I guess you could modify these lines:

https://github.com/NeurodataWithoutBorders/pynwb/blob/a6fa9170f2ad6942d74eaf44c22cedf0b80f0fcd/src/pynwb/file.py#L851-L856

You could create methods for automatic assignment of linked neurodata types within the neurodata classes themselves.

class ElectricalSeries(...):
    ...
    def auto_add_links(self, nwbfile):
        nwbfile.electrodes = nwbfile.electrodes or self.electrodes.table

and then call this method whenever a neurodata object is added:

class NWBFile(...):
    ...
    def add_acquisition(self, **kwargs): 
        nwbdata = popargs('nwbdata', kwargs)        
        nwbdata.auto_add_links(self)
        self._add_acquisition_internal(nwbdata) 
        ...

This approach could work for any time that we can make reasonable assumptions about where linked data should go in the NWBFile. e.g. this would work for ElectricalSeries, ImagingPlane, and VoltageClampSeries but would not work for a generic DynamicTable linking to another DynamicTable.

CodyCBakerPhD commented 1 year ago

In this case, the mock functions do little more than provide default arguments for the neurodata type classes, and the example here highlights a weakness of how I implemented this.

I don't mind the current implementation of the Mock classes at all, but also not opposed to something even more convenient

I was just using mock classes here as a quicker demonstration, but my main complaint is just that the HDMF stack trace was completely uninformative with respect to a relatively simple problem

The simplest solution overall might be what Oliver says

nwbfile.add_acquisition(electrical_series) or ElectricalSeries should ensure that when it adds an ElectricalSeries to the NWBFile that the electrodes are added as well

with a small addendum that I don't know if it should automatically add the table simply because I don't know how deep it would go - the electrode table should have links to ElectrodeGroups, and those should link to a Device, so I guess it would have to recursively add all of them? Seem like a lot of work

I'd be perfectly happy with a simple error (or warning for back-compatibility?) that says "The electrode table referenced by ElectricalSeries 'name_of_electrical_series' has not been added to the NWB file! Unable to add this electrical series to the file."

There is of course the issue brought up that this is all with respect to an in-memory NWBFile object, which is the only thing that knows what objects has been attached to it, but it also...

My request put simply is that I should not be able to add an in-memory ElectricalSeries to an in-memory NWBFile through the method nwbfile.add_acquisition, or analogous methods in processing modules, if the electrode table referenced by the series has not been added first. This more strictly enforces the ecephys structure we have always supported and provides an early, concise notification about a problem that would otherwise be raised on io.write() in a relatively ambiguous way.

bendichter commented 1 year ago

@CodyCBakerPhD

My request put simply is that I should not be able to add an in-memory ElectricalSeries to an in-memory NWBFile through the method nwbfile.add_acquisition, or analogous methods in processing modules, if the electrode table referenced by the series has not been added first.

I agree, this would be a valuable feature. I just want to point out that implementing this is tricky - just adding (or checking) the electrodes table would not fix the issue. You would need to add the electrodes table and the electrode group, and the device. Otherwise, you'd get essentially the same error with a different orphaned neurodata object. It might be possible to do this in a more general way if all you are doing is checking. Maybe something like:

class NWBContainer():
    def check_linked_objects(nwbfile):
        for obj in linked_objects:
            if obj.parent is not nwbfile:
                warn(f"linked object {obj.name} has not yet been added to the nwbfile")

I guess there are 3 possible solutions here (not mutually exclusive):

  1. Provide a more informative error when linked objects are not added to NWBfile
  2. Automatically add linked objects when we can make reasonable assumptions about how to do so
  3. Create a mock system that correctly assembles an NWBfile
CodyCBakerPhD commented 1 year ago

I just want to point out that implementing this is tricky - just adding (or checking) the electrodes table would not fix the issue. You would need to add the electrodes table and the electrode group, and the device.

Exactly, but I was then hoping that a similar attempt to call nwbfile.electrodes = some_premade_table would likewise complain "Unable to add the electrode table as it does not link to any ElectrodeGroups!" (I think calling nwbfile.add_electrodes will actually do something like this already?)

For a really general solution that would even apply to the PhotonSeries analog of this arugment, all I need from your list is (1)

Provide a more informative error when linked objects are not added to NWBfile

bendichter commented 1 year ago

So after meeting with the team, we decided to address all three:

  1. Create a warning if linked objects are not already present in an NWBfile.
  2. Automatically add linked objects to an NWBFile if we know where they should go.
  3. enhance the mock functions with an optional nwbfile arg. If it is provided, automatically add to that nwbfile. (https://github.com/NeurodataWithoutBorders/pynwb/issues/1705)