SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License
493 stars 188 forks source link

[Feature] .from_nwbfile_object method for NwbExtractors #2498

Closed CodyCBakerPhD closed 1 day ago

CodyCBakerPhD commented 6 months ago

Requested by @aidapiccato via Slack

It would be useful for me to have a spikeinterface function that accepts NWBFile objects instead of a file path

I assume it wouldn't be too much work (and is fresh on y'alls minds since the latest refactor) to add a class method .from_nwbfile_object(nwbfile: pynwb.NWBFile) -> Self to both the NwbRecordingExtractor and NwbSortingExtractor?

h-mayorquin commented 6 months ago

If I have a vote, I vote to not suppor this.

In brief, I think the maintenace burden this will add is not worth the convenience gained.

More in detail: 1) The lack of serializability of an in-memory NWBFile is likely to cause problems. Most of our processing methods require serializability and then we will need to tell users that if they want all the features of the library they need to load with a file_path anyway. 2) We already accept open files, which is a work-around if in memory processing is paramount: https://github.com/SpikeInterface/spikeinterface/blob/809541a0a51707023ca3e2ea78c3e44ed54288b3/src/spikeinterface/extractors/nwbextractors.py#L502-L503 3) The class already is too complicated, we support: Paths, IO objects, streaming and three different libraries for reading the data. I think that adding yet more complexity should require a stronger justification than convenience. I am afraid to lift a castle that will collapse on its own weight.

On the other hand, the workflow is not that complicated 1) If the file is in memory, you can save the file and then load it. 2) If you have the file open, you have the path open, use it.

I myself have wanted to have this functionality (for testing) but as the class has grown in complexity I am way less keen on moving in that direction.

At the end this is @alejoe91 decision. In case that you want to support this I suggest that we do it the way that it was requested here in this issue. That is, we could use a static or class method initializer pattern that calls the loading code just after these lines:

https://github.com/SpikeInterface/spikeinterface/blob/809541a0a51707023ca3e2ea78c3e44ed54288b3/src/spikeinterface/extractors/nwbextractors.py#L645-L652

In opposition to the current approach in https://github.com/SpikeInterface/spikeinterface/pull/2506 that adds lumps this feature on the main loading branch.