NeuralEnsemble / python-neo

Neo is a package for representing electrophysiology data in Python, together with support for reading a wide range of neurophysiology file formats
http://neo.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
325 stars 248 forks source link

EDFIO: Alleviate EDF single handle problem #1584

Closed h-mayorquin closed 1 month ago

h-mayorquin commented 1 month ago

This should close #1557

This PR moves the logic of opening the edf reader to the functions that extract the data (_get_analogsignal_chunk).

This avoids the problem of someone initializing the io (in a notebook for example) and then having a failure to initialize it again because the handle was left dangling. Moreover, it would allow parallel processing to work better (but not perfect as it might be concurrent access crashes).

The performance costs are minimal in this case because the io is really slow (I think this can be fixed but that's a matter for another PR). For example, I have this mid-sized file 7000 MiB, 276 channels, and around 20 minutes. It takes 10.8 seconds to make a reading of the whole data in master. With the new changes it takes ... 11.0 seconds:

image

The moral is that when the data extraction itself takes most of the time, the io.open instruction cost does not count.

As mentioned in #1557 the best way would be to handle this with mne library but this should be an improvement over the current state of affair where this io does not play nice with a lot of downstream usage patterns (spike interface re-opens the io to get streams for example which precludes get the streams and then opening the reader again for the extractor).

h-mayorquin commented 1 month ago

I would rather erase close but that is a public method that I did not add. Would break backwards compatibility:

https://github.com/NeuralEnsemble/python-neo/blob/ade13db1cc5be010825b8ec2a44459793868fd9d/neo/test/rawiotest/test_edfrawio.py#L30-L41

If you are OK with that I can do it. I think that is a good thing, when we fix this the close method will stop making sense. For that reason, I also think that adding an open function would be a bad idea.

That said, I changed the private method to be _open_reader instead of open.

h-mayorquin commented 1 month ago

But I would like to deprecate the close method in another PR. I don't want this to derail the conversation when Sam reviews it.

zm711 commented 1 month ago

That all works for me we can think about that later.

zm711 commented 1 month ago

I just changed the title since I'm compiling release notes it just helps me to quickly see which IO it is. This isn't an official repo rule, just a help me :)

samuelgarcia commented 1 month ago

OK for me.