SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License
517 stars 186 forks source link

EDFRecordingExtractor can not be loaded from dict while still open #1228

Closed h-mayorquin closed 1 month ago

h-mayorquin commented 1 year ago

I discovered this while working on #1227. The following code throws an assertion:

from pathlib import Path

from spikeinterface.extractors import EDFRecordingExtractor
from spikeinterface.core.base import _load_extractor_from_dict

file_path = Path("/home/heberto/neuroconv_testing_data/ephy_testing_data/edf/edf+C.edf")

recorder = EDFRecordingExtractor(file_path=file_path)
recorder_dict = recorder.to_dict()
recorder_from_dict = _load_extractor_from_dict(recorder_dict)
  File "/home/heberto/miniconda3/envs/spikeinterface_env/lib/python3.10/site-packages/neo/rawio/edfrawio.py", line 70, in _parse_header
    self.edf_reader = EdfReader(self.filename)
  File "pyedflib/_extensions/_pyedflib.pyx", line 146, in pyedflib._extensions._pyedflib.CyEdfReader.__init__
  File "pyedflib/_extensions/_pyedflib.pyx", line 209, in pyedflib._extensions._pyedflib.CyEdfReader.open
  File "pyedflib/_extensions/_pyedflib.pyx", line 181, in pyedflib._extensions._pyedflib.CyEdfReader.check_open_ok
OSError: /home/heberto/neuroconv_testing_data/ephy_testing_data/edf/edf+C.edf: file has already been opened

So it seems that this extractor can not mantain two open references to the same file. I wonder what should we do in this cases? Should the recorder at neo or spikeinterface throw an assertion if the file is already open informing the user that they have to close the old one?

h-mayorquin commented 1 year ago

This seems to be a problem coming to neo through pyedflib. The following throws an assertion:

from pathlib import Path
from pyedflib import EdfReader
file_path = Path("/home/heberto/neuroconv_testing_data/ephy_testing_data/edf/edf+C.edf")

reader = EdfReader(str(file_path))
reader2 = EDFRecordingExtractor(file_path=file_path)

Note that this works well in separate process though. The following does not throw an assertion:

from pathlib import Path

from pyedflib import EdfReader
file_path = Path("/home/heberto/neuroconv_testing_data/ephy_testing_data/edf/edf+C.edf")

from concurrent.futures import ProcessPoolExecutor

n_jobs  = 8
file_path_list = [(i, file_path) for i in range(n_jobs)]

def initializer(file_path):
    number, file_path = file_path
    print(number, file_path)
    reader = EdfReader(str(file_path))
    print(reader)
with ProcessPoolExecutor(max_workers=n_jobs  // 2) as executor:
    results = executor.map(initializer, file_path_list)
samuelgarcia commented 1 year ago

Thanks for this. This is something known. pyedflib have a problem to open twice the same file. @JuliaSprenger do you have more comments on this ?

alejoe91 commented 11 months ago

@h-mayorquin can we close this and in case move to NEO?

h-mayorquin commented 11 months ago

Yes, I will be fine with this. We should open first the issue in neo and then close this though so we don't forget.

h-mayorquin commented 1 month ago

OK, I opened the issue in neo:

https://github.com/NeuralEnsemble/python-neo/issues/1557

Let's close this.