Closed RikkelBob closed 1 year ago
What is your recording?
print(recording)
BinaryRecordingExtractor: 64 channels - 32.0kHz - 1 segments - 210,713,088 samples 6,584.78s (1.83 hours) - int16 dtype - 25.12 GiB
Can you share the full script?
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.postprocessing as spost
import spikeinterface.qualitymetrics as sqm
import spikeinterface.comparison as sc
import spikeinterface.exporters as sexp
import spikeinterface.widgets as sw
import probeinterface as pi
import utils
import docker
import shutil
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path
import os
import warnings
from probeinterface.plotting import plot_probe
fs = 32000
n_chan = 64
dtype = 'int16'
snippet = False
reorder = True
plot_timeseries = False
animal = "008"
session = "2023-04-13_13-56-33"
root_path = "C:/CheetahData/bench/"
file_name = "ksData_probe3.dat"
file_path = Path(root_path, session, file_name)
file_path_pre = Path(root_path, "preprocessed", session)
probe, positions = utils.gen_probe()
print(probe)
sorter_name = "mountainsort5"
output_folder = sorter_name + "-results"
recording = si.read_binary(file_path, fs, n_chan, dtype)
if snippet:
    start_time_s = 0  # start time in seconds
    end_time_s = 600  # end time in seconds
    start_frame = start_time_s * fs  # start frame
    end_frame = end_time_s * fs  # end frame
    # Create the new recording
    recording = recording.frame_slice(start_frame, end_frame)
recording.annotate(is_filtered=False)
recording.set_probe(probe, in_place=True)
channel_ids = recording.get_channel_ids()
fs = recording.get_sampling_frequency()
num_chan = recording.get_num_channels()
num_segments = recording.get_num_segments()
print(f'Channel ids: {channel_ids}')
print(f'Sampling frequency: {fs}')
print(f'Number of channels: {num_chan}')
print(f"Number of segments: {num_segments}")
w_ts = sw.plot_timeseries(recording, order_channel_by_depth=True, channel_ids=probe.device_channel_indices)
recording_f = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)
w_f = sw.plot_timeseries(recording_f, order_channel_by_depth=True)
recording_cmr = spre.common_reference(recording_f, operator="median", reference="global")
w_cmr = sw.plot_timeseries(recording_cmr, order_channel_by_depth=True)
if plot_timeseries:
    plt.show()
if os.path.isdir(file_path_pre):
    recording_saved = si.read_binary(Path(file_path_pre, "traces_cached_seg0.RAW"), fs, n_chan, dtype)
    recording_saved.set_probe(probe, in_place=True)
else:
    recording_saved = recording_cmr.save(folder=file_path_pre, n_jobs=1, format='binary')
    recording_saved.set_probe(probe, in_place=True)
print(ss.installed_sorters())
sorting = ss.run_sorter(sorter_name=sorter_name,
                        recording=recording_saved,
                        output_folder=output_folder,
                        docker_image=True,
                        verbose=True)  # add num_workers?
print('end')
@RikkelBob I am confused about your trace: it seems that you are calling main (which is the script that you shared with us?) from within a multiprocessing call.
Is running the script that you just shared above causing the error on the following line?
recording_saved = recording_cmr.save(folder=file_path_pre,n_jobs=1, format='binary')
If so, can you run the following script on your system to see if it generates the error? (Warning: this will generate a 25 GiB folder.)
from spikeinterface.core.generate import generate_lazy_recording
from probeinterface import Probe
import spikeinterface.preprocessing as spre
import numpy as np
full_traces_size_GiB = 25.0
large_recording = generate_lazy_recording(full_traces_size_GiB=full_traces_size_GiB)
fs = large_recording.get_sampling_frequency()
binary_recording = large_recording.save()
recording = binary_recording
start_time_s = 0 # start time in seconds
end_time_s = 600 # end time in seconds
start_frame = start_time_s * fs # start frame
end_frame = end_time_s * fs # end frame
end_frame = min(end_frame, recording.get_num_frames()) # make sure it does not go over the end
# Create the new recording
recording = recording.frame_slice(start_frame, end_frame)
recording.annotate(is_filtered=False)
recording_f = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)
recording_cmr = spre.common_reference(recording_f, operator="median", reference="global")
recording_saved = recording_cmr.save(n_jobs=2, format="binary")
@h-mayorquin I modified your example, and it now reproduces the error on my side. I think it is a minimal script to reproduce the problem.
from spikeinterface.core.generate import generate_lazy_recording
from probeinterface import Probe
import spikeinterface.full as si
import numpy as np
full_traces_size_GiB = 5.5
large_recording = generate_lazy_recording(full_traces_size_GiB=full_traces_size_GiB)
fs = large_recording.get_sampling_frequency()
binary_recording = large_recording.save(folder="base")
recording = binary_recording
start_time_s = 0 # start time in seconds
end_time_s = 600 # end time in seconds
start_frame = start_time_s * fs # start frame
end_frame = end_time_s * fs # end frame
end_frame = min(end_frame, recording.get_num_frames()) # make sure it does not go over the end
# Create the new recording
recording = recording.frame_slice(start_frame, end_frame)
recording.annotate(is_filtered=False)
recording_hp = si.filter(recording,btype='highpass',band=300)
recording_cmr = si.common_reference(recording_hp, operator="median")
recording_saved = recording_cmr.save(folder="preprocessed", n_jobs=16, total_memory="2G", progress_bar=True,chunk_duration='1m')
It passes the first save call (`large_recording.save(folder="base")`) but crashes on the second (`recording_cmr.save(folder="preprocessed", n_jobs=16, total_memory="2G", progress_bar=True, chunk_duration='1m')`).
python test.py
write_binary_recording with n_jobs = 1 and chunk_size = 30000
write_binary_recording: 100%|############################################################################################################################| 49/49 [00:39<00:00, 1.23it/s]
write_binary_recording with n_jobs = 16 and chunk_size = 30517
write_binary_recording: 0%| | 0/48 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/lustre/groups/colonneselab/v01-spikesorting-GAopt-20230814/test.py", line 28, in <module>
recording_saved = recording_cmr.save(folder="preprocessed", n_jobs=16, total_memory="2G", progress_bar=True,chunk_duration='1m')
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/spikeinterface/core/base.py", line 749, in save
loaded_extractor = self.save_to_folder(**kwargs)
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/spikeinterface/core/base.py", line 825, in save_to_folder
cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/spikeinterface/core/baserecording.py", line 444, in _save
write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/spikeinterface/core/core_tools.py", line 314, in write_binary_recording
executor.run()
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/spikeinterface/core/job_tools.py", line 400, in run
for res in results:
File "/SMHS/home/rath/.local/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/SMHS/home/rath/.local/lib/python3.10/concurrent/futures/process.py", line 575, in _chain_from_iterable_of_lists
for element in iterable:
File "/SMHS/home/rath/.local/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
yield _result_or_cancel(fs.pop())
File "/SMHS/home/rath/.local/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
return fut.result(timeout)
File "/SMHS/home/rath/.local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/SMHS/home/rath/.local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Again, this happened after the system update, and I suspect it may be due to OpenMP and some problem with the gcc libraries.
@h-mayorquin @alejoe91 here is a conundrum!
The code below runs just fine
# import spikeinterface.full as si
# si.set_global_tmp_folder("spikeiteface.cache")
from spikeinterface.core.generate import generate_lazy_recording
from probeinterface import Probe
import spikeinterface.preprocessing as spre
import numpy as np
import os
os.system('rm -fR base preprocessed')
full_traces_size_GiB = 1.
large_recording = generate_lazy_recording(full_traces_size_GiB=full_traces_size_GiB)
fs = large_recording.get_sampling_frequency()
binary_recording = large_recording.save(folder='base')
recording = binary_recording
start_time_s = 0 # start time in seconds
end_time_s = 600 # end time in seconds
start_frame = start_time_s * fs # start frame
end_frame = end_time_s * fs # end frame
end_frame = min(end_frame, recording.get_num_frames()) # make sure it does not go over the end
# Create the new recording
recording = recording.frame_slice(start_frame, end_frame)
recording.annotate(is_filtered=False)
recording_f = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)
recording_cmr = spre.common_reference(recording_f, operator="median", reference="global")
recording_saved = recording_cmr.save(n_jobs=-1, folder='preprocessed')
$ python test-spikeinterface-orig.py
write_binary_recording with n_jobs = 1 and chunk_size = 30000
write_binary_recording: 100%|###############################################################################################################################| 49/49 [00:21<00:00, 2.32it/s]
write_binary_recording with n_jobs = 40 and chunk_size = 30000
write_binary_recording: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:38<00:00, 1.29it/s]
However, if I `import spikeinterface.full as si`, without even using it, it crashes!
import spikeinterface.full as si
# si.set_global_tmp_folder("spikeiteface.cache")
from spikeinterface.core.generate import generate_lazy_recording
from probeinterface import Probe
import spikeinterface.preprocessing as spre
import numpy as np
import os
os.system('rm -fR base preprocessed')
full_traces_size_GiB = 1.
large_recording = generate_lazy_recording(full_traces_size_GiB=full_traces_size_GiB)
fs = large_recording.get_sampling_frequency()
binary_recording = large_recording.save(folder='base')
recording = binary_recording
start_time_s = 0 # start time in seconds
end_time_s = 600 # end time in seconds
start_frame = start_time_s * fs # start frame
end_frame = end_time_s * fs # end frame
end_frame = min(end_frame, recording.get_num_frames()) # make sure it does not go over the end
# Create the new recording
recording = recording.frame_slice(start_frame, end_frame)
recording.annotate(is_filtered=False)
recording_f = spre.bandpass_filter(recording, freq_min=300, freq_max=6000)
recording_cmr = spre.common_reference(recording_f, operator="median", reference="global")
recording_saved = recording_cmr.save(n_jobs=-1, folder='preprocessed')
$ python test-spikeinterface-orig.py
write_binary_recording with n_jobs = 1 and chunk_size = 30000
write_binary_recording: 100%|###############################################################################################################################| 49/49 [00:21<00:00, 2.33it/s]
write_binary_recording with n_jobs = 40 and chunk_size = 30000
write_binary_recording: 0%| | 0/49 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/lustre/groups/colonneselab/test-spikeinterface-orig.py", line 36, in <module>
recording_saved = recording_cmr.save(n_jobs=-1, folder='/local/preprocessed')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 749, in save
loaded_extractor = self.save_to_folder(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 825, in save_to_folder
cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/spikeinterface/core/baserecording.py", line 444, in _save
write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/spikeinterface/core/core_tools.py", line 314, in write_binary_recording
executor.run()
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/spikeinterface/core/job_tools.py", line 400, in run
for res in results:
File "/SMHS/home/rath/.local/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/SMHS/home/rath/.local/lib/python3.11/concurrent/futures/process.py", line 602, in _chain_from_iterable_of_lists
for element in iterable:
File "/SMHS/home/rath/.local/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/SMHS/home/rath/.local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Note, this problem appeared after the operating system was updated on our cluster. It is currently running the Rocky Linux 8.8 GenericCloud image. This behavior can be reproduced in a virtual machine using qemu-kvm:
wget https://download.rockylinux.org/pub/rocky/8/images/x86_64/Rocky-8-GenericCloud-Base.latest.x86_64.qcow2
virt-install --disk Rocky-8-GenericCloud-Base.latest.x86_64.qcow2 --memory 8192 --vcpus 4 --cloud-init --os-variant rocky8 --import --name RockyLinux-8.8-CloudBase
Log in as user `rocky` (set a password with `passwd rocky`), then install python and power off:
dnf install python3.11 python3.11-pip python3.11-wheel
poweroff -n
Boot the VM with a virtiofs share tagged `shared-drive`, mount it, install spikeinterface, and run the test script:
sudo mount -t virtiofs shared-drive /mnt
cd /mnt
pip3 install --user 'spikeinterface[full]'
python3 (name of the script)
Then add `import spikeinterface.full as si` to the script and run it again:
python3 (name of the script)
Any help with this issue is highly appreciated! -rth
@rat-h Hi, thanks for looking deeper into this. I am on leave right now, so I don't have enough time to reproduce the error with a virtual machine (I ran your script on my system with and without the full import, and it runs fine).
It is indeed very strange that doing the full import generates the error. The first question is: do you need it? I personally never use it in my development and, as you probably know, it is an anti-pattern. That said, it still reveals that something fishy is going on. I suggest the following:
@h-mayorquin sorry to bother you on your leave.
I ran your script on my system with and without the full import, and it runs fine
YES! Same on my desktop computer. It is specific to this particular distribution, Rocky Linux, and I can't get my head around why!
Can you avoid the error if you remove a specific type of preprocessing that you have? (We need to see if there is a specific computation that is generating the error.)
I did a few tests and couldn't find any preprocessing which caused the error.
Can you see what part of the full import is generating the conflict? You can comment out the sub-imports in the full-import script to see exactly what is interfering.
That was a pretty funny game, but after trying all of them one by one: importing any of the sub-modules below causes the problem.
from .postprocessing import *
from .qualitymetrics import *
from .curation import *
from .comparison import *
from .widgets import *
from .exporters import *
Surprisingly, none of them are about preprocessing!
Does the error show up with python 3.10?
Yes. I have tried the dnf packages for both python3.11 and python3.9.
Could you try using a different mp_context ("spawn") to see if the error still shows up?
If I add `mp_context="spawn"` to the last `save` call, the error is still there, but something else appeared in the error message.
Code:
recording_saved = recording_cmr.save(n_jobs=-1, folder='preprocessed',mp_context="spawn")
Error:
write_binary_recording: 0%| | 0/9 [00:00<?, ?it/s]
write_binary_recording: 11%|#1 | 1/9 [00:00<00:05, 1.60it/s]
write_binary_recording: 22%|##2 | 2/9 [00:01<00:04, 1.53it/s]
write_binary_recording: 33%|###3 | 3/9 [00:01<00:03, 1.51it/s]
write_binary_recording: 44%|####4 | 4/9 [00:02<00:03, 1.53it/s]
write_binary_recording: 56%|#####5 | 5/9 [00:03<00:02, 1.55it/s]
write_binary_recording: 67%|######6 | 6/9 [00:03<00:01, 1.53it/s]
write_binary_recording: 78%|#######7 | 7/9 [00:04<00:01, 1.52it/s]
write_binary_recording: 89%|########8 | 8/9 [00:05<00:00, 1.51it/s]
write_binary_recording: 100%|##########| 9/9 [00:05<00:00, 1.64it/s]
write_binary_recording: 100%|##########| 9/9 [00:05<00:00, 1.57it/s]
write_binary_recording: 0%| | 0/9 [00:00<?, ?it/s]
write_binary_recording: 0%| | 0/9 [00:00<?, ?it/s]
write_binary_recording: 11%|#1 | 1/9 [00:00<00:04, 1.76it/s]
write_binary_recording: 0%| | 0/9 [00:00<?, ?it/s]
Traceback (most recent call last):
File "<string>", line 1, in <module>
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 120, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 129, in _main
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 129, in _main
prepare(preparation_data)
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 240, in prepare
prepare(preparation_data)
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 240, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 291, in _fixup_main_from_path
_fixup_main_from_path(data['init_main_from_path'])
File "/usr/lib64/python3.11/multiprocessing/spawn.py", line 291, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "/mnt/test-spikeinterface-orig.py", line 19, in <module>
File "<frozen runpy>", line 88, in _run_code
File "/mnt/test-spikeinterface-orig.py", line 19, in <module>
binary_recording = large_recording.save(folder='base')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 749, in save
binary_recording = large_recording.save(folder='base')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
loaded_extractor = self.save_to_folder(**kwargs)
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 749, in save
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 812, in save_to_folder
loaded_extractor = self.save_to_folder(**kwargs)
assert not folder.exists(), f"folder {folder} already exists, choose another name"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: folder base already exists, choose another name
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 812, in save_to_folder
assert not folder.exists(), f"folder {folder} already exists, choose another name"
AssertionError: folder base already exists, choose another name
write_binary_recording: 0%| | 0/9 [00:02<?, ?it/s]
Traceback (most recent call last):
File "/mnt/test-spikeinterface-orig.py", line 37, in <module>
recording_saved = recording_cmr.save(n_jobs=-1, folder='preprocessed',mp_context="spawn")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 749, in save
loaded_extractor = self.save_to_folder(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/base.py", line 825, in save_to_folder
cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/baserecording.py", line 444, in _save
write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/core_tools.py", line 314, in write_binary_recording
executor.run()
File "/home/rocky/.local/lib/python3.11/site-packages/spikeinterface/core/job_tools.py", line 400, in run
for res in results:
File "/home/rocky/.local/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/usr/lib64/python3.11/concurrent/futures/process.py", line 597, in _chain_from_iterable_of_lists
for element in iterable:
File "/usr/lib64/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
==== UPDATE ====
I left only `from .postprocessing import *` in `spikeinterface.full` and tried to figure out what could cause the problem. Importing any of the sub-sub-modules below caused the error.
from .correlograms import (
CorrelogramsCalculator,
compute_autocorrelogram_from_spiketrain,
compute_crosscorrelogram_from_spiketrain,
compute_correlograms,
correlogram_for_one_segment,
compute_correlograms_numba,
compute_correlograms_numpy,
)
from .isi import (
ISIHistogramsCalculator,
compute_isi_histograms_from_spiketrain,
compute_isi_histograms,
compute_isi_histograms_numpy,
compute_isi_histograms_numba,
)
from .unit_localization import (
compute_unit_locations,
UnitLocationsCalculator,
compute_center_of_mass,
)
Apparently, all of them import `numba`, so if I change the test script so that it merely imports `numba`, the problem appears in preprocessing!
#import spikeinterface.full as si
# si.set_global_tmp_folder("spikeiteface.cache")
# from spikeinterface.core import WaveformExtractor, BaseWaveformExtractorExtension
import numba
I checked that all the `numba` libraries have correct links.
$ for l in $(find . -name "*.so") ; do echo $l ; ldd $l ; echo ; done
./_devicearray.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffff235f000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fad19aed000)
libm.so.6 => /lib64/libm.so.6 (0x00007fad1976b000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fad19553000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fad19333000)
libc.so.6 => /lib64/libc.so.6 (0x00007fad18f6e000)
/lib64/ld-linux-x86-64.so.2 (0x00007fad19e82000)
./_dispatcher.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffee856c000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fac88066000)
libm.so.6 => /lib64/libm.so.6 (0x00007fac87ce4000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fac87acc000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fac878ac000)
libc.so.6 => /lib64/libc.so.6 (0x00007fac874e7000)
/lib64/ld-linux-x86-64.so.2 (0x00007fac883fb000)
./_dynfunc.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffe2bbac000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f43fdd00000)
libc.so.6 => /lib64/libc.so.6 (0x00007f43fd93b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f43fdf20000)
./_helperlib.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffc553f1000)
libm.so.6 => /lib64/libm.so.6 (0x00007f48d12dc000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f48d10bc000)
libc.so.6 => /lib64/libc.so.6 (0x00007f48d0cf7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f48d165e000)
./mviewbuf.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffd86bab000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5920a96000)
libc.so.6 => /lib64/libc.so.6 (0x00007f59206d1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5920cb6000)
./core/runtime/_nrt_python.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffebd7af000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f6828bb4000)
libm.so.6 => /lib64/libm.so.6 (0x00007f6828832000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f682861a000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f68283fa000)
libc.so.6 => /lib64/libc.so.6 (0x00007f6828035000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6828f49000)
./core/typeconv/_typeconv.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffc529fb000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f4a09258000)
libm.so.6 => /lib64/libm.so.6 (0x00007f4a08ed6000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f4a08cbe000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4a08a9e000)
libc.so.6 => /lib64/libc.so.6 (0x00007f4a086d9000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4a095ed000)
./cuda/cudadrv/_extras.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffd421fc000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb05e3d6000)
libc.so.6 => /lib64/libc.so.6 (0x00007fb05e011000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb05e5f6000)
./experimental/jitclass/_box.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffeb9e49000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5a82720000)
libc.so.6 => /lib64/libc.so.6 (0x00007f5a8235b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5a82940000)
./np/ufunc/_internal.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffecf18e000)
libm.so.6 => /lib64/libm.so.6 (0x00007fc7003d6000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fc7001b6000)
libc.so.6 => /lib64/libc.so.6 (0x00007fc6ffdf1000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc700758000)
./np/ufunc/_num_threads.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007fffde9d9000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f94ab491000)
libc.so.6 => /lib64/libc.so.6 (0x00007f94ab0cc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f94ab6b1000)
./np/ufunc/omppool.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffc97df8000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f70d3fc9000)
libm.so.6 => /lib64/libm.so.6 (0x00007f70d3c47000)
libgomp.so.1.0.0 => /lib64/libgomp.so.1.0.0 (0x00007f70d3a0f000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f70d37f7000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f70d35d7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f70d3212000)
/lib64/ld-linux-x86-64.so.2 (0x00007f70d435e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f70d300e000)
./np/ufunc/tbbpool.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffcb4bc1000)
libtbb.so.12 => not found
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f1f08463000)
libm.so.6 => /lib64/libm.so.6 (0x00007f1f080e1000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f1f07ec9000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1f07ca9000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1f078e4000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1f087f8000)
./np/ufunc/workqueue.cpython-311-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffca77e8000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007ff110060000)
libm.so.6 => /lib64/libm.so.6 (0x00007ff10fcde000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff10fac6000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff10f8a6000)
libc.so.6 => /lib64/libc.so.6 (0x00007ff10f4e1000)
/lib64/ld-linux-x86-64.so.2 (0x00007ff1103f5000)
and `libtbb` is missing. However, it is also missing on my desktop computer, where everything works. So what else can it be?!
The problem above was solved by compiling and installing `libffi-3.4.4` locally and then recompiling `python-3.10.12` from source. `spikeinterface` installed after that works as expected.
@rat-h: your fight was very honorable! Sorry that you had to spend so much time on an installation problem.
I do not much like Anaconda on Linux; I prefer system packages + pip + venv. But sometimes it helps to have a clean and easy install with conda.
The version alignment of numba+numpy and numpy+hdbscan is sometimes very hard on Linux.
Be aware that we try to maintain almost-working Anaconda environments here: https://github.com/SpikeInterface/spikeinterface/tree/main/installation_tips They are not always up to date, but they are a good start.
Amazing that you solved it!
I still wonder how the linkage failed, since it only fails on that specific architecture in your virtual machine, right?
@h-mayorquin you are correct. Apparently, Rocky Linux uses lightweight libraries to make cloud images smaller. At first, I thought the problem was in `libtbb`, which is present but is not seen by python and `numba`. However, that was not the source of the problem, so I then had to find which libraries do cause it.
Thinking retrospectively, I am wondering: is there anything we could have done at the spikeinterface level to make this type of error easier to debug?
Any suggestions?
Hm, it's hard to say. Maybe an option to run internal tests for each required module before installation, reporting any errors, would narrow the search for these kinds of errors.
We have CI testing for each module, which is maybe something that could be used to diagnose this.
Maybe we could make this more prominent in the readme, e.g. a "to test your installation" section or something like that.
I am happy that we could pin down your problem. I am closing this issue now.
Running `recording.save(folder=file_path_pre, n_jobs=2, format='binary')` results in the error trace below. Anything above `n_jobs=1` causes this error, so it must have something to do with parallel processing. It seems that my full Python script is rerun (i.e., everything I print to the terminal before calling `.save` is printed again). This causes `.save` to create the output folder multiple times, resulting in the error below.
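The rerun symptom matches what a spawn-style start method does: each worker re-imports the main module, so any unguarded top-level code (prints, folder creation, further `.save` calls) executes again in every child. A standard mitigation, sketched here on a toy task rather than the actual pipeline, is to put the script body behind a main guard:

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def process_chunk(i):
    # stand-in for the per-chunk work done during recording.save(...)
    return i * i

if __name__ == "__main__":
    # Code here runs only in the parent process. Without this guard, a
    # "spawn" start method would re-execute the whole script in each child,
    # re-printing output and re-creating output folders.
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as executor:
        print(list(executor.map(process_chunk, range(5))))
```

With the guard in place, the children import the module without side effects and only execute the function they were handed.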