SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Kilosort4 Support on MacOS in Docker #2746

Open tabedzki opened 6 months ago

tabedzki commented 6 months ago

Feature you'd like to see:

I would like support for macOS. I am using an M-series chip to develop for our lab and to test new tools locally before deploying them to our server. The instructions call for the CUDA Python package, but Macs do not support NVIDIA hardware, making it difficult or unintuitive to test the package locally.

Additional Context

I reached out to the Kilosort4 maintainer and he responded that this is a SI issue: https://github.com/MouseLand/Kilosort/issues/674.

When I use the Kilosort4 package through the SpikeInterface software, I end up with the following error:

(dj3.9) ➜  Official_Tutorial_SI_0.99_Nov23 git:(master) ✗ pip install cuda
ERROR: Could not find a version that satisfies the requirement cuda (from versions: none)
ERROR: No matching distribution found for cuda
(dj3.9) ➜  Official_Tutorial_SI_0.99_Nov23 git:(master) ✗ pip install cuda-python
ERROR: Could not find a version that satisfies the requirement cuda-python (from versions: none)
ERROR: No matching distribution found for cuda-python

When going through spikeinterface:


sorter_params = si.get_default_sorter_params('kilosort4')
print(sorter_params)

# run spike sorting on entire recording

sorting_MS4 = si.run_sorter(sorter_name='kilosort4', recording=recording_saved, remove_existing_folder=True,
                             output_folder=base_folder / 'results_KS4',
                             verbose=True, **sorter_params, docker_image=True)

{
    "name": "Exception",
    "message": "This sorter requires cuda, but the package 'cuda-python' is not installed. You can install it with:
pip install cuda-python",
    "stack": "---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File /opt/anaconda3/envs/dj3.9/lib/python3.9/site-packages/spikeinterface/sorters/utils/misc.py:70, in has_nvidia()
     69 try:
---> 70     from cuda import cuda
     71 except ModuleNotFoundError as err:

ModuleNotFoundError: No module named 'cuda'

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
Cell In[56], line 3
      1 # run spike sorting on entire recording
----> 3 sorting_MS4 = si.run_sorter(sorter_name='kilosort4', recording=recording_saved, remove_existing_folder=True,
      4                              output_folder=base_folder / 'results_KS4',
      5                              verbose=True, **sorter_params, docker_image=True)

File /opt/anaconda3/envs/dj3.9/lib/python3.9/site-packages/spikeinterface/sorters/runsorter.py:169, in run_sorter(sorter_name, recording, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, docker_image, singularity_image, delete_container_files, with_output, **sorter_params)
    167         else:
    168             container_image = singularity_image
--> 169     return run_sorter_container(
    170         container_image=container_image,
    171         mode=mode,
    172         **common_kwargs,
    173     )
    175 return run_sorter_local(**common_kwargs)

File /opt/anaconda3/envs/dj3.9/lib/python3.9/site-packages/spikeinterface/sorters/runsorter.py:438, in run_sorter_container(sorter_name, recording, mode, container_image, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, with_output, delete_container_files, extra_requirements, installation_mode, spikeinterface_version, spikeinterface_folder_source, **sorter_params)
    436     extra_kwargs[\"container_requires_gpu\"] = True
    437 elif gpu_capability == \"nvidia-optional\":
--> 438     if has_nvidia():
    439         extra_kwargs[\"container_requires_gpu\"] = True
    440     else:

File /opt/anaconda3/envs/dj3.9/lib/python3.9/site-packages/spikeinterface/sorters/utils/misc.py:72, in has_nvidia()
     70     from cuda import cuda
     71 except ModuleNotFoundError as err:
---> 72     raise Exception(
     73         \"This sorter requires cuda, but the package 'cuda-python' is not installed. You can install it with:\
pip install cuda-python\"
     74     ) from err
     76 try:
     77     (cu_result_init,) = cuda.cuInit(0)

Exception: This sorter requires cuda, but the package 'cuda-python' is not installed. You can install it with:
pip install cuda-python"
}
tabedzki commented 6 months ago

From what I've seen of the KS documentation, the Kilosort4 algorithm itself supports running on the CPU if a GPU isn't available, though one might have to actively choose to use the CPU.

zm711 commented 6 months ago

Are your datasets small enough that you could run them on the CPU? For full-scale sorting the CPU is not a great option, so, as was mentioned in the original issue, for any real data you should use a computer/server with an NVIDIA card.

zm711 commented 6 months ago

Based on Sam's comment in that thread, it definitely looks like you could try to run this locally; the cuda check only happens for container runs. If you try running this locally instead (docker_image=False), we can see whether it works.
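
For reference, a minimal local call might look like this (a sketch, assuming the `recording_saved` and `base_folder` variables from the tutorial notebook):

# Sketch: run Kilosort4 locally, bypassing the container and its has_nvidia() check
import spikeinterface.full as si

sorting_KS4_local = si.run_sorter(
    sorter_name='kilosort4',
    recording=recording_saved,
    output_folder=base_folder / 'results_KS4_local',
    remove_existing_folder=True,
    verbose=True,
    docker_image=False,
)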

tabedzki commented 6 months ago

From what I remember, I had a kernel crash yesterday when I didn't use the docker image. I can test this tomorrow to see if it is still an issue.

zm711 commented 6 months ago

If the kernel is crashing locally then it might be an issue in 1) Kilosort or 2) PyTorch for macOS. The Docker path definitely has a CUDA check in our codebase, so we would need to change that check for Docker + PyTorch to work on macOS.

tabedzki commented 6 months ago

@zm711

I tried it again without the docker image and it crashed. I'm happy to share the notebook with you. Here's the log dump:

14:47:55.767 [info] Cell 101 completed in 0.012s (start: 1714070875754, end: 1714070875766)
14:48:00.815 [info] Handle Execution of Cells 101 for ~/code/spiketutorials/Official_Tutorial_SI_0.99_Nov23/SpikeInterface_Tutorial.ipynb
14:48:00.956 [error] Disposing session as kernel process died ExitCode: undefined, Reason: OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.

14:48:00.969 [info] Cell 101 completed in -1714070880.831s (start: 1714070880831, end: undefined)
zm711 commented 6 months ago

We might need to see the script/notebook. This should use the CPU in this case. Which version of spikeinterface are you using?

did you explicitly set this just in case:

https://github.com/SpikeInterface/spikeinterface/blob/163ab282645e62fbfd75cd845eb7755da187c787/src/spikeinterface/sorters/external/kilosort4.py#L54

tabedzki commented 6 months ago

I did not explicitly set that. Where should I set that / pass it in?

zm711 commented 6 months ago
sorting = ss.run_sorter('kilosort4', recording, torch_device='cpu')

That way we ensure that the device is set to cpu for everything.

tabedzki commented 6 months ago

When I pass that argument in, I get Bad parameters: ['torch_device']

Here is the gist for the notebook that I ran (can't upload in this thread directly) https://gist.github.com/tabedzki/4352df1e71d12c2a4cc5df3c94888023

zm711 commented 6 months ago

That feature was added to main but hasn't been backported to the 0.100 series yet, so that's why that parameter didn't work. Running locally should automatically fall back to the CPU. If you do

import torch
torch.cuda.is_available()

It prints False, right?

tabedzki commented 6 months ago

Correct, it prints False

zm711 commented 6 months ago

Last question: does Kilosort4 work by itself, without the spikeinterface wrapper? If it also doesn't work, then the issue would be on their side.

tabedzki commented 6 months ago

Kilosort4 works without the spikeinterface wrapper. I ran the tutorial from the Kilosort website and it correctly reports Using CPU for PyTorch computations.

from kilosort import run_kilosort

# NOTE: 'n_chan_bin' is a required setting, and should reflect the total number
#       of channels in the binary file. For information on other available
#       settings, see `kilosort.run_kilosort.default_settings`.
settings = {'data_dir': SAVE_PATH.parent, 'n_chan_bin': 385}

ops, st, clu, tF, Wall, similar_templates, is_ref, est_contam_rate = \
    run_kilosort(settings=settings, probe_name='neuropixPhase3B1_kilosortChanMap.mat')

Cell output:

Interpreting binary file as default dtype='int16'. If data was saved in a different format, specify `data_dtype`.
Using CPU for PyTorch computations. Specify `device` to change this.
sorting /Users/tabedzki/ZFM-02370_mini.imec0.ap.bin
using probe neuropixPhase3B1_kilosortChanMap.mat
/opt/anaconda3/envs/dj3.9/lib/python3.9/site-packages/kilosort/io.py:497: UserWarning: The given NumPy array is not writable, and PyTorch does not support non-writable tensors. This means writing to this tensor will result in undefined behavior. You may want to copy the array to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:212.)
  X[:, self.nt : self.nt+nsamp] = torch.from_numpy(data).to(self.device).float()
Preprocessing filters computed in  1.71s; total  1.71s

computing drift
Re-computing universal templates from data.
 11%|█         | 5/45 [00:53<07:02, 10.55s/it]
zm711 commented 6 months ago

I use a Mac as my personal laptop, so I can try to troubleshoot this (the SI wrapper, I mean) over the weekend.

tabedzki commented 6 months ago

Sounds good, thanks!

tabedzki commented 6 months ago

As a related but different issue,

I copied the spiketutorials notebooks over to one of the Linux clusters, ran the same Nov23 tutorial, and tried running kilosort4 locally without the Docker image. I received a new error, different from the previous one. I went back and tried the Kilosort4 tutorial available on their website, and that ran without any trouble. I can upload a gist of the notebook. The only modifications I made (aside from installing the required packages, including mountainsort and kilosort) were the folder path of the data and the type of sorter being run locally. Any help on this would be greatly appreciated. Please let me know what other information I should provide.

# sorter_params = {'do_correction': False}
import json

# run spike sorting on entire recording
sorter_params = si.get_default_sorter_params('kilosort4')
print(sorter_params)

# Convert the data to a JSON formatted string with 4 spaces of indentation
json_str = json.dumps(sorter_params, indent=4)

# Print the pretty-printed JSON string
print(json_str)

sorting_KS4_no_docker = si.run_sorter(sorter_name='kilosort4', recording=recording_saved, remove_existing_folder=True,
                             output_folder=base_folder / 'results_KS4_no_docker',
                             verbose=True, **sorter_params, docker_image=False)
{'batch_size': 60000, 'nblocks': 1, 'Th_universal': 9, 'Th_learned': 8, 'do_CAR': True, 'invert_sign': False, 'nt': 61, 'artifact_threshold': None, 'nskip': 25, 'whitening_range': 32, 'binning_depth': 5, 'sig_interp': 20, 'nt0min': None, 'dmin': None, 'dminx': None, 'min_template_size': 10, 'template_sizes': 5, 'nearest_chans': 10, 'nearest_templates': 100, 'templates_from_data': True, 'n_templates': 6, 'n_pcs': 6, 'Th_single_ch': 6, 'acg_threshold': 0.2, 'ccg_threshold': 0.25, 'cluster_downsampling': 20, 'cluster_pcs': 64, 'duplicate_spike_bins': 15, 'do_correction': True, 'keep_good_only': False, 'save_extra_kwargs': False, 'skip_kilosort_preprocessing': False, 'scaleproc': None}
{
    "batch_size": 60000,
    "nblocks": 1,
    "Th_universal": 9,
    "Th_learned": 8,
    "do_CAR": true,
    "invert_sign": false,
    "nt": 61,
    "artifact_threshold": null,
    "nskip": 25,
    "whitening_range": 32,
    "binning_depth": 5,
    "sig_interp": 20,
    "nt0min": null,
    "dmin": null,
    "dminx": null,
    "min_template_size": 10,
    "template_sizes": 5,
    "nearest_chans": 10,
    "nearest_templates": 100,
    "templates_from_data": true,
    "n_templates": 6,
    "n_pcs": 6,
    "Th_single_ch": 6,
    "acg_threshold": 0.2,
    "ccg_threshold": 0.25,
    "cluster_downsampling": 20,
    "cluster_pcs": 64,
    "duplicate_spike_bins": 15,
    "do_correction": true,
    "keep_good_only": false,
    "save_extra_kwargs": false,
    "skip_kilosort_preprocessing": false,
    "scaleproc": null
}
========================================
Loading recording with SpikeInterface...
number of samples: 9000000
number of channels: 49
numbef of segments: 1
sampling rate: 30000.0
dtype: int16
========================================
Preprocessing filters computed in  1.86s; total  1.86s

computing drift
Re-computing universal templates from data.
Error running kilosort4

The Error

{
    "name": "SpikeSortingError",
    "message": "Spike sorting error trace:
Traceback (most recent call last):
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/basesorter.py\", line 258, in run_from_folder
    SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/external/kilosort4.py\", line 227, in _run_from_folder
    ops, bfile, st0 = compute_drift_correction(
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/run_kilosort.py\", line 350, in compute_drift_correction
    ops, st = datashift.run(ops, bfile, device=device, progress_bar=progress_bar)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/datashift.py\", line 192, in run
    st, _, ops  = spikedetect.run(ops, bfile, device=device, progress_bar=progress_bar)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/spikedetect.py\", line 198, in run
    ops = template_centers(ops)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/spikedetect.py\", line 98, in template_centers
    nx = np.round((xmax - xmin) / (dminx/2)) + 1
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

Spike sorting failed. You can inspect the runtime trace in /home/tabedzki/code/spiketutorials/Official_Tutorial_SI_0.99_Nov23/results_KS4_no_docker/spikeinterface_log.json.",
    "stack": "---------------------------------------------------------------------------
SpikeSortingError                         Traceback (most recent call last)
Cell In[50], line 15
     11 # Print the pretty-printed JSON string
     12 print(json_str)
---> 15 sorting_KS4_no_docker = si.run_sorter(sorter_name='kilosort4', recording=recording_saved, remove_existing_folder=True,
     16                              output_folder=base_folder / 'results_KS4_no_docker',
     17                              verbose=True, **sorter_params, docker_image=False)

File ~/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/runsorter.py:175, in run_sorter(sorter_name, recording, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, docker_image, singularity_image, delete_container_files, with_output, **sorter_params)
    168             container_image = singularity_image
    169     return run_sorter_container(
    170         container_image=container_image,
    171         mode=mode,
    172         **common_kwargs,
    173     )
--> 175 return run_sorter_local(**common_kwargs)

File ~/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/runsorter.py:225, in run_sorter_local(sorter_name, recording, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, with_output, **sorter_params)
    223 SorterClass.set_params_to_folder(recording, output_folder, sorter_params, verbose)
    224 SorterClass.setup_recording(recording, output_folder, verbose=verbose)
--> 225 SorterClass.run_from_folder(output_folder, raise_error, verbose)
    226 if with_output:
    227     sorting = SorterClass.get_result_from_folder(output_folder, register_recording=True, sorting_info=True)

File ~/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/basesorter.py:293, in BaseSorter.run_from_folder(cls, output_folder, raise_error, verbose)
    290         print(f\"{sorter_name} run time {run_time:0.2f}s\")
    292 if has_error and raise_error:
--> 293     raise SpikeSortingError(
    294         f\"Spike sorting error trace:\
{log['error_trace']}\
\"
    295         f\"Spike sorting failed. You can inspect the runtime trace in {output_folder}/spikeinterface_log.json.\"
    296     )
    298 return run_time

SpikeSortingError: Spike sorting error trace:
Traceback (most recent call last):
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/basesorter.py\", line 258, in run_from_folder
    SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/spikeinterface/sorters/external/kilosort4.py\", line 227, in _run_from_folder
    ops, bfile, st0 = compute_drift_correction(
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/run_kilosort.py\", line 350, in compute_drift_correction
    ops, st = datashift.run(ops, bfile, device=device, progress_bar=progress_bar)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/datashift.py\", line 192, in run
    st, _, ops  = spikedetect.run(ops, bfile, device=device, progress_bar=progress_bar)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/spikedetect.py\", line 198, in run
    ops = template_centers(ops)
  File \"/home/tabedzki/miniconda3/envs/si_env/lib/python3.9/site-packages/kilosort/spikedetect.py\", line 98, in template_centers
    nx = np.round((xmax - xmin) / (dminx/2)) + 1
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

Spike sorting failed. You can inspect the runtime trace in /home/tabedzki/code/spiketutorials/Official_Tutorial_SI_0.99_Nov23/results_KS4_no_docker/spikeinterface_log.json."
}
zm711 commented 6 months ago

That's a Kilosort4 error: they changed the argument. We have a fix but haven't released it yet; it will be in 0.100.6. You can get past it by explicitly setting dminx (their new default is 32).

zm711 commented 6 months ago

And we don't have a patch for Kilosort 4.0.5 yet, so that version won't work at all (but it seems like you are working on 4.0.4). Setting dminx explicitly should fix that error.
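
For example, passing it through the wrapper as a sorter parameter (a sketch, reusing the `recording_saved` and `base_folder` variables from the notebook):

# Sketch: set dminx explicitly so the KS 4.0.4 default change doesn't leave it as None
sorting_KS4 = si.run_sorter(sorter_name='kilosort4', recording=recording_saved,
                            output_folder=base_folder / 'results_KS4_no_docker',
                            docker_image=False, dminx=32)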

zm711 commented 6 months ago

@tabedzki

I was unable to reproduce the error on macOS.

To test I used spikeinterface to generate a simulated recording:

>>> rec, sorting = si.generate_ground_truth_recording(num_channels=64, sampling_frequency=30_000.0)
>>> rec
InjectTemplatesRecording: 64 channels - 30.0kHz - 1 segments - 300,000 samples - 10.00s 
                          float32 dtype - 73.24 MiB
>>> sorting_ks = si.run_sorter('kilosort4', rec, './test', dminx=32)
========================================
Loading recording with SpikeInterface...
number of samples: 300000
number of channels: 64
numbef of segments: 1
sampling rate: 30000.0
dtype: float32
========================================
Preprocessing filters computed in  0.36s; total  0.36s

computing drift
Re-computing universal templates from data.
/Users/zacharymckenzie/opt/anaconda3/envs/kilosort_test/lib/python3.10/site-packages/threadpoolctl.py:1223: RuntimeWarning: 
Found Intel OpenMP ('libiomp') and LLVM OpenMP ('libomp') loaded at
the same time. Both libraries are known to be incompatible and this
can cause random crashes or deadlocks on Linux when loaded in the
same Python program.
Using threadpoolctl may cause crashes or deadlocks. For more
information and possible workarounds, please see
    https://github.com/joblib/threadpoolctl/blob/master/multiple_openmp.md

  warnings.warn(msg, RuntimeWarning)
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:06<00:00,  1.34s/it]
drift computed in  8.19s; total  8.55s

Extracting spikes using templates
Re-computing universal templates from data.
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:06<00:00,  1.31s/it]
1078 spikes extracted in  7.91s; total  16.45s

First clustering
100%|████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 1321.77it/s]
10 clusters found, in  0.03s; total  16.48s

Extracting spikes using cluster waveforms
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:02<00:00,  2.48it/s]
1232 spikes extracted in  2.06s; total  18.54s

Final clustering
100%|████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 3467.08it/s]
6 clusters found, in  0.01s; total  18.55s

Merging clusters
6 units found, in  1.46s; total  20.01s

Saving to phy and computing refractory periods
4 units found with good refractory periods

Total runtime: 20.03s = 00:00:20 h:m:s

>>> import platform
>>> platform.system()
'Darwin'
>>> si.__version__
'0.101.0'

The Kilosort version I used was 4.0.4 (since we don't have a patch for 4.0.5 yet). I'm using an M1 chip on this computer with the latest macOS. I'm wondering if the problem for you might be Jupyter-to-macOS communication... I ran mine straight through a Python REPL for this test.

zm711 commented 6 months ago

My other hypothesis is that you exhausted your RAM. I made a tiny simulated dataset (~73 MiB), but for a real dataset of several GB maybe you don't have enough memory on your computer. Could you try running KS4 through the SI wrapper and use Activity Monitor to watch your memory usage?
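
If watching Activity Monitor during the run is awkward, a rough in-process check is possible too (a sketch; psutil is a third-party package and not a spikeinterface dependency):

import psutil

# Rough check of free memory right before launching the sorter
vm = psutil.virtual_memory()
print(f"available: {vm.available / 1e9:.1f} GB of {vm.total / 1e9:.1f} GB")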

tabedzki commented 6 months ago

@zm711 thank you for getting back to me and offering suggestions. However, I am unable to get even your simple version working. Please see below for more information. Thank you in advance for any help.

This was tested on Python 3.9, 3.10, 3.11.

Package versions:

conda list
spikeinterface            0.100.6                  pypi_0    pypi
kilosort                  4.0.4                    pypi_0    pypi

Trying your commands directly from the terminal, I get the following error:

(si_env) ➜  testing-spikeinterface ipython
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:34:54) [Clang 16.0.6 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.22.2 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import spikeinterface.full as si
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.

In [2]: rec, sorting = si.generate_ground_truth_recording(num_channels=64, sampling_frequency=30_000.0)

In [3]: rec
Out[3]:
InjectTemplatesRecording: 64 channels - 30.0kHz - 1 segments - 300,000 samples - 10.00s
                          float32 dtype - 73.24 MiB

In [4]: sorting_ks = si.run_sorter('kilosort4', rec, './test', dminx=32)
========================================
Loading recording with SpikeInterface...
number of samples: 300000
number of channels: 64
numbef of segments: 1
sampling rate: 30000.0
dtype: float32
========================================
[1]    64156 segmentation fault  python3 -c "import IPython, sys; sys.exit(IPython.start_ipython())"
/opt/anaconda3/envs/si_env/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

This is the system state right before I run command [4] (screenshot attached: Screenshot 2024-05-01 at 2 01 01 PM).

zm711 commented 6 months ago

Could you try just with python?

(si_env) python
>>> 

You are testing with ipython; I actually tested mine with a plain python REPL. I'm wondering if this is an ipython + mac issue. I can test that when I'm home later today (i.e. rerun my test in ipython rather than python).

tabedzki commented 6 months ago

Same error unfortunately.

(si_env) ➜  testing-spikeinterface python --version
Python 3.9.19
(si_env) ➜  testing-spikeinterface cat sample_instructions.py
import spikeinterface.full as si
rec, sorting = si.generate_ground_truth_recording(num_channels=64, sampling_frequency=30_000.0)
rec
sorting_ks = si.run_sorter('kilosort4', rec, './test', dminx=32)
(si_env) ➜  testing-spikeinterface python sample_instructions.py
========================================
Loading recording with SpikeInterface...
number of samples: 300000
number of channels: 64
numbef of segments: 1
sampling rate: 30000.0
dtype: float32
========================================
[1]    61624 segmentation fault  python sample_instructions.py
/opt/anaconda3/envs/si_env/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Interactively:

  (si_env) ➜  testing-spikeinterface python
Python 3.9.19 | packaged by conda-forge | (main, Mar 20 2024, 12:55:20)
[Clang 16.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import spikeinterface.full as si
>>> rec, sorting = si.generate_ground_truth_recording(num_channels=64, sampling_frequency=30_000.0)
>>> rec
InjectTemplatesRecording: 64 channels - 30.0kHz - 1 segments - 300,000 samples - 10.00s
                          float32 dtype - 73.24 MiB
>>> sorting_ks = si.run_sorter('kilosort4', rec, './test', dminx=32)
========================================
Loading recording with SpikeInterface...
number of samples: 300000
number of channels: 64
numbef of segments: 1
sampling rate: 30000.0
dtype: float32
========================================
[1]    62580 segmentation fault  python
/opt/anaconda3/envs/si_env/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
zm711 commented 6 months ago

Since you said Kilosort4 works natively, would you mind trying to install spikeinterface from source (i.e. 0.101.0) with kilosort? I would do:

conda create -n kilotest python=3.10
conda activate kilotest
pip install kilosort==4.0.4
cd spikeinterface
pip install -e ".[full,widgets]"

If you don't know how to download the source code, just let me know. I always install from source, so maybe we have a bug in the wrapper for Macs in 0.100.x that we will need to track down. If we can test on main and it works, then we'll know it's a bug in 0.100.x; if it fails on main, then I'll have to play around a bit more. Which chip do you have?
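
Once that environment is set up, a quick way to confirm which versions are actually being imported (a sketch):

from importlib.metadata import version
import spikeinterface

# Expecting spikeinterface 0.101.0 (installed from the main branch) and kilosort 4.0.4
print(spikeinterface.__version__)
print(version("kilosort"))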

tabedzki commented 6 months ago

0.101.0 works with both python and ipython while 0.100.x did not.

(kilotest) ➜  spikeinterface git:(main) ✗ conda list | rg -e kilosort -e spikeinterface
kilosort                  4.0.4                    pypi_0    pypi
spikeinterface            0.101.0                  pypi_0    pypi

The chip is M3

zm711 commented 6 months ago

@tabedzki

Are you okay with using 0.101.0? I'm a bit too busy to carefully track down the bug in 0.100.x on my Mac. I still might get to it, but we have a SpikeSorting Conference at the end of May, so to be honest this type of segfault debugging likely wouldn't happen until June. And at that point we may have moved on to the 0.101.0 release (not sure of its release date yet).

tabedzki commented 6 months ago

Yes, I don't see it being a problem for now! I am glad we were able to get to a working state. Thanks for your help. I'll leave this as an open ticket.

zm711 commented 6 months ago

Sounds good. If I get the time I'll try to track it down, and if we end up moving fully to 0.101 before I have time, I'll close this after the release.

tabedzki commented 6 months ago

Thanks Zach!

tabedzki commented 6 months ago

@zm711 just in case you come back to this later: the Docker image requires CUDA, which is why I had to do the local installation rather than Docker. What we have here works; I just wanted to reiterate that.

Is there a way to modify the image (after the conference) to allow for non-CUDA systems?

zm711 commented 6 months ago

Totally forgot that was your original problem, oops. Let me ping @alejoe91 back in, since I don't work with the Docker stuff at all. He would be better placed to comment on whether modifying the Docker wrapper code is feasible.

alejoe91 commented 6 months ago

I can push a fix that disables the GPU if cuda-python fails to import, so you should be able to run KS4 in CPU mode for testing.
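
A minimal sketch of what such a change to `has_nvidia()` could look like (not the actual patch; based on the check shown in the traceback earlier in this thread):

def has_nvidia() -> bool:
    """Return True only if cuda-python is importable and reports at least one device."""
    try:
        from cuda import cuda
    except ModuleNotFoundError:
        # cuda-python is not installed (e.g. on macOS): assume no GPU and fall back
        # to CPU instead of raising, so container runs can still proceed
        return False
    (result_init,) = cuda.cuInit(0)
    result_count, device_count = cuda.cuDeviceGetCount()
    return device_count > 0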

tabedzki commented 6 months ago

That would be greatly appreciated. No immediate need as I won't get around to testing the code until Monday.

h-mayorquin commented 3 months ago

Did we ever get this pushed? @alejoe91