BUG: Kilosort runs quickly using GUI, but very slow using "run_kilosort.py"

MShreyasStanford commented 4 months ago

Describe the issue:

We are attempting to sort Neuropixels data using the new version of Kilosort, and have successfully sorted the data on our local machine using the Python GUI (in about 30 minutes for a two-hour session). However, when we run the code via run_kilosort.py, the projected completion time starts climbing shortly after the script begins and keeps growing.

Reproduce the bug:

We use CUDA 11.3.r11.3 because other code depends on it. Our Python, NumPy, and SciPy versions are all up to date. We verified that PyTorch is using the GPU. Attached is an example log file.

kilosort4.log

Error message:

No response

Version information:

CUDA 11.3.r11.3, Python 3.10.4, Kilosort 4, Windows 10, Pytorch version 1.12.1+cu113.
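
For anyone reporting a similar problem, the version details and the GPU check mentioned above can be collected in one place. The following is a minimal sketch, not part of Kilosort; `environment_report` is a hypothetical helper name, and it assumes only the standard library plus an optional PyTorch install.

```python
import platform


def environment_report():
    """Collect Python, OS, and PyTorch/CUDA details for a bug report."""
    report = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        # PyTorch is optional for this sketch; record its absence and stop.
        report["torch"] = "not installed"
        return report

    report["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Note that a passing GPU check does not rule out slowness from other sources, as the rest of this thread shows.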

jacobpennington commented 4 months ago

@MShreyasStanford Can you please upload or paste in the script you're using to run it?

calvinleng97 commented 4 months ago

Hey! I'm a collaborator of @MShreyasStanford. Here is the relevant portion of our script:

import sys
from pathlib import Path
from crop_methods import k_most_active_channels, crop_kilosort_output
import kilosort
import numpy as np
import torch
import subprocess
import time
import os
import ctypes

DRIVE_PATH = Path("C:/")
BASE_PATH = Path(DRIVE_PATH, "SGL_DATA", "05_31")
BIN_DIR = Path(BASE_PATH, "imec_raw")
KS_OUTPUT_DIR = Path(BIN_DIR, "kilosort4")
CROPPED_OUTPUT_DIR = Path(BASE_PATH, "oss_training")

BIN_FILE = "240531_g0_t0.imec0.ap.bin"
BIN_META_FILE = "240531_g0_t0.imec0.ap.meta"
CHANNEL_MAP_FILE = "neuropixels_NHP_channel_map_dev_staggered_v1.mat"
CPP_MAIN_FILE = Path(r"C:\Users\Spike Sorter\source\repos\OnlineSpikes_v2\x64\RELEASE", "OnlineSpikes.exe")
SPIKES_OUTPUT_FILE = Path(BASE_PATH, "sorter_output", "spikeOutput.txt")

REGENERATE_KS_OUTPUT = True
K = 100

def parse_bin_meta_file(filename):
    metadata = { }
    with open(filename, 'r') as bin_meta_input:
        n_channels = -1
        for line in bin_meta_input:
            delimited = line.split('=')

            if len(delimited) != 2:
                continue

            key = delimited[0]
            value = delimited[1]

            if key == "nSavedChans":
                metadata["nSavedChans"] = int(value)

    if "nSavedChans" not in metadata:
        print("Error occurred while parsing binary metadata file for nSavedChans.")

    return metadata

# Run Kilosort4

if REGENERATE_KS_OUTPUT:
    print("Using GPU for kilosort " if torch.cuda.is_available() else "Torch not detecting GPU")
    print("Starting kilosort...")
    metadata = parse_bin_meta_file(BIN_DIR / BIN_META_FILE)
    print(metadata)
    start_time = time.time()
    ks_settings = {'data_dir': str(BIN_DIR), 'n_chan_bin': metadata["nSavedChans"]}

    kilosort.run_kilosort(
        settings=ks_settings,
        probe_name=BIN_DIR / CHANNEL_MAP_FILE,
        results_dir=str(KS_OUTPUT_DIR)
    )
    end_time = time.time()  # End timing
    duration = end_time - start_time
    print(f"Kilosort took {duration} seconds.")
else:
    print(f"Skipping Kilosort, using cropped templates in {str(CROPPED_OUTPUT_DIR)}")

# more code...

I can confirm that parse_bin_meta_file() properly detects the correct value for n_chan_bin and that Kilosort is using the GPU as evidenced in Kilosort's logs.

For our data, the progress bar shows 1106 iterations needed for completion. The expected completion time typically climbs to around 50 minutes by about iteration 130, then stays at around 50 minutes until about iteration 170. After that, the expected completion time grows roughly linearly and seemingly without bound. For example, the expected completion time at iteration 330 is typically around 5 hours, with the time per iteration at about 30 seconds (also increasing roughly linearly).

jacobpennington commented 4 months ago

Thanks. I don't see anything in the script that would explain the difference, so I have a few follow-up questions:

1) Which step does the slowdown happen at? It looks like the log file you uploaded stops at drift correction. Was it freezing there, so you cancelled the sorting?

2) Just to double check, this is the exact same binary file that you ran in the GUI and then through the API? Or is it different data from a similar experiment?

3) Would you be able to share the data, or a sample of it, so that I can try to reproduce the issue?

calvinleng97 commented 4 months ago

Apologies, I found the issue right after replying to the post.

The cause was that we were running the script in IDLE, the editor and shell bundled with Python. From the command line, Kilosort completes the "Computing drift correction" step in exactly 11 minutes and is projected to finish in about the same time as the GUI run, whereas in IDLE it was projected to essentially never finish.

From my investigations, the cause of the issue may be related to how IDLE prints text (i.e., the progress bar) to the console. I suspect this because the expected computation time decreased dramatically the moment I maximized my IDLE output screen. Here are some threads that talk about similar issues:

https://stackoverflow.com/questions/2212722/python-why-is-idle-so-slow
https://stackoverflow.com/questions/12851313/python-why-idle-is-slower-than-terminal
https://superuser.com/questions/557056/speed-python-script-on-command-line-vs-launched-in-idle-shell
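
One way to test the console-output hypothesis is to time a burst of progress-bar-style writes to the console and compare it with the same writes to an in-memory stream. The sketch below is illustrative only and does not use Kilosort's own progress bar. The helper names are hypothetical, and checking for `idlelib.run` in `sys.modules` is a common heuristic for detecting IDLE, not an official API.

```python
import io
import sys
import time


def running_in_idle():
    """Heuristic: IDLE's shell process imports idlelib.run."""
    return "idlelib.run" in sys.modules


def time_progress_writes(stream, n_updates=200, width=60):
    """Return seconds spent writing n_updates carriage-return progress lines."""
    start = time.perf_counter()
    for i in range(1, n_updates + 1):
        filled = width * i // n_updates
        bar = "#" * filled + "-" * (width - filled)
        # "\r" redraws the same line, as a typical text progress bar does.
        stream.write(f"\r[{bar}] {i}/{n_updates}")
        stream.flush()
    stream.write("\n")
    return time.perf_counter() - start


if __name__ == "__main__":
    baseline = time_progress_writes(io.StringIO())
    console = time_progress_writes(sys.stdout)
    print(f"Running in IDLE: {running_in_idle()}")
    print(f"In-memory stream: {baseline:.4f} s, console: {console:.4f} s")
```

If the console figure is far larger than the in-memory one, and grows as more output accumulates, that would be consistent with the slowdown coming from how the shell renders output and not from the sorting itself.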

jacobpennington commented 4 months ago

Ahhh interesting, I'll look into that. Thank you!

shahafweissMPI commented 4 months ago

I get the same issue when running Kilosort4 from Spyder with SpikeInterface: it takes a whole day to sort 1.5 hours of NP1.0 data. It is much faster from the GUI, but then I cannot turn off preprocessing and CAR.

jacobpennington commented 4 months ago

@shahafweissMPI Please try running it from a terminal or a different IDE. Spyder likely has the same issue that IDLE does.
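
If leaving the IDE is inconvenient, one possible workaround is to launch the sorting script as a separate process and send its output to a log file, so the IDE console never has to render the progress bar. This is a minimal sketch based on the hypothesis in this thread, and it has not been confirmed to avoid the slowdown. The script name `run_sorting.py`, the log file name, and both helper functions are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_command(script_path):
    """Command that runs script_path with the current Python interpreter."""
    return [sys.executable, str(script_path)]


def run_outside_console(script_path, log_path):
    """Run the script in a child process, writing its output to log_path."""
    with open(log_path, "w") as log_file:
        completed = subprocess.run(
            build_command(script_path),
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge errors into the same log
        )
    return completed.returncode


if __name__ == "__main__":
    script = Path("run_sorting.py")  # hypothetical script name
    if script.exists():
        code = run_outside_console(script, "sorting_output.log")
        print(f"Script exited with return code {code}")
    else:
        print(f"{script} not found; point this at your own script.")
```

Running the same command directly in a terminal, as suggested above, remains the simplest option.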
