SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Error in run_sorter : file spikeinterface_log.json empty #2452

Open Mymi-INS opened 9 months ago

Mymi-INS commented 9 months ago

Hi everyone,

I have an issue when running my SI pipeline. The error is JSONDecodeError: Expecting value: line 1 column 1 (char 0). I checked, and it's because the file spikeinterface_log.json is empty. I don't understand why it's empty, because every other file that SI creates, like spikeinterface_recording.json and _params.json, is not empty. Can you help me with that, please?

Thanks :)

Myriam

alejoe91 commented 9 months ago

Hi @Mymi-INS

Can you print the output of your script? Where is the error triggered?

Mymi-INS commented 9 months ago

Hi @alejoe91,

My terminal says:

File "mypath/SI_Pipeline.py", line 198, in <module>
    run_cluster_cutting(config)
  File "mypath/SI_Pipeline.py", line 160, in run_cluster_cutting
    _ = si.run_sorter("kilosort2_5",rec,output_folder=SI_output_path,verbose=True,docker_image=True,**params_kilosort2_5)
  File "path/lib/python3.10/site-packages/spikeinterface/sorters/runsorter.py", line 141, in run_sorter
    return run_sorter_container(
  File "path/lib/python3.10/site-packages/spikeinterface/sorters/runsorter.py", line 600, in run_sorter_container
    log = json.load(f)
  File "path/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "path/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "path/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "path/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I checked what this error means, and it's because my JSON is empty. But I couldn't find where in SI this JSON is supposed to be created, to see if I can do something about it.
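
For reference, json.load raises this exact error on any empty file, so the failure is SpikeInterface reading back a log file that was left empty. A minimal sketch to check the file directly (the path is only an example, adjust it to your output_folder):

import json
from pathlib import Path

# Example location only: the log sits next to the other files in the
# output_folder passed to run_sorter.
log_path = Path("SpikeInterface_output/spikeinterface_log.json")

if not log_path.exists() or log_path.stat().st_size == 0:
    print("spikeinterface_log.json is missing or empty: the sorter likely crashed before writing it.")
else:
    with open(log_path) as f:
        log = json.load(f)  # an empty file raises JSONDecodeError: Expecting value (char 0)
    print(log)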

Thanks for your help :)

alejoe91 commented 9 months ago

What is the output before that? Can you also share the version of SpikeInterface that you are using?

Mymi-INS commented 9 months ago

What is written before is this:

Spike sorting launched at 2024-01-30 12:32:37.567109
mypath/SpikeInterface_output
Starting container
Installing spikeinterface==0.99.1 in spikeinterface/kilosort2_5-compiled-base
Running kilosort2_5 sorter inside spikeinterface/kilosort2_5-compiled-base
Stopping container
Traceback (most recent call last):

and I guess my SI version is 0.99.1
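
For reference, the installed SpikeInterface version can be confirmed directly from Python:

import spikeinterface
print(spikeinterface.__version__)  # e.g. 0.99.1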

alejoe91 commented 9 months ago

Did you notice the Stopping container happening right after the Running ... line?

Mymi-INS commented 9 months ago

I thought the "Stopping container" just meant that there was an error :') Is that not the case? (Sorry, I'm not that good at coding.)

alejoe91 commented 9 months ago

No, that means that the spike sorter either ran fine or it failed. Either way, the container in which the spike sorting runs is stopped.
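
One way to see how far the run got is to list what the container managed to write into the output folder before it was stopped; a missing or zero-byte spikeinterface_log.json suggests the sorter crashed inside the container. A small sketch, with an example folder name:

from pathlib import Path

# Example path: the output_folder that was passed to run_sorter.
output_folder = Path("SpikeInterface_output")

# Print every file the container wrote, with its size in bytes.
for p in sorted(output_folder.rglob("*")):
    if p.is_file():
        print(f"{p.relative_to(output_folder)}  ({p.stat().st_size} bytes)")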

Mymi-INS commented 9 months ago

How can I know why it stopped? Is that why my JSON is empty and why my code returns the "empty file" error?

Also, I don't understand why the error message is about the empty JSON and not about the fact that my container stopped.

alejoe91 commented 9 months ago

No, it should not be empty. Can you share your entire script, and possibly the data?

Mymi-INS commented 9 months ago

I'll put the script here in 30/40 min :) I hope you'll be able to make sense of it, because there is a lot of stuff in there. As for the data, it's way too heavy to share directly, but it's data recorded with Open Ephys (.dat), classical data obtained with silicon probes.

alejoe91 commented 9 months ago

Could you share it with google drive or something similar?

Mymi-INS commented 9 months ago

Hi again, here is my pipeline. I'll give you a piece of data tomorrow :) Thanks again for your help!

"""\
Launch the spike sorting algorithm with SpikeInterface.

Usage: in a console: python3 SI_Pipeline.py CONFIG DATE -arg1 ARG1 ... -argn ARGn

Type python3 SI_Pipeline.py -h for help on the possible arguments.
"""

import argparse
import glob
import json
import os
from datetime import datetime
from time import time

import numpy as np
import pandas as pd
import spikeinterface.full as si

def parse_arguments() -> argparse.Namespace:
    """
    Parse CLI arguments.

    Returns:
        Namespace: CLI arguments as an argparse Namespace.
    """

    parser = argparse.ArgumentParser()

    parser.add_argument("KS config", type=str, help="The name of the config to use.")
    parser.add_argument("date", type=str, help="The date of the recording, used to find the data (eg. 20231212")
    parser.add_argument("-abc", "--additional_bad_channels", type = str, default = "", help = "Additional bad channels. E.g: '17,23,45,69'. (Added to the list specified in the config file).")
    parser.add_argument("--do_correction", type = bool, default = None, help = "Whether drift registration is applied or not.")
    parser.add_argument("-dt", "--detect_threshold", type = float, default = None, help = "Threshold for spike detection.")
    parser.add_argument("--minFR", type = float, default = None, help = "Minimal firing rate required to keep a cluster.")
    parser.add_argument("--minfr_goodchannels", type = float, default = None, help = "Minimal firing rate on a 'good' channel required to keep a cluster.")
    parser.add_argument("-j", "--jobs", type = int, default = 160, help = "Number of subprocesses to use.")

    args = parser.parse_args()

    for arg in ["jobs", "detect_threshold", "minFR", "minfr_goodchannels"]:
        value = args.__dict__[arg]
        if value is not None:
            assert value >= 0, f"{arg} argument cannot be negative! (Current value: {value})"

    return args

def load_config(args: dict) -> dict:
    """
    Build the config dictionary from the rat config, the day's base config, the spike sorting config, and the CLI arguments.

    Args:
        args (dict): Input CLI arguments as a Python dict.

    Returns:
        dict: The config dict for this day.
    """

    # Load the spike sorting config
    KS_config_path = os.path.abspath(os.path.join(__file__, os.path.pardir, f"yourpath/KS_config_{args['KS config']}.json"))
    assert os.path.exists(KS_config_path), f"Kilosort config file {KS_config_path} not found. Please check if it exists."
    with open(KS_config_path) as f:
        KS_config = json.load(f)

    # Load the rat config
    rat_config_path = os.path.abspath(os.path.join(__file__, f"yourpath/config_{KS_config['rat']}.json"))
    print(rat_config_path)
    assert os.path.exists(rat_config_path), f"Rat config file {rat_config_path} not found. Please check if it exists."
    with open(rat_config_path) as f:
        rat_config = json.load(f)

    # Find the data path
    data_path_prefix_template = rat_config["data_path_prefix_template"]
    data_paths = glob.glob(os.path.join(data_path_prefix_template, f"{rat_config['rat_prefix']}_{args['date']}")) + glob.glob(os.path.join(data_path_prefix_template, "Training", f"{rat_config['rat_prefix']}_{args['date']}"))
    print(data_paths)
    assert len(data_paths) == 1, f"Expected exactly one data folder, found {len(data_paths)}."
    data_path = data_paths[0]

    # Load the day's base config
    base_config_path = os.path.join(data_path, f"{os.path.basename(data_path)}_config.json")
    print(base_config_path)
    assert os.path.exists(base_config_path), f"Base config file {base_config_path} not found. Please check if it exists."
    with open(base_config_path) as f:
        base_config = json.load(f)

    # Merge all the configs together (the order is important!)
    config = {**rat_config, **base_config, **KS_config}
    # config = rat_config | base_config | KS_config

    # Merging in CLI argument values
    for arg, value in args.items():
        if value is not None and arg not in ["KS config", "date"] \
          and not (arg == "jobs" and value == 160) \
          and not (arg == "additional_bad_channels" and value == ""):
            print(f"Using custom value for parameter '{arg}' : {value}")
        if value is not None:
            config[arg] = value

    return config

def run_cluster_cutting(config: dict):
    """
    Launch the spike sorting algorithm in docker.

    Args:
        config (dict): the day's config.
    """

    data_path = config["data_path"]
    assert os.access(data_path, os.W_OK), f"Directory {data_path} is not writable."
    KS_output_path = config["sorter_output_path"]

    # Define some sorting parameters
    params_kilosort2_5 = {"n_jobs": config["jobs"],
                          "do_correction": config["do_correction"],
                          "detect_threshold": config["detect_threshold"],
                          "minFR": config["minFR"],
                          "minfr_goodchannels": config["minfr_goodchannels"],
                          "nblocks" : 0}

    # Load raw data
    dat_file_path = os.path.join(config["data_path"], f"{os.path.basename(config['data_path'])}.dat")
    recording = si.read_binary(
        dat_file_path,
        num_channels=config["num_channels"],
        sampling_frequency=config["sampling_frequency"],
        dtype=config["recording_dtype"],
        gain_to_uV=config["recording_gain_to_uV"],
    )

    # Load and set electrode positions
    positions = np.load(os.path.join(data_path, f"{os.path.basename(data_path)}_channel_positions.npy"))
    recording.set_channel_locations(positions)

    # Remove bad channels
    additional_bad_channels = [int(c) for c in config["additional_bad_channels"].split(",") if c.strip()]
    bad_channels = set(config["bad_channels"]).union(additional_bad_channels + list(range(128, config["num_channels"])))
    recording_removed_channels = recording.remove_channels(list(bad_channels))

    # Pre-process the data
    rec = si.highpass_filter(recording_removed_channels, freq_min=config["highpass_freq"])
    rec = si.common_reference(rec, operator="median", reference="local", local_radius=(0, config["common_ref_local_radius"]))

    # Run the sorter
    t0 = time()
    print(f"Spike sorting launched at {datetime.fromtimestamp(t0)}")
    SI_output_path = os.path.abspath(os.path.join(KS_output_path, ".."))
    print(SI_output_path)
    _ = si.run_sorter("kilosort2_5", rec, output_folder=SI_output_path, verbose=True, docker_image=True, **params_kilosort2_5)

    # Change some phy parameters
    if os.access(KS_output_path, os.W_OK) and os.access(os.path.join(KS_output_path, "params.py"), os.W_OK):
        with open(os.path.join(KS_output_path, "params.py")) as params:
            filedata = params.read()

        filedata = filedata.replace("temp_wh.dat", "recording.dat")
        filedata = filedata.replace("hp_filtered = True", "hp_filtered = False")

        with open(os.path.join(KS_output_path, "params.py"), "w") as params:
            params.write(filedata)
    else:
        print(f"Sorter output folder writeable: {os.access(KS_output_path, os.W_OK)}")
        print(f"Phy params.py file writeable: {os.access(os.path.join(KS_output_path, 'params.py'), os.W_OK)}")

    # Display the running time of the script
    dt = time() - t0
    h = int(dt // 3600)
    m = int((dt - 3600 * h) // 60)
    s = int((dt - 3600 * h - 60 * m))

    print(f"Done (in {h}h{m}m{s}s), results stored in {KS_output_path}.")

    # Display some stats of the sorting
    labels = pd.read_csv(os.path.join(KS_output_path, "cluster_KSLabel.tsv"), sep="\t")
    label_counts = labels["KSLabel"].value_counts()
    n_good = label_counts.get("good", 0)
    n_mua = label_counts.get("mua", 0)

    print(f"Found {len(labels)} clusters ({n_good} goods, {n_mua} mua).")

if __name__ == "__main__":
    args = parse_arguments()
    config = load_config(args.__dict__)
    run_cluster_cutting(config)

Mymi-INS commented 9 months ago

Hi Alessio, I sent you an email with the link to the data :) Thanks again for your help!