AllenNeuralDynamics / aind-ephys-spikesort-kilosort25-full

CodeOcean capsule for full electrophysiology analysis pipeline using Kilosort2.5 via SpikeInterface.
MIT License

Ephys processing pipeline Kilosort2.5

Electrophysiology analysis pipeline using Kilosort2.5 via SpikeInterface.

The pipeline includes:

Usage

Input parameters

The run_capsule_*.py scripts in the code folder accept positional or optional arguments.

When using positional arguments, up to 7 arguments can be passed, in this STRICT order:

  1. "debug": Whether to run in DEBUG mode (false or true, default false)
  2. "concatenate": Whether to concatenate recordings/segments (false or true, default false)
  3. "denoising strategy": Which denoising strategy to use. Can be cmr (default) or destripe
  4. "remove out channels": Whether to remove out channels (false or true, default true)
  5. "remove bad channels": Whether to remove bad channels (false or true, default true)
  6. "max bad channel fraction": Maximum fraction of bad channels to remove. If more than this fraction, processing is skipped (default 0.5)
  7. "debug duration": Duration of clipped recording in debug mode. Default is 30 seconds. Only used if debug is enabled
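
The strict ordering above can be sketched as a small parser. This is only an illustration of how seven ordered values map to named options and defaults. The names RunOptions and parse_positional are hypothetical and are not taken from the run_capsule_*.py scripts.

```python
from dataclasses import dataclass


@dataclass
class RunOptions:
    # Defaults mirror the ones listed above (hypothetical container class).
    debug: bool = False
    concatenate: bool = False
    denoising: str = "cmr"
    remove_out_channels: bool = True
    remove_bad_channels: bool = True
    max_bad_channel_fraction: float = 0.5
    debug_duration: float = 30.0


def _to_bool(text: str) -> bool:
    """Convert the strings 'true'/'false' (case-insensitive) to a bool."""
    value = text.strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"expected 'true' or 'false', got {text!r}")
    return value == "true"


# One (option name, converter) pair per position, in the documented order.
_POSITIONAL_SPEC = [
    ("debug", _to_bool),
    ("concatenate", _to_bool),
    ("denoising", str),
    ("remove_out_channels", _to_bool),
    ("remove_bad_channels", _to_bool),
    ("max_bad_channel_fraction", float),
    ("debug_duration", float),
]


def parse_positional(args: list[str]) -> RunOptions:
    """Map up to 7 positional strings onto RunOptions; missing ones keep defaults."""
    if len(args) > len(_POSITIONAL_SPEC):
        raise ValueError("at most 7 positional arguments are accepted")
    options = RunOptions()
    for (name, convert), raw in zip(_POSITIONAL_SPEC, args):
        setattr(options, name, convert(raw))
    if options.denoising not in ("cmr", "destripe"):
        raise ValueError("denoising strategy must be 'cmr' or 'destripe'")
    return options
```

Because the mapping is purely positional, a value can only be set if all the values before it are also given, which is why the optional-argument form is usually more convenient.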

The scripts also support the same options as follows:

In addition, the scripts accept the following configuration parameters:

The NWB script also accepts the following parameter:

This parameter is required if multiple electrical series are available in the NWB file (otherwise an error is thrown with the available options).
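
The selection rule described above can be sketched in plain Python. This is a hedged illustration, not the NWB script's actual code: the function name select_electrical_series and its arguments are hypothetical, and the list of available series names is assumed to have been read from the NWB file already.

```python
from typing import List, Optional


def select_electrical_series(available: List[str], requested: Optional[str] = None) -> str:
    """Pick the electrical series to process (hypothetical helper).

    - If a name is requested, it must be one of the available series.
    - If nothing is requested and exactly one series exists, use it.
    - If nothing is requested and several exist, fail and list the options.
    """
    if not available:
        raise ValueError("no electrical series found in the NWB file")
    if requested is not None:
        if requested not in available:
            raise ValueError(
                f"electrical series {requested!r} not found; available options: {available}"
            )
        return requested
    if len(available) == 1:
        return available[0]
    raise ValueError(
        f"multiple electrical series found, please select one of: {available}"
    )
```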

NOTES ON PARAMETERS: If neither --params-file nor --params-str is specified, the default parameters are used (see the code/processing_params.json file).
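
A minimal sketch of this fallback follows. It makes two assumptions that the text above does not confirm: that --params-str carries a JSON string, and that passing both options at once is treated as an error. The function name load_processing_params is hypothetical.

```python
import json
from pathlib import Path

# Default location named in the note above, relative to the repo base folder.
DEFAULT_PARAMS_FILE = Path("code/processing_params.json")


def load_processing_params(params_file=None, params_str=None, default_file=DEFAULT_PARAMS_FILE):
    """Return the processing parameters as a dict (hypothetical helper).

    Assumptions: params_str is a JSON document, and the two options
    are mutually exclusive.
    """
    if params_file is not None and params_str is not None:
        raise ValueError("pass either a params file or a params string, not both")
    if params_str is not None:
        return json.loads(params_str)
    # Fall back to the default file when nothing was specified.
    path = Path(params_file) if params_file is not None else Path(default_file)
    with open(path) as f:
        return json.load(f)
```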

For example, one could run:

python run_capsule_*.py true false destripe true false 0.8 30

Or:

python run_capsule_*.py --debug --denoising destripe --no-remove-bad-channels \
                      --max-bad-channel-fraction 0.8 --debug-duration 30
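
For readers who want to see how such flags are typically wired up, here is a minimal argparse sketch covering only the flags that appear in the command above. The defaults follow the positional-argument list. The actual scripts may define these options differently, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags used in the example command (sketch only)."""
    parser = argparse.ArgumentParser(description="Ephys pipeline options (illustrative)")
    parser.add_argument("--debug", action="store_true", help="run in DEBUG mode")
    parser.add_argument(
        "--denoising", choices=["cmr", "destripe"], default="cmr",
        help="denoising strategy",
    )
    # store_false makes remove_bad_channels default to True and flip to False
    # when the flag is given.
    parser.add_argument(
        "--no-remove-bad-channels", dest="remove_bad_channels", action="store_false",
        help="keep bad channels",
    )
    parser.add_argument("--max-bad-channel-fraction", type=float, default=0.5)
    parser.add_argument("--debug-duration", type=float, default=30.0)
    return parser
```

Parsing the flags from the example command with this sketch yields debug=True, denoising="destripe", remove_bad_channels=False, max_bad_channel_fraction=0.8 and debug_duration=30.0.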

Results organization

The script produces the following output files in the results folder:

Notes on visualization

The processing pipeline assumes that FigURL is correctly set up. If you are planning to use this pipeline extensively, please consider providing your own cloud resources (see Create Kachery Zone).

Local deployment

This pipeline is currently used at AIND on the Code Ocean platform.

The main branch includes scripts and resources to run the pipeline locally. In particular, code/run_capsule_spikeglx.py is designed to run on SpikeGLX datasets, and code/run_capsule_nwb.py is designed to run on an NWB file.

First, let's clone the repo:

git clone https://github.com/AllenNeuralDynamics/aind-ephys-spikesort-kilosort25-full
cd aind-ephys-spikesort-kilosort25-full

Next, we need to move the dataset to analyze into the data folder. For example, we can download an NWB file from DANDI (e.g. this dataset) and move it to the data folder:

mkdir data
mv path-to-download-folder/sub-mouse412804_ses-20200803T115732_ecephys.nwb data

Finally, we can start the container (ghcr.io/allenneuraldynamics/aind-ephys-spikesort-kilosort25-full:latest) from the repo base folder (aind-ephys-spikesort-kilosort25-full):

chmod +x ./code/run_nwb
docker run -it --gpus all -v .:/capsule --shm-size 8G \
    --env KACHERY_ZONE --env KACHERY_CLOUD_CLIENT_ID --env KACHERY_CLOUD_PRIVATE_KEY \
    ghcr.io/allenneuraldynamics/aind-ephys-spikesort-kilosort25-full:latest

and run the pipeline:

cd /capsule/code
./run_nwb # + optional parameters (e.g., --debug)

NOTES ON DOCKER RUN:
The --gpus all flag is required to make the GPU available to the container (and Kilosort).
The --shm-size 8G flag is required to increase the shared memory size (default is 64M), which is used internally for parallel processing.
The -v .:/capsule option mounts the current folder . to the /capsule folder in the container, so that the data and scripts are available.
THE FOLDER IS NOT MOUNTED IN READ-ONLY MODE, so be careful when deleting files in the container.
The --env KACHERY_ZONE --env KACHERY_CLOUD_CLIENT_ID --env KACHERY_CLOUD_PRIVATE_KEY flags are required to set up the cloud visualization with FigURL (see Notes on visualization for more details).
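
Since --env NAME without a value forwards the variable from the host environment, the three Kachery variables must be set on the host before calling docker run. The following is a small, optional pre-flight check, written as a sketch. The helper name missing_kachery_vars is hypothetical and is not part of this repo.

```python
import os

# Variables forwarded to the container by the docker run command above.
REQUIRED_KACHERY_VARS = (
    "KACHERY_ZONE",
    "KACHERY_CLOUD_CLIENT_ID",
    "KACHERY_CLOUD_PRIVATE_KEY",
)


def missing_kachery_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KACHERY_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_kachery_vars()
    if missing:
        print("Set these variables before starting the container:", ", ".join(missing))
    else:
        print("All Kachery variables are set.")
```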

Code Ocean deployment

Use the aind branch for a Code Ocean-ready version.

The environment folder contains a Dockerfile to build the container with all required packages.

The code folder contains the scripts to run the analysis (run_capsule_aind.py).

The script assumes that the data in the data folder is organized as follows:

For instructions on local deployment, refer to the Local deployment section above.

Differences between main (local) and aind (Code Ocean) branches

Here is a list of the key changes that are needed:

1. Base Docker image

Code Ocean uses an internal registry of base Docker images. In the aind branch, the base Docker image in environment/Dockerfile is pulled from that registry, so it must be changed to run the same pipeline locally:

FROM registry.codeocean.allenneuraldynamics.org/codeocean/kilosort2_5-compiled-base:latest

2. Reading of the data

The first part of the code/run_capsule.py script deals with loading the data. This part is tailored to the way we store the data at AIND (see this section). In the main branch, we included two extra run_capsule_* scripts, one for SpikeGLX (run_capsule_spikeglx) and one for NWB files (run_capsule_nwb).

In both cases, we assume that the data folder includes a single dataset (either a SpikeGLX generated folder or a single NWB file).
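
The single-dataset assumption for the NWB case can be illustrated with a short check. This is a sketch under stated assumptions, not the code in run_capsule_nwb: the function name find_single_nwb_file is hypothetical, and it assumes the NWB file sits directly in the data folder with an .nwb extension.

```python
from pathlib import Path


def find_single_nwb_file(data_folder="data") -> Path:
    """Return the only .nwb file in data_folder, or raise if there is not exactly one."""
    nwb_files = sorted(Path(data_folder).glob("*.nwb"))
    if len(nwb_files) != 1:
        found = [p.name for p in nwb_files]
        raise ValueError(
            f"expected exactly one NWB file in {data_folder!r}, found {len(nwb_files)}: {found}"
        )
    return nwb_files[0]
```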

3. Metadata handling

At AIND, we use aind-data-schema to deal with metadata. The scripts in the main branch do not log metadata using aind-data-schema.