
Suite2p: fast, accurate and complete two-photon pipeline

Registration, cell detection, spike extraction and visualization GUI.

Algorithmic details are described in http://biorxiv.org/content/early/2016/06/30/061507.


This code was written by Marius Pachitariu, Carsen Stringer and members of the cortexlab (Kenneth Harris and Matteo Carandini). It is provided here with no warranty. For support, please open an issue directly on GitHub.

For the Python version of the code, go here. We recommend using the Python code: the documentation is better (see the wiki), and the graphical interface has more functionality. We will also stop updating the MATLAB version.

Examples

An example dataset (with master_file and make_db) is provided here.

I. Introduction

This is a complete, automated pipeline for processing two-photon calcium imaging recordings. It is simple, fast and yields a large set of active ROIs. A GUI further provides point-and-click capabilities for refining the results in minutes. The pipeline includes the following steps:

  1. X-Y subpixel registration: using a modification of the phase correlation algorithm and subpixel translation in the FFT domain. If a GPU is available, this step completes in 20 minutes per 1h of recordings at 30Hz and 512x512 resolution. (A minimal sketch of the phase-correlation idea is shown after this list.)

  2. SVD decomposition: this provides the input to cell detection and accelerates the algorithm.

  3. Cell detection: using clustering methods in a low-dimensional space. The clustering algorithm provides a positive mask for each ROI identified, and allows for overlaps between masks. There is also an option to perform automated red cell detection.

  4. Signal extraction: by default, all overlapping pixels are discarded when computing the signal inside each ROI, to avoid using "demixing" approaches, which can be biased. The neuropil signal is also computed independently for each ROI, as a weighted pixel average, pooling from a large area around each ROI, but excluding all pixels assigned to ROIs during cell detection.

  5. Spike deconvolution: cell and neuropil traces are further processed to obtain an estimate of spike times and spike "amplitudes". The amplitudes are proportional to the number of spikes in a burst/bin. Even under low SNR conditions, where transients might be hard to identify, the deconvolution is still useful for temporally localizing responses. The cell traces are baselined using the minimum of the (overly) smoothed trace.

  6. Neuropil subtraction: the subtraction coefficient is estimated iteratively together with the spike deconvolution, so as to minimize the deconvolution residual. The user is encouraged to also try varying this coefficient, to make sure that any scientific results do not depend crucially on it.

  7. Automatic and manual curation: the output of the cell detection algorithm can be visualized and further refined using the included GUI. The GUI is designed to make cell sorting a fun and enjoyable experience. It also includes an automatic classifier that gradually refines itself based on the manual labelling provided by the user. This allows the automated classifier to adapt for different types of data, acquired under different conditions. (README FOR GUI AT https://github.com/cortex-lab/Suite2P/blob/master/gui2P/README.md)
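To make step 1 concrete, below is a minimal, self-contained MATLAB sketch of rigid alignment by phase correlation. It is illustrative only, uses synthetic images, and recovers integer-pixel shifts; Suite2p's actual implementation additionally refines the shift to subpixel precision in the FFT domain.

% illustrative only, not Suite2p's implementation
ref   = rand(512, 512);                          % stand-in for the reference (mean) image
frame = circshift(ref, [3, -2]);                 % stand-in for a frame shifted by (3, -2) pixels
cc    = fft2(ref) .* conj(fft2(frame));          % cross-power spectrum
cc    = cc ./ (abs(cc) + 1e-6);                  % keep only the phase
cc    = real(ifft2(cc));                         % correlation map; its peak encodes the shift
[~, imax] = max(cc(:));
[dy, dx]  = ind2sub(size(cc), imax);
dy = dy - 1; dx = dx - 1;                        % convert to 0-based offsets
if dy > size(ref, 1)/2, dy = dy - size(ref, 1); end  % wrap to signed shifts
if dx > size(ref, 2)/2, dx = dx - size(ref, 2); end
aligned = circshift(frame, [dy, dx]);            % shifts the frame back onto the reference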

II. Getting started

The toolbox runs in MATLAB and currently only supports tiff file inputs. To begin using the toolbox, you will need to make local copies (in a separate folder) of two included files: master_file and make_db. It is important that you make local copies of these files, otherwise updating the repository will overwrite them (and you could lose your changes). The make_db file assembles a database of experiments that you would like to be processed in batch. It also adds session-specific information that the algorithm requires, such as the number of imaged planes and channels. The master_file sets general processing options that are applied to all sessions included in make_db, UNLESS the option is overridden in the make_db file.

Example database entry

Look into make_db_example for more detailed examples.

The following is a typical database entry in the local make_db file, which can be modelled after make_db_example. The folder structure assumed is RootStorage/mouse_name/date/expts(k) for all entries in expts(k).

i = i+1; 
db(i).mouse_name = 'M150329_MP009'; 
db(i).date = '2015-04-27'; 
db(i).expts = [5 6]; % which experiments to process together

Other (hidden) options are described in make_db_example.m, and at the top of run_pipeline.m (set to reasonable defaults).
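A db entry can also carry session-specific acquisition details. The field names below are given from memory and should be checked against make_db_example.m:

db(i).nplanes   = 4;   % number of imaged planes (assumed field name; see make_db_example.m)
db(i).nchannels = 2;   % number of channels per plane (assumed field name)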

Running the pipeline

Change the paths in master_file to point to your local toolbox and to your data, then run it. The master_file creates the ops0 and db0 variables and runs the main pipeline:

run_pipeline(db, ops);
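Before that call, the top of master_file sets the toolbox and data paths. A minimal illustrative snippet is below; the paths are placeholders, and the exact option names should be checked against master_file_example:

addpath(genpath('C:\CODE\Suite2P'));       % your local copy of this toolbox (placeholder path)
ops0.RootStorage     = 'D:\DATA';          % root folder of the raw tiffs (see section IV)
ops0.ResultsSavePath = 'D:\DATA\RESULTS';  % where the F_*.mat results are written
ops0.diameter        = 12;                 % expected ROI diameter in pixels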

Spike deconvolution

For spike deconvolution, you need to download the OASIS toolbox from GitHub (https://github.com/zhoupc/OASIS_matlab) and add the path to this folder on your computer to the top of your master_file:

addpath(genpath('pathtoOASIS'))

To run spike deconvolution (after running the pipeline), run

add_deconvolution(ops0, db);

For L0 spike deconvolution, you need to run mex -largeArrayDims SpikeDetection/deconvL0.c (or the .cpp file under Linux/Mac). If you're on Windows, you will need to install Visual Studio Community in order to compile mex files in MATLAB. To choose this deconvolution method, set

ops0.deconvType = 'L0';
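Putting the L0 pieces together: after compiling the mex file once, you can select the method and re-run the deconvolution step, e.g.

mex -largeArrayDims SpikeDetection/deconvL0.c   % compile once (.cpp on Linux/Mac)
ops0.deconvType = 'L0';                         % select the L0 method
add_deconvolution(ops0, db);                    % re-run deconvolution with these options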

For more information on choosing deconvolution methods and parameters, see this paper comparing spike deconvolution methods: http://www.biorxiv.org/content/early/2017/06/27/156786

You can also run spike deconvolution without running the entire pipeline by calling wrapperDECONV(ops,F,N), where F and N are the fluorescence and neuropil traces respectively, while ops specifies some deconvolution parameters like the sampling rate and the sensor decay timescale. See the function help for more information.
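A minimal usage sketch, assuming F and N are nROIs x nTimepoints matrices from the extraction step, and that the sampling-rate and sensor-timescale fields are named as below (check "help wrapperDECONV" for the exact field and output names):

ops = struct();
ops.fs        = 30;              % sampling rate per plane, in Hz (assumed field name)
ops.sensorTau = 2;               % decay timescale of the calcium sensor, in s (assumed field name)
sp  = wrapperDECONV(ops, F, N);  % deconvolved estimates; see the function help for all outputs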


Below we first describe the outputs of the pipeline, and then the options for setting it up and customizing it. Importantly, almost all options have pre-specified defaults. Any option specified in ops0 in master_file overrides these defaults. Furthermore, any option specified in the make_db file (experiment-specific) overrides both the defaults and the options set in master_file. This allows different experiments to be processed with different options. The only critical option that you need to set is ops0.diameter, or db(N).diameter. This tells the algorithm the scale of the recording, i.e. the size of the ROIs you are trying to extract. We recommend a first run of the pipeline with only the diameter option set; depending on the results, you can come back and adjust some of the other options.
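For example, the same option set at both levels behaves like this:

ops0.diameter  = 12;   % in master_file: default for every session in db
db(i).diameter = 8;    % in make_db: overrides ops0.diameter for this session only
% precedence: built-in defaults < ops0 (master_file) < db(i) (make_db)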

Note: some of the options are not specified in either the example master_file or the example make_db file. These are usually more specialized features.

III. Outputs

The output is a struct called dat, which is saved into a mat file in ResultsSavePath using the same subfolder structure, under a name formatted like F_M150329_MP009_2015-04-29_plane1. It contains all the information collected throughout the processing: the ROI and neuropil traces are in Fcell and FcellNeu, and whether each ROI j is a cell or not is in stat(j).iscell. stat(j) holds information about ROI j and can be used to recover its pixels via stat(j).ipix; the centroid of the ROI is stored in stat as well. Here is a summary of where the important results are:

cell traces are in Fcell
neuropil traces are in FcellNeu
deconvolved traces are in sp
manual, GUI-overwritten "iscell" labels are in stat.iscell
stat(icell) contains all other per-ROI information

Each cell of the Fcell, FcellNeu and sp cell arrays corresponds to a different experiment from db.expts.

There are fields for red cell detection too (see the section on identifying red cells below).

The settings used for registration and the mean image are also saved in the ops structure.
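As a sketch of how the results might be read back in (field layout as described above; depending on your version, Fcell etc. may instead be saved as top-level variables rather than inside dat, and the matrix orientation is assumed to be one row per ROI):

load('F_M150329_MP009_2015-04-29_plane1.mat');   % loads the results, including dat
iscell = logical([dat.stat.iscell]);             % curated cell / not-cell labels
F      = dat.Fcell{1}(iscell, :);                % ROI traces for experiment 1 (orientation assumed)
Fneu   = dat.FcellNeu{1}(iscell, :);             % matching neuropil traces
spks   = dat.sp{1}(iscell, :);                   % deconvolved traces (after add_deconvolution)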

IV. Input-output file paths

ResultsSavePath is extended with separate subfolders per animal and experiment, as specified in the make_db file. Your data should be stored under a file tree of the form

\RootStorage\mouse_name\session\block*.tif(f)

If you don't want to use this folder structure, see the make_db_example file for alternatives. The make_db_example file also shows how to group together tiffs from different experiments (i.e. different subfolders within this folder structure).
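With the example database entry above and illustrative root paths, the expected layout would be roughly the following (paths and exact results subfolders are placeholders; check your own ops settings):

% with RootStorage = 'D:\DATA' and the db entry above, Suite2p expects e.g.
%   D:\DATA\M150329_MP009\2015-04-27\5\block1.tif
%   D:\DATA\M150329_MP009\2015-04-27\6\block1.tif
% and (with ResultsSavePath = 'D:\DATA\RESULTS') writes results such as
%   D:\DATA\RESULTS\M150329_MP009\2015-04-27\F_M150329_MP009_2015-04-27_plane1.mat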

The contents of the output files themselves (the dat struct, with Fcell, FcellNeu, sp and stat) are described in section III above.

V. Options

Registration

Block Registration (for high zoom/npixels)

Bidirectional scanning issues (frilly cells - default is to correct)

Recordings with red channel

The output ops.mimgRED will contain the red-channel mean image (if AlignToRedChannel, redMeanImg, or REDbinary = 1).

Splitting large tiffs for registration, if you are running out of memory (e.g. with 2048 x 2048 pixel images): this currently only works with rigid registration, where each section of the FOV is registered separately.

Cell detection

SVD decomposition

Signal extraction

Neuropil options

if using surround neuropil (signalExtraction = 'surround')

Spike deconvolution

Identifying red cells

Use the function identify_redcells_sourcery(db, ops0) to identify cells expressing red fluorescence.

Outputs are appended to stat in the F_*.mat file.

Options are

Measures used by classifier

The Suite2p classifier uses a number of features of each ROI to assign cell labels to ROIs. The classifier uses a naive Bayes approach for each feature, and models the distribution of each feature with a non-parametric, adaptively binned empirical distribution. The classifier is initialized with some standard distributions for these features, but is updated continuously with new data samples as the user refines the output manually in the GUI.

The features used are the following (you can see the value of each feature for an ROI by selecting it in the GUI).
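To illustrate the naive Bayes idea, here is a toy MATLAB sketch (this is not the actual Suite2p classifier; it uses one made-up feature, synthetic labels and fixed rather than adaptive bins):

% toy illustration of per-feature empirical likelihoods, combined naively
edges = linspace(0, 1, 21);                     % fixed bins; Suite2p bins adaptively
featCell    = rand(500, 1);                     % one feature, from ROIs labelled "cell"
featNotCell = 0.5 * rand(300, 1);               % same feature, from "not cell" ROIs
pCell    = histcounts(featCell, edges)    + 1;  % binned counts with add-one smoothing
pCell    = pCell / sum(pCell);
pNotCell = histcounts(featNotCell, edges) + 1;
pNotCell = pNotCell / sum(pNotCell);
x     = 0.3;                                    % feature value of a new ROI
b     = discretize(x, edges);                   % which bin it falls into
logLR = log(pCell(b)) - log(pNotCell(b));       % evidence from this one feature
% summing logLR over all features (plus a log prior) gives the classifier score;
% ROIs with a positive score are labelled as cells.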