This repository describes how to run a pySCENIC gene regulatory network inference analysis alongside a basic "best practices" expression analysis for single-cell data. This includes:
See also the associated publication in Nature Protocols: https://doi.org/10.1038/s41596-020-0336-2.
For an advanced implementation of the steps in this protocol, see VSN Pipelines, a Nextflow DSL2 implementation of pySCENIC with comprehensive and customizable pipelines for expression analysis. This includes additional pySCENIC features (multi-runs, integrated motif- and track-based regulon pruning, loom file generation).
We recommend using this notebook as a template for running an interactive analysis in Jupyter. See the installation instructions for information on setting up a kernel with pySCENIC and other required packages.
The following tools are required to run the steps in this Nextflow pipeline:
The following container images will be pulled by nextflow as needed:
A quick test can be run using the test profile, which automatically pulls the testing dataset (described in full below):
nextflow run aertslab/SCENICprotocol \
-profile docker,test
This small test dataset takes approximately 70 s to run using 6 threads on a standard desktop computer.
Alternatively, the same data can be run with a more explicit approach, which better illustrates how to substitute other data into the pipeline. First, download a minimal set of SCENIC database files for a human dataset (approximately 78 MB):
mkdir example && cd example/
# Transcription factors:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_tiny.txt
# Motif to TF annotation database:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/motifs.tbl
# Ranking databases:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/genome-ranking.feather
# Finally, get a tiny sample expression matrix (loom format):
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat_tiny.loom
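Before launching the pipeline, it can help to confirm that all four downloads succeeded. The following is a minimal sketch, not part of the protocol itself: the function name check_inputs is hypothetical, and the file names are assumed to be the ones wget saves from the URLs above.

```shell
# Hypothetical helper: report which of the expected input files are
# missing or empty in the given directory (default: current directory).
check_inputs() {
    dir="${1:-.}"
    missing=0
    for f in test_TFs_tiny.txt motifs.tbl genome-ranking.feather expr_mat_tiny.loom; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $f"
            missing=1
        fi
    done
    # Return status 0 when every file is present, 1 otherwise
    return "$missing"
}
```

Calling `check_inputs . && echo "all inputs present"` from inside the example/ directory prints the confirmation only when every file is in place.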
Either Docker or Singularity images can be used by specifying the appropriate profile (-profile docker or -profile singularity).
Note that the default thresholds need to be lowered for the tiny test dataset to run successfully (here, with --thr_min_genes 1):
nextflow run aertslab/SCENICprotocol \
-profile docker \
--loom_input expr_mat_tiny.loom \
--loom_output pyscenic_integrated-output.loom \
--TFs test_TFs_tiny.txt \
--motifs motifs.tbl \
--db *feather \
--thr_min_genes 1
By default, this pipeline uses the container specified by the --pyscenic_container parameter.
This is currently set to aertslab/pyscenic:0.9.19, a container with both pySCENIC and Scanpy 1.4.4.post1 installed.
A custom container can be used (e.g. one built on a local machine) by passing the name of this container to the --pyscenic_container parameter.
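As an illustration, the container override could be wrapped as shown below. This is a sketch under stated assumptions: the function name run_scenic, the DRY_RUN variable, and the image name my-pyscenic:local are all hypothetical, while the remaining arguments are the ones from the tiny test run above.

```shell
# Sketch only: run_scenic, DRY_RUN and "my-pyscenic:local" are hypothetical names.
run_scenic() {
    # Fall back to the default image named in the text when no argument is given
    image="${1:-aertslab/pyscenic:0.9.19}"
    # Build the command as the positional parameters so it can be printed or run.
    # The unquoted *feather glob is expanded by the shell in the current directory.
    set -- nextflow run aertslab/SCENICprotocol \
        -profile docker \
        --loom_input expr_mat_tiny.loom \
        --loom_output pyscenic_integrated-output.loom \
        --TFs test_TFs_tiny.txt \
        --motifs motifs.tbl \
        --db *feather \
        --thr_min_genes 1 \
        --pyscenic_container "$image"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"         # run nextflow with the assembled arguments
    fi
}
```

With DRY_RUN set to 1, calling `run_scenic my-pyscenic:local` prints the full command for review. Without it, the same call launches the pipeline using the custom image.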
The output of this pipeline is a loom-formatted file (by default: output/pyscenic_integrated-output.loom) containing: