Single cell transcriptome analysis pipeline based on DropEst
Indrop-Flow

Indrops analysis pipeline at BioCore@CRG

The pipeline is based on the DropEst tool.


  1. Install Docker or Singularity.
  2. git clone; cd indrop
  3. sh for checking Nextflow and installing bioNextflow

Running the pipeline

The parameters are listed by the nextflow run --help command:

nextflow run --help
N E X T F L O W  ~  version 19.07.0
Launching `` [gigantic_ride] - revision: 17bd0ef49f
BIOCORE@CRG indropSEQ - N F  ~  version 1.0
pairs                         : {PATH}/*R{1,2,3,4}.fastq.gz
genome                        : {PATH}/anno/test.fa.gz
annotation                    : {PATH}/anno/gencode.v28.annotation.gtf
config                        : {PATH}/conf/indrop_v3.xml
barcode_list                  : {PATH}/conf/indrop_v3_barcodes.txt
email                         : yourmail@yourdomain
mtgenes                       : {PATH}/anno/mitoc_genes.txt
version                       : 3_4
library_tag                   : AGATATAA
output (output folder)        : output_v3

You can change them either on the command line:

nextflow run --pairs "data/{1,2}.fastq.gz" --version 1-2 > log

or by editing the params.config file. You can use the Nextflow options to send the execution to the background (-bg) or to resume a failed run (-resume):

nextflow run --pairs "data/{1,2}.fastq.gz" --version 1-2 -bg -resume > log

Indrop versions v1, v2 and v3 are supported.

Version 1 and 2

Parameter version: "V1-2"

Version 3

Parameter version: "V3_3"

Parameter version: "V3_4"

The parameter library_tag is only needed with version V3_4.
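
The rule above can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: the function name check_version and the constant SUPPORTED_VERSIONS are hypothetical, and it assumes that the version may be written with or without the leading "V" (the help output shows 3_4, this section shows V3_4).

```python
# Hypothetical pre-flight check; the pipeline itself may validate differently.
SUPPORTED_VERSIONS = {"1-2", "3_3", "3_4"}


def check_version(version, library_tag=None):
    """Return the version without a leading 'V', or raise ValueError."""
    normalized = version.strip()
    # Assumption: "V3_4" and "3_4" refer to the same protocol version.
    if normalized[:1] in ("V", "v"):
        normalized = normalized[1:]
    if normalized not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {version!r}")
    # library_tag is only needed with version 3_4.
    if normalized == "3_4" and not library_tag:
        raise ValueError("library_tag is required with version 3_4")
    return normalized


print(check_version("V1-2"))
print(check_version("3_4", library_tag="AGATATAA"))
```

A call such as check_version("V3_4") without a library tag raises ValueError, which makes the missing parameter visible before any job is submitted.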


Parameters are specified within the params.config file.

The pipeline

  1. QC: It runs FastQC on the raw reads. The results are stored in the QC folder.
  2. Indexing: It builds the genome index by using STAR.
  3. dropTag: It creates a "tagged" fastq file in which each read header carries information about the single cell the read originated from.
  4. Alignment: It aligns the tagged reads to the indexed genome by using STAR. Results are stored in the Alignments folder.
  5. dropEst: It estimates the read counts per gene per single cell. The results are in the Estimated_counts folder and consist of an R data object, a file with a list of cells (aka barcode combinations), another with a list of genes, and a matrix in Matrix Market format.
  6. dropReport: It reads the R data object produced by the dropEst step to produce a quality report. It needs a list of mitochondrial genes.
  7. multiQC: It combines the QC from FastQC and the STAR mapping in a single output.
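
The count matrix from step 5 can be inspected outside R. The following is a minimal sketch that parses a coordinate-format Matrix Market stream with the Python standard library and sums the counts per cell. The function name read_matrix_market and the toy matrix are made up for illustration, and the orientation (rows are genes, columns are cells) is an assumption that should be checked against the gene and cell lists in Estimated_counts.

```python
from io import StringIO


def read_matrix_market(handle):
    """Parse a coordinate-format Matrix Market stream.

    Returns (n_rows, n_cols, entries), where entries maps 0-based
    (row, col) pairs to values. Only valued entries are handled.
    """
    header = handle.readline().lower().split()
    if len(header) < 5 or header[0] != "%%matrixmarket" or header[2] != "coordinate":
        raise ValueError("expected a coordinate-format Matrix Market header")
    # Skip comment lines (starting with '%') and blank lines before the size line.
    line = handle.readline()
    while line and (line.startswith("%") or not line.strip()):
        line = handle.readline()
    if not line:
        raise ValueError("missing size line")
    n_rows, n_cols, n_entries = (int(x) for x in line.split())
    entries = {}
    for line in handle:
        if not line.strip():
            continue
        row, col, value = line.split()
        # Matrix Market indices are 1-based; convert them to 0-based.
        entries[(int(row) - 1, int(col) - 1)] = float(value)
    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


# Toy input, assuming rows are genes and columns are cells.
example = StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "% toy matrix: 3 genes x 2 cells\n"
    "3 2 4\n"
    "1 1 5\n"
    "2 1 1\n"
    "3 2 7\n"
    "1 2 2\n"
)
n_genes, n_cells, counts = read_matrix_market(example)

# Total counts per cell (column sums).
totals = [0.0] * n_cells
for (gene, cell), value in counts.items():
    totals[cell] += value
print(totals)
```

For real data, a sparse-matrix library is a better fit than a dictionary; the sketch only shows how the file is laid out.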