bioturing / hera-t

Hera-T, a fast and accurate tool for estimating gene abundances in single cell data generated by the 10X-Chromium protocol


We introduce Hera-T, a fast and accurate tool for estimating gene abundances in single cell data generated by the 10X-Chromium protocol. By devising a new strategy for aligning reads to both transcriptome and genome references, Hera-T reduces both running time and memory consumption by 10- to 100-fold while giving results similar to CellRanger's. Hera-T also handles some difficult splicing alignment scenarios that CellRanger fails to address, and therefore obtains better accuracy than CellRanger. When the reads in those scenarios are excluded, the Hera-T and CellRanger results have correlation scores > 0.99.


Hera-T is distributed under the BioTuring License. See the LICENSE file for details.

Pre-built indexes


sh ./


Usage: ./hera-T count [options] -x <idx_name> -1 <R1> -2 <R2>
-t  : Number of threads
-o  : Output directory name
-p  : Output file prefix
-l  : Library types
        0: 10X-Chromium 3' (v2) protocol
        1: 10X-Chromium 3' (v3) protocol
Example: ./hera-T count -t 32 -o ./result -x index/grch37 -l 0 -1 lane_0.read_1.fq lane_1.read_1.fq -2 lane_0.read_2.fq lane_1.read_2.fq
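
When a sample is split across several lanes, the `-1` and `-2` lists must name the mates in the same order. The following is a minimal sketch of one way to assemble the command, not part of Hera-T itself. The helper name `build_hera_cmd` is hypothetical, and it assumes that each R2 file has the same name as its R1 mate with `_R1_` replaced by `_R2_`. It only prints the command so it can be reviewed before running.

```bash
#!/usr/bin/env bash
# Sketch: assemble a "hera-T count" command from a list of R1 FASTQ files.
# Assumption (not from the Hera-T docs): the R2 mate of each file is found
# by replacing "_R1_" with "_R2_" in the R1 file name.

build_hera_cmd() {
    local index="$1" libtype="$2" outdir="$3" threads="$4"
    shift 4

    # The usage text lists only two library types: 0 (3' v2) and 1 (3' v3).
    case "$libtype" in
        0|1) ;;
        *) echo "library type must be 0 (v2) or 1 (v3)" >&2; return 1 ;;
    esac

    local r1_files=("$@")
    local r2_files=()
    local r1
    for r1 in "${r1_files[@]}"; do
        # Only the first "_R1_" occurrence in the path is replaced.
        r2_files+=("${r1/_R1_/_R2_}")
    done

    # Print the command instead of running it.
    echo "./hera-T count -t ${threads} -o ${outdir} -x ${index} -l ${libtype}" \
         "-1 ${r1_files[*]} -2 ${r2_files[*]}"
}

# Example call with made-up file names:
build_hera_cmd index/grch37 0 ./result 32 \
    sample_S1_L001_R1_001.fastq.gz sample_S1_L002_R1_001.fastq.gz
```

The printed command keeps R1 and R2 files in matching lane order. File names containing spaces, or a directory name that itself contains `_R1_`, would need extra handling.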

Example run

1k Brain Cells from an E18 Mouse (v2 chemistry)

Download link:

~ » ls -lah cr_mm10_210/*
-rw-rw-r--@ 1 bioturing  staff   2.5G Nov 14  2018 cr_mm10_210/cr_mm10_210.bwt
-rw-rw-r--@ 1 bioturing  staff   176M Nov 14  2018 cr_mm10_210/cr_mm10_210.fasta
-rw-rw-r--@ 1 bioturing  staff   1.8G Nov 14  2018 cr_mm10_210/cr_mm10_210.hash
-rw-rw-r--@ 1 bioturing  staff   862M Nov 14  2018 cr_mm10_210/
-rw-rw-r--@ 1 bioturing  staff   356B Nov 14  2018 cr_mm10_210/cr_mm10_210.log

~ » ./hera-T count -t 32 -o tmp -x cr_mm10_210/cr_mm10_210 \
           -l 0 \
           -1 neuron_1k_v2_fastqs/neuron_1k_v2_S1_L001_R1_001.fastq.gz \
              neuron_1k_v2_fastqs/neuron_1k_v2_S1_L002_R1_001.fastq.gz \
           -2 neuron_1k_v2_fastqs/neuron_1k_v2_S1_L001_R2_001.fastq.gz \


Hera-T is developed and maintained at BioTuring INC. by:


Thang Tran, Thao Truong, Hy Vuong, Son Pham, “Hera-T: an efficient and accurate approach for quantifying gene abundances from 10X-Chromium data with high rates of non-exonic reads”, bioRxiv, 2019 doi:

How to get help

The preferred way to report problems or ask questions about Hera-T is the issue tracker. Before posting an issue or question, please look through the FAQs and the existing issues (open and closed); your question may already have been answered.

If you are reporting a problem, please include the HeraT.log file and, if possible, some details about your dataset.

In case you prefer personal communication, please send an email to

Change logs

2018-12-24 (0.1.2) (deprecated):

* Init repo

2018-12-25 (0.1.3) (deprecated):

* Add library types selection
* Write program description to matrix.mtx file

2018-12-27 (0.1.4) (deprecated):

* Fix memory leak in version 0.1.3

2018-12-28 (0.2.0) (deprecated):

* Support Chromium 3' v3 library

2019-03-20 (0.2.1) (release candidate):

* Fix random crash (change from buggy semaphore to lock)

2019-03-25 (0.2.2) (release candidate):

* Fix opening all files at once