gimelbrantlab / ASEReadCounter_star

Preprocessing sequencing data for allele-specific analysis
GNU General Public License v3.0

ASEReadCounter* - preprocessing sequencing data for allele-specific analysis

This pipeline takes RNA-seq (or similar) data and produces a table of total allelic counts per gene (or other genomic interval). That table serves as input for downstream analysis of allelic imbalance with Qllelic.

This is a re-implementation of the ASEReadCounter tool from GATK, based on the allelecounter scripts by S. Castel.

The pipeline consists of two main parts:

  1. Reference preparation

    Construct individual "paternal" and "maternal" genome references and create a VCF of heterozygous variants.

  2. Creation of tables with allelic counts

    • map sequencing reads to the references (using the STAR aligner; see the complete list of dependencies)
    • randomly sample the mapped reads to a defined depth (a key step for overdispersion analysis in Qllelic)
    • count the number of reads mapping to the reference or alternate allele at each heterozygous SNP, and collate the counts over genomic intervals (e.g., genes or other features).
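
The sampling and counting steps above can be illustrated with a small, self-contained Python sketch. It is not the pipeline's code: the function names (`subsample`, `count_at_snps`, `collate_by_interval`), the tuple layouts, and the toy data are all assumptions made for illustration, and reads are modeled as ungapped alignments with 0-based coordinates. In this toy version a read that spans two SNPs in the same interval contributes to both; the actual pipeline may handle that case differently.

```python
import random
from collections import defaultdict


def subsample(reads, depth, seed=0):
    """Keep `depth` reads chosen uniformly at random, without replacement."""
    reads = list(reads)
    if depth >= len(reads):
        return reads  # fewer reads than requested: keep them all
    return random.Random(seed).sample(reads, depth)


def count_at_snps(reads, snps):
    """Count reads carrying the ref or alt base at each heterozygous SNP.

    reads: (chrom, start, sequence) tuples, 0-based start, ungapped (assumption).
    snps:  (chrom, pos, ref, alt) tuples, 0-based pos (assumption).
    """
    counts = {snp: [0, 0] for snp in snps}
    for chrom, start, seq in reads:
        for snp in snps:
            s_chrom, pos, ref, alt = snp
            if s_chrom != chrom or not (start <= pos < start + len(seq)):
                continue  # read does not overlap this SNP
            base = seq[pos - start]
            if base == ref:
                counts[snp][0] += 1
            elif base == alt:
                counts[snp][1] += 1
            # any other base is ignored
    return counts


def collate_by_interval(snp_counts, intervals):
    """Sum per-SNP counts over intervals (name, chrom, start, end), end exclusive.

    Intervals that contain no SNP are omitted from the result.
    """
    table = defaultdict(lambda: [0, 0])
    for (chrom, pos, _ref, _alt), (n_ref, n_alt) in snp_counts.items():
        for name, i_chrom, start, end in intervals:
            if i_chrom == chrom and start <= pos < end:
                table[name][0] += n_ref
                table[name][1] += n_alt
    return {name: tuple(c) for name, c in table.items()}


# Toy data (hypothetical)
reads = [
    ("chr1", 100, "ACGTA"),
    ("chr1", 101, "CGTAC"),
    ("chr1", 102, "TTACG"),
    ("chr1", 200, "GGGCC"),
]
snps = [("chr1", 102, "G", "T"), ("chr1", 203, "C", "A")]
genes = [("geneA", "chr1", 90, 150), ("geneB", "chr1", 190, 250)]

sampled = subsample(reads, depth=4)  # depth equals the read count, so nothing is dropped
table = collate_by_interval(count_at_snps(sampled, snps), genes)
print(table)  # {'geneA': (2, 1), 'geneB': (1, 0)}
```

Each value is a `(reference, alternate)` pair of read counts for that interval. Fixing the seed makes the subsample reproducible; drawing several subsamples with different seeds would give replicate tables at the same depth.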

Manuals and worked examples are available on the Wiki page of this repository.


Installation

Clone this repository to your local machine. No additional installation is needed. Information about tool prerequisites is available on the Wiki page.

Citations

If you use our pipeline in your work, please cite "Unexpected variability of allelic imbalance estimates from RNA sequencing", Mendelevich A., Vinogradova S., Gupta S., Mironov A., Sunyaev S., Gimelbrant A.

Reporting bugs

Please report bugs on the GitHub issues page.

License

GNU General Public License v3.0