xryanglab / cscMap

cscMap is a bioinformatics pipeline that searches for RNA chimeras resulting from fusions of transcripts encoded by the two opposite DNA strands.

cscMap

1. Python dependency

cscMap is written in Python 2.7.16 and depends on several popular Python libraries.

It also depends on several bioinformatics packages.

TopHat and Bowtie are used to align reads to the genome and transcriptome. RSeQC (bam2wig.py), BEDOPS (sam2bed), and bedtools (intersect) are used for file format conversion.
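Since the pipeline calls these external programs, it can help to confirm that they are reachable before starting a run. The following is a minimal sketch, not part of cscMap itself. The executable names `tophat` and `bowtie` are assumptions (an installation may expose them under other names, such as `tophat2`), and the helper functions `find_on_path` and `missing_tools` are hypothetical names introduced only for this illustration.

```python
import os

# Assumed executable names; adjust them to match the local installation.
REQUIRED_TOOLS = ["tophat", "bowtie", "bam2wig.py", "sam2bed", "bedtools"]


def find_on_path(name, path=None):
    """Return the full path of an executable found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the names in `tools` that could not be found on PATH."""
    return [tool for tool in tools if find_on_path(tool, path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: %s" % ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The search is written by hand with `os` only so that the same sketch runs under both Python 2.7 and Python 3.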

2. Reference files

The human (hg19) and mouse (mm10) reference genomes were downloaded from the UCSC Genome Browser, and the reference genomes of the other species were obtained from the Ensembl Genome Browser (zebrafish: GRCz11, C. elegans: WBcel235, fruit fly: BDGP6, Saccharomyces: R64-1-1, E. coli: Escherichia_coliK-12_substr.MG1655).

The genome annotation files for human and mouse were obtained from GENCODE (human: v19, mouse: vM17), and the annotation files for all other species were obtained from the Ensembl Genome Browser.
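For scripting purposes, the reference genomes and annotations listed above can be collected in a small lookup table. This is only an illustrative sketch: the dictionary layout, the key names, and the `describe_reference` helper are hypothetical and are not part of cscMap. The assembly and annotation names are copied from the text above.

```python
# Species -> reference information, as listed in this section.
REFERENCES = {
    "human": {"assembly": "hg19", "genome": "UCSC", "annotation": "GENCODE v19"},
    "mouse": {"assembly": "mm10", "genome": "UCSC", "annotation": "GENCODE vM17"},
    "zebrafish": {"assembly": "GRCz11", "genome": "Ensembl", "annotation": "Ensembl"},
    "C. elegans": {"assembly": "WBcel235", "genome": "Ensembl", "annotation": "Ensembl"},
    "fruit fly": {"assembly": "BDGP6", "genome": "Ensembl", "annotation": "Ensembl"},
    "Saccharomyces": {"assembly": "R64-1-1", "genome": "Ensembl", "annotation": "Ensembl"},
    "E. coli": {
        "assembly": "Escherichia_coliK-12_substr.MG1655",
        "genome": "Ensembl",
        "annotation": "Ensembl",
    },
}


def describe_reference(species):
    """Return a one-line summary of the reference files used for a species."""
    if species not in REFERENCES:
        raise ValueError("No reference listed for species: %s" % species)
    ref = REFERENCES[species]
    return "%s: genome %s (%s), annotation from %s" % (
        species, ref["assembly"], ref["genome"], ref["annotation"])
```

Calling `describe_reference("mouse")` returns the string `mouse: genome mm10 (UCSC), annotation from GENCODE vM17`.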

The pipeline also requires some extra files, for example hg19.chrom.sizes and hg19.fa.out.bed for human, which can be obtained from the UCSC Genome Browser (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/).
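As a quick sanity check on one of these extra files, the sketch below reads a chrom.sizes file into a dictionary. A UCSC chrom.sizes file has one line per sequence, holding the sequence name and its length in bases. The function names `chrom_sizes_url` and `read_chrom_sizes` are hypothetical, and the assumption that the file sits directly under the `bigZips` directory as `<assembly>.chrom.sizes` should be verified for assemblies other than hg19.

```python
# Directory pattern taken from the URL quoted above.
UCSC_BIGZIPS = "http://hgdownload.soe.ucsc.edu/goldenPath/{assembly}/bigZips/"


def chrom_sizes_url(assembly="hg19"):
    """Build the assumed download URL of <assembly>.chrom.sizes."""
    return UCSC_BIGZIPS.format(assembly=assembly) + assembly + ".chrom.sizes"


def read_chrom_sizes(path):
    """Parse a chrom.sizes file into a {sequence name: length} dictionary."""
    sizes = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            name, length = line.split()[:2]
            sizes[name] = int(length)
    return sizes
```

After downloading the file, `read_chrom_sizes("hg19.chrom.sizes")` gives a dictionary that can be used to confirm that the chromosome names match those in the genome FASTA and annotation files.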

3. Run

Copyright and license

Copyright (c) 2020 Yuting Wang. The cscMap code is licensed under THU.