Closed caleblareau closed 6 years ago
misc. gsnap
Commands that seem to give the right answer:
gmap_build -d hg19chrM hg19.fasta -c chrM
gsnap --gunzip -D /apps/lab/aryee/gmap-2018-02-12/share -d hg19chrM miseq/fastq/Bulk1_R1.fastq.gz miseq/fastq/Bulk1_R2.fastq.gz -A sam | samtools view -Sbh - | samtools sort -@ 4 - -o miseq_bulk1.gsnap.bam
https://github.com/juliangehring/GMAP-GSNAP http://research-pub.gene.com/gmap/
XC: Indicates whether the alignment crosses over the origin of a circular chromosome. If so, the string XC:A:+ is printed.
samtools view miseq_bulk1.gsnap.bam | grep XC:A:+
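The same check can be sketched in Python, matching `XC:A:+` only as a whole optional field rather than as a substring anywhere in the line. This is a rough sketch: the function names and the toy records are made up for illustration and are not part of gsnap or samtools.

```python
def crosses_origin(sam_line: str) -> bool:
    """Return True if the SAM record carries the optional field XC:A:+."""
    fields = sam_line.rstrip("\n").split("\t")
    # The 11 mandatory SAM columns come first; optional TAG:TYPE:VALUE
    # fields start at column 12 (index 11).
    return "XC:A:+" in fields[11:]


def count_crossing(sam_lines) -> int:
    """Count alignment records (header lines skipped) flagged with XC:A:+."""
    return sum(
        1
        for line in sam_lines
        if not line.startswith("@") and crosses_origin(line)
    )


# Hand-written toy records, not real gsnap output.
header = "@SQ\tSN:chrM\tLN:100"
rec_cross = "\t".join(
    ["r1", "0", "chrM", "90", "40", "*", "*", "0", "0", "*", "*", "NM:i:0", "XC:A:+"]
)
rec_plain = "\t".join(
    ["r2", "0", "chrM", "10", "40", "*", "*", "0", "0", "*", "*", "NM:i:0"]
)
n_crossing = count_crossing([header, rec_cross, rec_plain])
print(n_crossing)
```

In practice the lines would come from the text output of `samtools view`, for example by iterating over `sys.stdin`.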
Implemented in gsnap, where there is now documentation in the documents.
The mitochondrial DNA is circular (plasmid-like).
Basically, we need a workflow that creates a surrogate second mitochondrial chromosome that joins, say, the last 50 bp of the chromosome to the first 50 bp. This junction sequence should be made into its own chromosome. Then, a new reference genome build for the favorite tool has to be made.
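A minimal sketch of the surrogate-chromosome step, assuming the mitochondrial sequence is already available as a plain string. The function names, the `chrM_junction` contig name, and the toy sequence are illustrative assumptions, not anything mgatk or GMAP provides.

```python
def build_junction(seq: str, flank: int = 50) -> str:
    """Return the last `flank` bases followed by the first `flank` bases."""
    if flank <= 0 or flank > len(seq):
        raise ValueError("flank must be between 1 and the sequence length")
    return seq[-flank:] + seq[:flank]


def fasta_record(name: str, seq: str, width: int = 60) -> str:
    """Format one FASTA record with fixed-width sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Toy circular sequence; a real run would read chrM from the reference .fasta.
toy_chrM = "ACGTACGTAA"
junction = build_junction(toy_chrM, flank=3)

# Two-contig reference: the original chromosome plus the junction contig
# (contig name `chrM_junction` is hypothetical).
two_contig_fasta = (
    fasta_record("chrM", toy_chrM) + fasta_record("chrM_junction", junction)
)
print(two_contig_fasta)
```

The resulting two-contig .fasta would then be fed to the index builder of whichever aligner is used. The flank probably needs to be at least as long as the reads for a read spanning the origin to align fully inside the junction contig, so 50 bp is only a placeholder.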
For mgatk purposes, we need something intelligent to process two-chromosome .fasta files of mitochondrial chromosomes that is also sensitive to multi-mapping when filtering the .bam file. Finally, variant quantification has to be more intelligent to handle the multiple chromosomes, etc.
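One way the quantification side could work is to translate positions on the junction contig back to the original mitochondrial coordinates and add the counts together. The sketch below assumes 1-based positions, a junction contig built as (last `flank` bases + first `flank` bases), and that each read has already been assigned to exactly one contig so nothing is counted twice. All names are hypothetical and are not mgatk functions.

```python
from collections import Counter


def junction_to_reference(pos: int, chrom_len: int, flank: int = 50) -> int:
    """Map a 1-based junction-contig position to a 1-based chrM position."""
    if not 1 <= pos <= 2 * flank:
        raise ValueError("position lies outside the junction contig")
    if pos <= flank:
        # First half of the junction is the tail of the chromosome.
        return chrom_len - flank + pos
    # Second half of the junction is the head of the chromosome.
    return pos - flank


def merge_coverage(primary: dict, junction: dict,
                   chrom_len: int, flank: int = 50) -> Counter:
    """Add per-position junction counts onto the primary chrM coordinates."""
    merged = Counter(primary)
    for pos, count in junction.items():
        merged[junction_to_reference(pos, chrom_len, flank)] += count
    return merged


# Toy numbers: a 100 bp chromosome with a 2 x 50 bp junction contig.
combined = merge_coverage({1: 10, 100: 7}, {50: 3, 51: 4}, chrom_len=100)
print(sorted(combined.items()))
```

Reads that fall entirely inside one flank can align equally well to both contigs, so the .bam filtering step still has to decide how to treat those multi-mappers before counts are merged this way.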