KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

mt genome coordinates #117

Closed EugeneEA closed 1 year ago

EugeneEA commented 2 years ago

Dear Rick,

I've stumbled upon a question concerning chrM: how does the option -l hg19/hg38 influence the annotation of chrM SNPs? I ask because there is an ambiguity among reference genomes concerning the MT sequence. Broad, for example, includes the rCRS sequence in hg19 https://console.cloud.google.com/storage/browser/gcp-public-data--broad-references/hg19/v0;tab=objects?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&prefix=&forceOnObjectsSortingFiltering=false apparently following the same logic as described here https://docs.varsome.com/en/mito-genome-versions#:~:text=The%20standard%20mitochondrial,instead%20of%20NC_001807.4.), whereas UCSC includes NC_001807.4 (https://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes)

Best, Eugene

EugeneEA commented 2 years ago

PS - the most annoying fact is that the current release of dbSNP for hg19 uses rCRS https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.25.gz

mlarsen2 commented 1 year ago

Hi Eugene,

In OpenCRAVAT, the -l hg19/hg38 option does not influence the annotation of chrM variants, because all hg19 input variants are converted to hg38 coordinates.

For chrM annotations we use the GRCh38 reference genome, whose mitochondrial sequence has the accession number NC_012920.1 (rCRS).

Madison

EugeneEA commented 1 year ago

Hi, yes, I know that OpenCRAVAT performs a liftover from hg19 to hg38, and that is exactly the problem. Some hg19 builds already carry the "hg38" MT version (rCRS), so for those builds all MT coordinates will be incorrect after the internal liftover.
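
To illustrate how one might tell the two cases apart: the two mitochondrial sequences differ in length (NC_001807.4 is 16571 bp, rCRS NC_012920.1 is 16569 bp), so the chrM entry in a chrom.sizes-style listing shows which one a given hg19 build carries. The sketch below is only an illustration; the function name and the set of accepted contig names are assumptions, not part of OpenCRAVAT.

```python
# Known mitochondrial sequence lengths (bp) for the two references discussed here.
MITO_LENGTHS = {
    16571: "NC_001807.4 (UCSC hg19 chrM)",
    16569: "NC_012920.1 (rCRS)",
}

# Contig names commonly used for the mitochondrial genome (assumed list).
MITO_NAMES = {"chrM", "chrMT", "M", "MT"}


def identify_mito_reference(chrom_sizes_text):
    """Return a label for the mitochondrial sequence found in a
    chrom.sizes-style listing (one "name<whitespace>length" pair per line).

    Returns "unknown" if the length matches neither reference, and None if
    no mitochondrial contig is listed at all.
    """
    for line in chrom_sizes_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in MITO_NAMES:
            return MITO_LENGTHS.get(int(fields[1]), "unknown")
    return None


if __name__ == "__main__":
    ucsc_style = "chr1\t249250621\nchrM\t16571\n"
    rcrs_style = "chr1\t249250621\nchrM\t16569\n"
    print(identify_mito_reference(ucsc_style))
    print(identify_mito_reference(rcrs_style))
```

A length of 16569 on an "hg19" build would mean the MT coordinates are already rCRS-based and should not be lifted over.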

Best, Eugene

mlarsen2 commented 1 year ago

Thank you for bringing the chrM issue to our attention. We have just deployed a fix for this problem. If you have an hg19 input file whose mitochondrial variants are already in hg38 (rCRS) coordinates, label those variants as "chrMT" and they will not be lifted over. Variants labeled "chrM" are treated as hg19 coordinates and will be lifted over to hg38 as usual.
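
As a rough sketch of the relabeling step described above, the following renames the CHROM column from "chrM" to "chrMT" in VCF records. It assumes an uncompressed, tab-delimited VCF in which every chrM record is already rCRS-based; the function name is hypothetical and not part of OpenCRAVAT. Header lines, including any `##contig` entries, are passed through unchanged.

```python
def relabel_rcrs_mito(vcf_lines, old="chrM", new="chrMT"):
    """Yield VCF lines with the CHROM column renamed from `old` to `new`.

    Lines starting with "#" (meta-information and the column header) are
    yielded unchanged; only the first tab-separated field of data lines
    is inspected.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        yield (new if chrom == old else chrom) + sep + rest


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\n",
        "chr1\t100\t.\tC\tT\n",
        "chrM\t73\t.\tA\tG\n",
    ]
    for out_line in relabel_rcrs_mito(example):
        print(out_line, end="")
```

If only some chrM records are rCRS-based, a blanket rename like this would be wrong, and the records would need to be separated by source first.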