gagneurlab / MMSplice_MTSplice

Tissue-specific variant effect predictions on splicing
MIT License

Adding vcf/gtf chromosome name check #37

Closed nickhsmith closed 4 years ago

nickhsmith commented 4 years ago

When I supply my own GTF file, there is no option for matching chromosome names between the GTF and VCF files.

When I use the default grch37 or grch38, this is checked automatically using remove_chr_from_chrom_annotation.

But I can't invoke this when I supply my own GTF. It would be nice to have it as an option for the case where my VCF file uses chromosome names without the chr prefix (CHROM_NAME = 1) and my GTF file uses names with it (CHROM_NAME = chr1).

Thanks -Nick

s6juncheng commented 4 years ago

Hi @nickhsmith, thanks, I agree it would be nice to provide this. For now, you can run something like sed 's/^chr//g' my.gtf > nochr.gtf to strip the leading chr from every chromosome name.
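If sed is not available, the same workaround can be sketched in plain Python. This is only an illustration of the idea, not part of MMSplice: the helper name strip_chr_prefix and the sample GTF lines are made up for the example, and it assumes the chromosome name is the first column of each record.

```python
import io


def strip_chr_prefix(lines):
    """Yield GTF lines with a leading 'chr' removed (hypothetical helper)."""
    for line in lines:
        # Comment/header lines start with '#', so they never match and
        # are passed through unchanged, exactly as with sed 's/^chr//'.
        if line.startswith("chr"):
            yield line[3:]
        else:
            yield line


# Small in-memory stand-in for my.gtf (made-up records).
gtf_text = (
    "#!genome-build GRCh37\n"
    'chr1\thavana\texon\t11869\t12227\t.\t+\t.\tgene_id "g1";\n'
    'chrX\thavana\texon\t100\t200\t.\t-\t.\tgene_id "g2";\n'
)

converted = "".join(strip_chr_prefix(io.StringIO(gtf_text)))
print(converted)
```

For a real file, the generator can be fed an open file handle and its output written to nochr.gtf. Note that simply dropping the prefix turns chrM into M, whereas Ensembl-style annotations name the mitochondrial chromosome MT, so that contig may need separate handling.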

258728 commented 4 years ago

Maybe you could make the software smarter so that it recognizes chromosome names with or without the "chr" prefix; that should be easy for you.
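A rough sketch of what such automatic recognition could look like is given here. It is an assumption about one possible approach, not the code that MMSplice uses: the function names uses_chr_prefix and harmonize and the example chromosome lists are hypothetical.

```python
def uses_chr_prefix(chrom_names):
    """Return True if every chromosome name starts with 'chr' (hypothetical helper)."""
    names = list(chrom_names)
    return bool(names) and all(name.startswith("chr") for name in names)


def harmonize(chrom, target_has_chr):
    """Rewrite one chromosome name so it follows the target naming style."""
    has_chr = chrom.startswith("chr")
    if target_has_chr and not has_chr:
        return "chr" + chrom
    if not target_has_chr and has_chr:
        return chrom[3:]
    return chrom


# Made-up chromosome names as they might be read from each file.
vcf_chroms = ["1", "2", "X"]
gtf_chroms = ["chr1", "chr2", "chrX"]

vcf_style = uses_chr_prefix(vcf_chroms)
gtf_style = uses_chr_prefix(gtf_chroms)

if vcf_style != gtf_style:
    # Follow the VCF convention so variants and annotation line up.
    gtf_chroms = [harmonize(chrom, vcf_style) for chrom in gtf_chroms]

print(gtf_chroms)
```

Following the VCF convention here is a design choice for the example; converting the VCF names to match the GTF would work just as well, as long as both inputs end up in the same style.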

s6juncheng commented 4 years ago

Implemented chromosome name check, @MuhammedHasan. Closing the issue for now: https://github.com/gagneurlab/MMSplice_MTSplice/blob/cee09e4d7f8ea492e609c10b18caad50484e27f9/mmsplice/vcf_dataloader.py#L72