davmlaw closed this issue 7 months ago
I found this was due to the fasta file used, rather than the synonyms file. I have updated the title and the issue description.
Hi @davmlaw, thank you for reporting this and giving the details.
We are able to reproduce this on our end. We are investigating this further and will let you know once we have an update.
Kind regards, Likhitha
Hi @davmlaw, this issue has now been fixed and the fix will be available in the upcoming release. Thank you, and please feel free to open a new issue if there are any other problems.
If I use an NCBI fasta file that has identifiers like "NC_000003.11" rather than "3" for contigs, then VEP occasionally renames variants in the output to use the contig name from the fasta.
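For anyone wanting to check whether they are affected, here is a minimal sketch that counts output records whose contig name never appeared in the input. It assumes both files are VCF (i.e. VEP was run with --vcf) and that they have already been read into lists of lines; the function names are my own and not part of VEP.

```python
from collections import Counter
from typing import Iterable, Iterator


def chrom_column(lines: Iterable[str]) -> Iterator[str]:
    """Yield the CHROM field (first tab-separated column) of each VCF data line."""
    for line in lines:
        # Skip blank lines and header lines ("##..." meta lines and "#CHROM").
        if line.strip() and not line.startswith("#"):
            yield line.split("\t", 1)[0]


def unexpected_contigs(input_lines: Iterable[str],
                       output_lines: Iterable[str]) -> Counter:
    """Count output records whose contig name is absent from the input VCF."""
    expected = set(chrom_column(input_lines))
    return Counter(c for c in chrom_column(output_lines) if c not in expected)
```

An empty Counter means no contig names were introduced; any "NC_" keys in the result point at records that were renamed.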
Attached a GRCh38 file:
vep_synonym.vcf.gz
Command line is:
Removing the "--fasta" argument stops the conversion.
If I use the fasta downloaded by the INSTALL.py script instead, the problem also goes away.
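Since the behaviour depends on which fasta is used, a quick way to see the naming mismatch is to compare the contig names in the fasta headers against the CHROM values in the VCF. This is only a sketch under my own assumptions: the contig name is taken to be the first whitespace-delimited token after ">", and the helper names are hypothetical.

```python
from typing import Iterable, List, Set


def fasta_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect contig names: the first whitespace-delimited token after '>'."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def vcf_contigs(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct CHROM values from VCF data lines."""
    names = set()
    for line in lines:
        # Skip blank lines and header lines.
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def missing_from_fasta(vcf_lines: Iterable[str],
                       fasta_lines: Iterable[str]) -> List[str]:
    """Return VCF contig names that have no identically named fasta record."""
    return sorted(vcf_contigs(vcf_lines) - fasta_contigs(fasta_lines))
```

With an NCBI-style fasta and a VCF that uses "3", this reports "3" as missing, which is the situation described above. The lines can come from `open(path)` or `gzip.open(path, "rt")` for compressed files.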
If I modify StructuralVariantOverlap to print what it is sending to tabix, i.e. by adding
print("Sending tabix: $pos_string_chr\n");
and then grep the output for "NC_", I get:
Version
Tested on RefSeq 110 (GRCh37 and GRCh38), using Perl 5.30.2.
Note on changes / synonym files
I originally thought this had to do with the synonyms file, but I used my own synonyms file and it made no difference.
I then used the fasta from the VEP download, and that appears to have fixed the issue.