Closed freedomq8 closed 4 years ago
Hmm, adding that line is supposed to solve the problem. Let me try.
I cannot reproduce your problem. What I did was:
vtools init test
wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar_20180729.vcf.gz
vtools use clinvar-20180729.ann --files clinvar_20180729.vcf.gz
gzcat clinvar_20180729.vcf.gz | head -2000 > data.vcf
vtools import data.vcf
vtools output variant chr pos ref alt clinvar.chr CLNDN
What version of vtools are you using? On which OS?
My OS is Ubuntu 17.0.
Can you output the following?
vtools output variant variant.chr variant.pos variant.ref variant.alt variant.region_type variant.region_name variant.mut_type variant.function clinvar.chr clinvar.pos clinvar.name clinvar.ref clinvar.alt clinvar.qual clinvar.filter clinvar.RS clinvar.AF_ESP clinvar.AF_EXAC clinvar.AF_TGP clinvar.ALLELEID clinvar.CLNDN clinvar.CLNDNINCL clinvar.CLNDISDB clinvar.CLNDISDBINCL clinvar.CLNHGVS clinvar.CLNREVSTAT clinvar.CLNSIG clinvar.CLNSIGCONF clinvar.CLNSIGINCL clinvar.CLNVC clinvar.CLNVCSO clinvar.CLNVI clinvar.DBVARID clinvar.GENEINFO clinvar.MC clinvar.ORIGIN clinvar.SSR > annotation.clinvar.26682.vcf
ERROR: 'ascii' codec can't encode characters in position 115-116: ordinal not in range(128)
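For context, the wording of this message matches Python's UnicodeEncodeError, which is raised when text containing non-ASCII characters is encoded with the ascii codec. The following is only a minimal sketch of how such an error arises, not vtools code; the helper name `ascii_error` and the sample string are hypothetical.

```python
def ascii_error(text):
    """Return the UnicodeEncodeError raised when encoding text as ASCII, or None."""
    try:
        text.encode("ascii")
        return None
    except UnicodeEncodeError as exc:
        return exc

# Hypothetical sample: 115 ASCII characters followed by two non-ASCII ones.
sample = "a" * 115 + "\u00e9\u00e8"
err = ascii_error(sample)
if err is not None:
    # err.start is the index of the first character that could not be encoded.
    print(err.start, str(err))
```

The position reported in the message is an index into the string being encoded, so it can help locate the offending characters in the output row.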
I thought I had figured it out by redoing my .ann file, but I still get the same issue when I query the above fields. The problem is with clinvar.CLNDN and clinvar.CLNVI, which are fields from another ClinVar database.
I cannot test now. So you mean that the command would run without clinvar.CLNDN and clinvar.CLNVI?
Yes. Again, I cannot reproduce this on Mac, so I am closing the ticket. Note that I tried again with the .ann file adapted to the new 2020 version from ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar_20200506.vcf.gz.
Hi there, I am trying to add a customized database from ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/clinvar_20180729.vcf.gz. I created the .ann file clinvar-20180729.txt and used it to import the source file locally. However, when I try to annotate my variants I always get this message:
ERROR: 'ascii' codec can't encode characters in position 115-116: ordinal not in range(128)
I changed the encoding of the .ann file by adding encoding=ISO-8859-1 to the beginning of the source file, repeated the import of the ClinVar VCF, and tried to annotate my file again, but got the same error.
I also tried changing the encoding of the original VCF file, along with the .ann file, using Notepad++, but the message persists.
Any idea how to solve this?
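One way to narrow this kind of problem down is to check which records in the compressed VCF actually contain non-ASCII bytes. The sketch below is an illustration under stated assumptions, not part of vtools: the helper name `find_non_ascii_lines` is hypothetical, and the demo runs on made-up in-memory data rather than the real ClinVar file.

```python
import gzip
import io

def find_non_ascii_lines(stream):
    """Yield (line_number, byte_offsets) for each line containing bytes >= 0x80."""
    for lineno, raw in enumerate(stream, start=1):
        offsets = [i for i, b in enumerate(raw) if b >= 0x80]
        if offsets:
            yield lineno, offsets

# Made-up two-line VCF fragment, gzip-compressed in memory for the demo.
demo_text = "##fileformat=VCFv4.1\n1\t100\t.\tA\tG\t.\t.\tCLNDN=Sj\u00f6gren\n"
demo_gz = gzip.compress(demo_text.encode("utf-8"))

# gzip.open also accepts a file object; binary mode yields lines as bytes.
with gzip.open(io.BytesIO(demo_gz), "rb") as fh:
    hits = list(find_non_ascii_lines(fh))

for lineno, offsets in hits:
    print(f"line {lineno}: non-ASCII bytes at offsets {offsets}")
```

For a real file, the stream could instead come from `gzip.open("clinvar_20180729.vcf.gz", "rb")`. The lines it reports would show which field values carry the characters that the ascii codec cannot encode.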