griffithlab / civicpy

A python interface for the CIViC db application
MIT License

ValueError("Can't use wildcard when searching for non-GRCh37 coordinates") when annotating VCFs with `*` ALT allele #121

Closed mano2991 closed 2 years ago

mano2991 commented 2 years ago

Hi,

I'm trying to annotate my VCF using civicpy. I installed civicpy using pip and ran the command below to annotate it:

civicpy annotate-vcf --input-vcf SRR12656923_GATK_filtered.vcf --output-vcf SRR12656923 --reference GRCh38 -i submitted

After a while I get an error:

Traceback (most recent call last):
  File "/usr/local/bin/civicpy", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/civicpy/cli.py", line 88, in annotate_vcf
    variants = civic.search_variants_by_coordinates(query, search_mode='exact')
  File "/usr/local/lib/python3.7/site-packages/civicpy/civic.py", line 1303, in search_variants_by_coordinates
    raise ValueError("Can't use wildcard when searching for non-GRCh37 coordinates")

I have also attached my VCF file for your reference: SRR12656923_GATK_filtered.vcf.tar.gz

susannasiebert commented 2 years ago

Hi @mano2991,

Thank you for your interest in CIViCpy. I'm sorry you're encountering problems using our tool. It looks like your VCF contains some variants with a `*` alt allele. This is not supported by CIViCpy. I would suggest removing these variant entries/alt alleles from your VCF. We will need to update our tool to skip such variants going forward.

susannasiebert commented 2 years ago

For the curious:

ALT — alternate base(s): Comma-separated list of alternate non-reference alleles. These alleles do not have
to be called in any of the samples. Options are base Strings made up of the bases A,C,G,T,N (case insensitive)
or the ‘*’ symbol (allele missing due to overlapping deletion) or a MISSING value ‘.’ (no variant) or an
angle-bracketed ID String (“<ID>”) or a breakend replacement string as described in Section 5.4. If there
are no alternative alleles, then the MISSING value must be used.

(https://samtools.github.io/hts-specs/VCFv4.3.pdf)

We need to skip any variants where the alt allele(s) contain any character but [A, C, G, T].
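As a stopgap until the tool skips these itself, the rule above can be sketched as a small pre-filter run on the VCF before `civicpy annotate-vcf`. This is only an illustration and not CIViCpy's implementation: the function names are made up, it assumes an uncompressed, tab-separated VCF, and it drops the whole record if any ALT allele is not a plain A/C/G/T string (matched case-insensitively, an assumption) rather than trimming individual alleles.

```python
import re

# Only plain base strings are accepted; '*', '.', <ID> and breakend ALTs are rejected.
SIMPLE_ALT = re.compile(r"[ACGT]+", re.IGNORECASE)


def has_only_simple_alts(alt_field):
    """Return True if every comma-separated ALT allele consists solely of A, C, G, T."""
    return all(SIMPLE_ALT.fullmatch(alt) for alt in alt_field.split(","))


def filter_vcf_lines(lines):
    """Yield header lines unchanged, plus records whose ALT alleles are all simple."""
    for line in lines:
        if line.startswith("#"):
            # Keep every header line, including the required '#CHROM' line.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # ALT is the fifth tab-separated column of a VCF record.
        if len(fields) >= 5 and has_only_simple_alts(fields[4]):
            yield line


def filter_vcf_file(in_path, out_path):
    """Write a copy of the VCF at in_path without the unsupported records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_vcf_lines(src))
```

Dropping the whole record is the simpler choice here: removing a single allele from a multi-allelic record would also require re-indexing the genotype and per-allele INFO/FORMAT fields.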

mano2991 commented 2 years ago

@susannasiebert thanks for your suggestion. I have manually removed the '*' alterations in my VCF files; now I'm facing a different issue:

  File "/usr/local/bin/civicpy", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/civicpy/cli.py", line 56, in annotate_vcf
    reader = vcfpy.Reader.from_path(input_vcf)
  File "/usr/local/lib/python3.7/site-packages/vcfpy/reader.py", line 99, in from_path
    parsed_samples=parsed_samples,
  File "/usr/local/lib/python3.7/site-packages/vcfpy/reader.py", line 65, in from_stream
    parsed_samples=parsed_samples,
  File "/usr/local/lib/python3.7/site-packages/vcfpy/reader.py", line 121, in __init__
    self.header = self.parser.parse_header(parsed_samples)
  File "/usr/local/lib/python3.7/site-packages/vcfpy/parser.py", line 737, in parse_header
    self.samples = self._handle_sample_line(parsed_samples)
  File "/usr/local/lib/python3.7/site-packages/vcfpy/parser.py", line 755, in _handle_sample_line
    raise exceptions.IncorrectVCFFormat('Missing line starting with "#CHROM"')
vcfpy.exceptions.IncorrectVCFFormat: Missing line starting with "#CHROM"

susannasiebert commented 2 years ago

Without seeing the VCF it looks like you might've accidentally removed the header line starting with #CHROM. That line is required to make your file valid VCF format.
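A quick way to confirm this before re-running the annotation is to scan the top of the file for the column header. The helper below is a hypothetical sketch, not part of CIViCpy or vcfpy, and assumes an uncompressed VCF read as text:

```python
def find_chrom_header(lines):
    """Return the 1-based line number of the '#CHROM' header line, or None if absent."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#CHROM"):
            return number
        if not line.startswith("#"):
            # The first data record was reached without seeing the column header.
            return None
    return None


def check_vcf_header(path):
    """Report whether the VCF at path still has its '#CHROM' line."""
    with open(path) as handle:
        number = find_chrom_header(handle)
    if number is None:
        print(f'{path}: missing line starting with "#CHROM"')
    else:
        print(f"{path}: #CHROM header found on line {number}")
    return number is not None
```

If the line is missing, copying it back from the original, unedited VCF is usually the easiest fix.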

mano2991 commented 2 years ago

@susannasiebert I found the error and fixed it. Thanks for your help.