mmoisse closed 6 years ago
Any update on this issue? This is something we have run into as well. Maybe we could update the VCF using the fixed vt referenced above?
I'll update the gnomad exomes stuff and ping this issue when it's up. thanks for the reminder and thanks @mmoisse for tracking down the problem.
@brentp
Thanks for updating the exome file.
We're getting an error when we try to run install-data.py:
wget failed with non-zero exit code 8. Retrying
--2018-08-16 14:34:31-- https://s3.amazonaws.com/gemini-annotations/gnomad.exomes.r2.0.2.sites.no-VEP.nohist.tidy.vcf.gz
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.104.13
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.104.13|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2018-08-16 14:34:31 ERROR 403: Forbidden.
wget failed with non-zero exit code 8. Retrying
Traceback (most recent call last):
File "install-data.py", line 171, in <module>
install_annotation_files(args.anno_dir, args.dl_files, args.extra)
File "install-data.py", line 106, in install_annotation_files
to_dl, anno_dir, cur_config)
File "install-data.py", line 124, in _download_anno_files
cur_config.get("versions", {}).get(orig, 1))
File "install-data.py", line 152, in _download_to_dir
raise ValueError("Failed to download with wget")
ValueError: Failed to download with wget
Does the permission need to be updated?
sorry about that. can you try again?
That worked, thanks. We had some other warnings/errors in our installation using the master branch, which @ponomarevsy posted here. Not sure if these are critical.
@brentp
Thanks for updating gemini v0.30.1.
The problem has occurred again: 'GC_Male' and 'GC_Female' are not found in gnomad.exomes.r2.1.tidy.bcf, and as a consequence gnomad_num_het is always reported as 0.
Also, could you include 'popmax' and 'AF_popmax' in gnomad_v2.1?
popmax: Allele frequency information for the outbred population with the highest frequency. This excludes Finns, Ashkenazi Jewish and “Other” populations.
Thanks.
@zhanhuizhang I have pushed a fix for this to master, would you give it a try? You'll have to reload your database. thanks for reporting.
@brentp Thanks for the quick fix. 'gnomad_num_het' and 'gnomad_num_hom_alt' are now correct, but gnomad_popmax_af is always -1. Perhaps GEMINI should read 'AF_popmax' from the gnomad_v2.1.bcf rather than 'popmax_AF'. Thanks!
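For anyone following along, here is a minimal sketch of the kind of lookup that avoids this. It tries both spellings of the field before falling back to the missing-value sentinel. This is not GEMINI's actual code: the function name `popmax_af`, the `MISSING` constant, and the plain-dict representation of the INFO fields are all hypothetical, and -1 as the sentinel is only inferred from the behavior reported above.

```python
MISSING = -1.0  # assumed sentinel, inferred from the "-1" reported in this thread


def popmax_af(info_fields, names=("AF_popmax", "popmax_AF")):
    """Return the popmax allele frequency from a dict of INFO fields.

    The candidate names are tried in order, so a file that uses either
    spelling still yields a value. MISSING is returned only when no
    candidate field holds a usable number.
    """
    for name in names:
        value = info_fields.get(name)
        if value in (None, "", "."):
            continue
        try:
            return float(value)
        except (TypeError, ValueError):
            continue
    return MISSING


print(popmax_af({"AF_popmax": "0.0012"}))
print(popmax_af({"popmax_AF": "0.5"}))
print(popmax_af({"AC": "3"}))
```

The first two calls return the stored frequency and the last one returns the sentinel, which is what a name mismatch would look like from the outside.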
sorry about that. I just pushed a fix. Thanks for noticing.
That worked ~~~ :)
I noticed that for variants that are multiallelic in the gnomAD VCF, the number of heterozygous variants is always reported as 0. I believe this is a consequence of the GC_Male and GC_Female INFO fields missing from the parsed gnomAD VCF files at multiallelic loci. While the GC_Male and GC_Female INFO fields are still present in the original gnomAD VCF, they are gone after vt decomposition (https://github.com/atks/vt/issues/87). Since the number of heterozygous variants is calculated from GC_Male and GC_Female, it is wrongly reported as 0 for these multiallelic positions.
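To illustrate the failure mode, here is a minimal sketch of a het count derived from the two fields. It is not GEMINI's implementation. It assumes that, for a decomposed biallelic record, each GC_* field lists genotype counts in the order hom-ref, het, hom-alt, and the helper names `parse_info` and `num_het` are hypothetical. The INFO strings are made-up examples, not real gnomAD records.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def num_het(info, keys=("GC_Male", "GC_Female")):
    """Sum the het counts across the given GC_* fields.

    Returns None when any field is absent, so a missing annotation is
    not silently reported as 0 heterozygotes.
    """
    fields = parse_info(info)
    total = 0
    for key in keys:
        if key not in fields:
            return None
        counts = [int(c) for c in fields[key].split(",")]
        # Assumed layout for a biallelic record: hom-ref, het, hom-alt.
        total += counts[1]
    return total


complete = "AC=5;GC_Male=100,2,0;GC_Female=120,3,0"  # made-up example record
stripped = "AC=5"  # same record with the GC_* fields lost

print(num_het(complete))  # 5
print(num_het(stripped))  # None
```

Returning None rather than 0 for the stripped record keeps "no data" distinguishable from "no heterozygotes", which is the distinction that gets lost at the multiallelic positions described above.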