brentp / slivar

genetic variant expressions, annotation, and filtering for great good.
MIT License

make-gnotate issue when specifying multiple vcf/bcfs #54

Open cvlvxi opened 4 years ago

cvlvxi commented 4 years ago

Hey there,

I'm seeing a problem when using make-gnotate with multiple gnomAD vcf.gz files, following the usage you described:

slivar make-gnotate --prefix gnomad-2.1 \
    --field controls_nhomalt:gnomad_nhomalt \
    --field popmax_AF:gnomad_popmax_af \
    gnomad.exomes.r2.1.sites.vcf.bgz \
    gnomad.genomes.r2.1.sites.chr*.vcf.bgz

It only builds the annotation from the first file. Here's how I called it:

/some/path/tools/slivar/slivar_0110/slivar make-gnotate --prefix testgnotate \
    --field AC:gnomad_AC \
    --field AN:gnomad_AN \
    --field AN_popmax:gnomad_AN_popmax \
    --field AF:gnomad_AF \
    --field AF_popmax:gnomad_AF_popmax \
    --field AC_male:gnomad_AC_male \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr10_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr21_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr11_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr22_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr12_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr2_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr13_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr3_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr14_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr4_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr15_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr5_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr16_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr6_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr17_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr7_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr18_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr8_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr19_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr9_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr1_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrX_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr20_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrY_noVEP.vcf.gz \
    /some/path/gnomad/gnomad_38_liftover/genomes/gnomad.genomes.r2.1.1.sites.liftover_grch38.vcf.bgz

Output

> slivar version: 0.1.10 917aa1a61bd0c0ba50521ea4146a3a4dc45b8b64
@["make-gnotate", "--prefix", "testgnotate", "--field", "AC:gnomad_AC", "--field", "AN:gnomad_AN", "--field", "AN_popmax:gnomad_AN_popmax", "--field", "AF:gnomad_AF", "--field", "AF_popmax:gnomad_AF_popmax", "--field", "AC_male:gnomad_AC_male", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr10_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr11_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr12_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr13_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr14_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr15_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr16_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr17_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr18_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr19_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr1_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr20_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr21_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr22_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr2_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr3_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr4_noVEP.vcf.gz", 
"/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr5_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr6_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr7_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr8_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr9_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrX_noVEP.vcf.gz", "/some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrY_noVEP.vcf.gz"]
[slivar] using type int for "AC"
[slivar] using type int for "AN"
[slivar] using type int for "AN_popmax"
[slivar] using type float for "AF"
[slivar] using type float for "AF_popmax"
[slivar] using type int for "AC_male"
[slivar] 500000 variants completed. at: 10:103174892. exact: 500000 long: 2924 in /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr10_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr10_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr11_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr12_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr13_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr14_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr15_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr16_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr17_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr18_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr19_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr1_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr20_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr21_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr22_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr2_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr3_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr4_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr5_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr6_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr7_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr8_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chr9_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrX_noVEP.vcf.gz
[slivar] kvs.len for 10: 663823 after /some/path/gnomad/gnomad_38_liftover/exome2/gnomad.exomes.r2.1.sites.grch38.chrY_noVEP.vcf.gz
[slivar] writing 663823 encoded and 4038 long values for chromosome 10
[slivar] removed 238 duplicated positions by using the value and chromosome: 10

As you can see, it only read chr10.

I tried annotating a VCF with the result, and it works, but only for chr10 regions.

brentp commented 4 years ago

Yes, I have hit this as well. The current workaround is to bcftools concat those files and then run make-gnotate on the result. As it is now, slivar expects the first file to contain all the chromosomes; for each chromosome in the first file, it does a region query against all the other files to get the variants for that chromosome. This works, for example, when combining an exome VCF and a genome VCF (both with all chromosomes), but it doesn't work as you expect here, where you have a separate VCF for each chromosome.

Thanks for reporting; I agree it's a bug. Not sure when I'll get to fixing it since there is a work-around.
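
A minimal sketch of the workaround described above, assuming the per-chromosome exome files follow the naming shown in the original command and that bcftools is on the PATH. The combined output name (OUT), the DRY_RUN switch, and the run helper are hypothetical names introduced only for illustration. The indexing step is also an assumption, added in case region access to the combined file is needed.

```shell
#!/bin/sh
# Sketch: merge the per-chromosome exome VCFs into one file, then run
# make-gnotate on inputs that each cover every chromosome.
# Hypothetical (not from the thread): OUT, DRY_RUN, and the run helper.
set -eu

DIR="${DIR:-/some/path/gnomad/gnomad_38_liftover/exome2}"
GENOMES="${GENOMES:-/some/path/gnomad/gnomad_38_liftover/genomes/gnomad.genomes.r2.1.1.sites.liftover_grch38.vcf.bgz}"
OUT="${OUT:-gnomad.exomes.r2.1.sites.grch38.all_noVEP.vcf.gz}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command, and execute it unless DRY_RUN is 1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Collect the per-chromosome files, in chromosome order, as positional parameters.
set --
for c in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y; do
    set -- "$@" "$DIR/gnomad.exomes.r2.1.sites.grch38.chr${c}_noVEP.vcf.gz"
done

# 1. Concatenate into a single bgzipped VCF and index it (indexing is an assumption).
run bcftools concat -O z -o "$OUT" "$@"
run bcftools index --tbi "$OUT"

# 2. Build the gnotate file; the first input now contains all chromosomes.
run slivar make-gnotate --prefix testgnotate \
    --field AC:gnomad_AC \
    --field AN:gnomad_AN \
    --field AN_popmax:gnomad_AN_popmax \
    --field AF:gnomad_AF \
    --field AF_popmax:gnomad_AF_popmax \
    --field AC_male:gnomad_AC_male \
    "$OUT" "$GENOMES"
```

By default the script only prints the three commands so they can be reviewed first; running it with DRY_RUN=0 executes them. Listing the files explicitly in chromosome order avoids relying on shell glob order, which would place chr10 before chr2.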

cvlvxi commented 4 years ago

Thanks for confirming that @brentp. Will try the workaround.

Cheers