dnanexus-rnd / GLnexus

Scalable gVCF merging and joint variant calling for population sequencing projects
Apache License 2.0

Deepvariant gvcfs merge error #232

Open Johnnywang92 opened 4 years ago

Johnnywang92 commented 4 years ago

I tried to merge a trio's gVCF files, which were generated by DeepVariant. The command and error message are shown below:

./glnexus_cli --config DeepVariantWES --bed exon.bed HY20080095.gvcf HY20080096.gvcf HY20080097.gvcf

[6563] [2020-08-12 10:18:15.554] [GLnexus] [info] glnexus_cli release v1.2.6-2-gca0e9b7 Aug 10 2020
[6563] [2020-08-12 10:18:15.554] [GLnexus] [warning] jemalloc absent, which will impede performance with high thread counts. See https://github.com/dnanexus-rnd/GLnexus/wiki/Performance
[6563] [2020-08-12 10:18:15.554] [GLnexus] [info] Loading config preset DeepVariantWES
[6563] [2020-08-12 10:18:15.566] [GLnexus] [info] config:
unifier_config:
  drop_filtered: false
  min_allele_copy_number: 1
  min_AQ1: 35
  min_AQ2: 20
  min_GQ: 20
  max_alleles_per_site: 32
  monoallelic_sites_for_lost_alleles: true
  preference: common
genotyper_config:
  revise_genotypes: true
  min_assumed_allele_frequency: 9.99999975e-05
  required_dp: 0
  allow_partial_data: true
  allele_dp_format: AD
  ref_dp_format: MIN_DP
  output_residuals: false
  more_PL: true
  squeeze: false
  trim_uncalled_alleles: true
  output_format: BCF
  liftover_fields:
    - {orig_names: [MIN_DP, DP], name: DP, description: "##FORMAT=<ID=DP,Number=1,Type=Integer,Description=\"Approximate read depth (reads with MQ=255 or with bad mates are filtered)\">", type: int, number: basic, default_type: missing, count: 1, combi_method: min, ignore_non_variants: true}
    - {orig_names: [AD], name: AD, description: "##FORMAT=<ID=AD,Number=R,Type=Integer,Description=\"Allelic depths for the ref and alt alleles in the order listed\">", type: int, number: alleles, default_type: zero, count: 0, combi_method: min, ignore_non_variants: false}
    - {orig_names: [GQ], name: GQ, description: "##FORMAT=<ID=GQ,Number=1,Type=Integer,Description=\"Genotype Quality\">", type: int, number: basic, default_type: missing, count: 1, combi_method: min, ignore_non_variants: true}
    - {orig_names: [PL], name: PL, description: "##FORMAT=<ID=PL,Number=G,Type=Integer,Description=\"Phred-scaled genotype Likelihoods\">", type: int, number: genotype, default_type: missing, count: 0, combi_method: missing, ignore_non_variants: true}
[6563] [2020-08-12 10:18:15.567] [GLnexus] [info] config CRC32C = 4105299981
[6563] [2020-08-12 10:18:15.567] [GLnexus] [info] init database, exemplar_vcf=HY20080095.gvcf
[6563] [2020-08-12 10:18:15.570] [GLnexus] [error] Failed to initialize database: Invalid: RocksDB kInvalidArgument (Invalid argument: Compression type ZSTD is not linked with the binary.)

How can I fix this? Thanks for your time.

mlin commented 4 years ago

Hi, I infer you compiled GLnexus yourself, is that right? (congrats...)

tl;dr install libzstd-dev or equivalent and rebuild from scratch

RocksDB's configure scripts have a (regrettable imho) feature that detects which compression libraries are installed on the host, and enables only that set. So if Zstandard isn't already installed on the system, it compiles successfully but then fails at runtime when a program like GLnexus wants to use it.
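Since the failure only shows up at runtime, it can help to check the build host before rebuilding. The following is a rough sketch, not RocksDB's actual detection logic: it looks for the headers and shared libraries of the compression codecs RocksDB can optionally use. The `probe` function, the `CODECS` table, and the `INCLUDE_DIRS` search paths are all assumptions made for illustration; adjust the paths for your distribution.

```python
import ctypes.util
from pathlib import Path

# Optional RocksDB compression codecs, mapped to the header a build would need.
# (Illustrative table; library short names as used by the system linker.)
CODECS = {
    "zstd": "zstd.h",
    "snappy": "snappy.h",
    "lz4": "lz4.h",
    "z": "zlib.h",
    "bz2": "bzlib.h",
}

# Assumed header locations; these vary between distributions.
INCLUDE_DIRS = ("/usr/include", "/usr/local/include")


def probe(codecs=CODECS, include_dirs=INCLUDE_DIRS):
    """Report, per codec, the shared library and header found (or None)."""
    report = {}
    for lib, header in codecs.items():
        header_path = next(
            (str(Path(d) / header) for d in include_dirs
             if (Path(d) / header).is_file()),
            None,
        )
        report[lib] = {
            "shared_lib": ctypes.util.find_library(lib),
            "header": header_path,
        }
    return report


if __name__ == "__main__":
    for lib, found in probe().items():
        status = "ok" if found["header"] else "header missing (install the -dev package)"
        print(f"{lib:8s} lib={found['shared_lib']} header={found['header']} -> {status}")
```

If the `zstd` row reports a missing header, installing libzstd-dev (or the equivalent package) and rebuilding from a clean tree should let the build pick it up. A header in a non-standard location would not be seen by this check, so treat a "missing" result as a hint rather than proof.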