bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

Trio pipeline #2961

Closed kokyriakidis closed 5 years ago

kokyriakidis commented 5 years ago

@chapmanb

1) I would like to run a trio analysis on whole-exome samples. Can I use all callers (strelka2, deepvariant, vardict, gatk, etc.) for a trio analysis when the samples share the same batch name? Can I use the ensemble method?

2) I am also trying to do CNV analysis on this trio. Can I add all svcallers? Do they all work with a single germline sample?

It would also be nice to specify in the documentation:

Which callers can be used for Germline Variant Calling
Which callers can only be used for Somatic (Tumor-Normal) Variant Calling
Which callers can be used for Germline SV Calling
Which callers can only be used for Somatic (Tumor-Normal) SV Calling
Which callers can be used for Trio analysis

naumenko-sa commented 5 years ago

Hi Konstantinos @kokyriakidis!

I used gatk4, gatk3.8, samtools, freebayes, platypus and ensemble for trio exome analysis. Yes, just use batch: family_name to call all samples together.

SV/CNV analysis is different. While for small variants you might want to call them in a batch, SV/CNV calling in manta, lumpy, delly, wham is based on split-reads, and you call samples individually.

You may also try to call CNVs using XHMM across many samples with similar coverage (https://atgu.mgh.harvard.edu/xhmm/tutorial.shtml) outside of bcbio.

Sergey

kokyriakidis commented 5 years ago

Hi @naumenko-sa !

Do you recommend the same tools for Trio analysis? Or should I opt for GATK, deepvariant, strelka2, vardict with the ensemble method?

naumenko-sa commented 5 years ago

Hi Konstantinos @kokyriakidis!

More callers + ensemble does not necessarily mean better calling. For SNV calling, I think any tool can give good precision and sensitivity.

Below is my simple validation from 2018. You can see that samtools is enough for SNVs, while gatk gives better results for indels.

[Figure: 2018-05_WES validation no_clinical]

You can find more extensive validation methods and results among the publications by Justin Zook: https://scholar.google.com/citations?hl=en&user=3Xjafy0AAAAJ&view_op=list_works&sortby=pubdate

I'd suggest running validations with NA12878 and trios (Ashkenazim trio, Chinese trio) using GIAB data and their article to see how the tools work in your environment.
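The validation suggested above comes down to comparing a caller's output against a truth set and counting agreements. Here is a minimal sketch of the two metrics discussed in this thread, assuming variants are reduced to hypothetical (chrom, pos, ref, alt) keys. A real comparison has to normalize variant representation first, which this toy does not attempt.

```python
def precision_sensitivity(called, truth):
    """Return (precision, sensitivity) for two collections of variant keys."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return precision, sensitivity


# Toy data only; these variants are made up for illustration.
truth = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("2", 50, "G", "GA")}
called = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("2", 75, "T", "C")}

p, s = precision_sensitivity(called, truth)
print(f"precision={p:.2f} sensitivity={s:.2f}")
```

Running the same function separately on SNVs and indels is what makes a per-type comparison between callers possible.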

Bcbio would benefit from any updated validation like these: https://github.com/bcbio/bcbio_validations Especially if you compared the new DeepVariant. That would be a great contribution.

Using many tools and combining them with ensemble would allow you to pick up variants which would not be called by one of the individual tools.
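As a rough illustration of the idea, here is a sketch of vote-based combination. It assumes that the `numpass` value in the templates in this thread is the minimum number of callers that must report a variant. The variant strings and the `ensemble_calls` helper are hypothetical, and the combination logic bcbio actually runs is more involved than a set count.

```python
from collections import Counter


def ensemble_calls(calls_by_caller, numpass):
    """Keep variants reported by at least `numpass` callers."""
    votes = Counter()
    for variants in calls_by_caller.values():
        votes.update(set(variants))  # each caller votes once per variant
    return {variant for variant, n in votes.items() if n >= numpass}


# Toy call sets; variant identifiers are made up for illustration.
calls = {
    "gatk-haplotype": {"1:100:A>G", "1:200:C>T"},
    "samtools": {"1:100:A>G", "2:50:G>GA"},
    "freebayes": {"1:100:A>G", "1:200:C>T", "3:10:T>C"},
}

print(sorted(ensemble_calls(calls, numpass=2)))
print(sorted(ensemble_calls(calls, numpass=1)))
```

With `numpass=1` the result is the union of all callers, which recovers variants a single tool would miss. Raising the threshold trades that sensitivity for agreement between tools.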

However, this approach has many downsides:

Briefly, for germline WES variant analysis I'd suggest using gatk4 (or gatk3.8, whichever works better for you) and focusing more on validation, variant filtration, annotation, and interpretation.

Small variant calling in germline as a bioinformatics problem seems to be (almost) solved at 99% precision and sensitivity (worse for indels, but again 99% if you have WGS data).

Sergey

kokyriakidis commented 5 years ago

@naumenko-sa Thank you for your wonderful explanation!

I want to ask something that bothers me: how does it know which sample is the father, the mother, etc.? Doesn't it have to know? I cannot understand how this works.

roryk commented 5 years ago

https://bcbio-nextgen.readthedocs.io/en/latest/contents/configuration.html#sample-information describes how to do it. https://gatkforums.broadinstitute.org/gatk/discussion/7696/pedigree-ped-files describes the PED file to create.

kokyriakidis commented 5 years ago

@roryk Let's say I have a TRIO:

178F    1   0   0   1   1  #FATHER
178F    2   0   0   2   1  #MOTHER
178F    3   1   2   2   2  #CHILD

The columns are:

Family ID
Individual ID
Paternal ID
Maternal ID
Sex (1=male; 2=female; other=unknown)
Phenotype
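To make the column layout concrete, here is a small sketch that reads the six columns listed above into named fields. `PedRecord` and `parse_ped` are hypothetical names, and treating everything after `#` as a human note is an assumption of this sketch made only so the annotated rows above can be pasted in. A real PED consumer may not accept such trailing text.

```python
from typing import NamedTuple


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str        # 1=male; 2=female; other=unknown
    phenotype: str


def parse_ped(text):
    """Parse whitespace-separated PED text into PedRecord tuples."""
    records = []
    for line in text.splitlines():
        # Assumption of this sketch: anything after '#' is a note.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        records.append(PedRecord(*fields[:6]))
    return records


TRIO = """\
178F 1 0 0 1 1  #FATHER
178F 2 0 0 2 1  #MOTHER
178F 3 1 2 2 2  #CHILD
"""

records = parse_ped(TRIO)
child = records[2]
print(child.individual_id, child.paternal_id, child.maternal_id)
```

The child's paternal and maternal IDs point at the individual IDs of the two parent rows, which is how the family structure is encoded.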

How does it match with the samples? Do I have to change their name?

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_CHILD_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_FATHER_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_MOTHER_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
fc_name: 178F
upload:
  dir: ../final

roryk commented 5 years ago

The 'description' metadata is what should be the individual ID in the PED file. bcbio will call all of the samples in the same batch together, and they will get annotated in the GEMINI database according to the family structure in the PED file. It will also check to make sure the family structure is correct; people are often unaware of what their actual family structure is, so it is good to check.
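The matching rule described above can be checked before starting a run. The sketch below compares the `description` values from the template with the individual ID column of a PED file. `compare_ids` is a hypothetical helper, not part of bcbio, and the renamed PED is only one way the trio from this thread could be written so that the IDs line up.

```python
def compare_ids(descriptions, ped_text):
    """Return (descriptions missing from the PED, PED IDs with no sample)."""
    ped_ids = set()
    for line in ped_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and not fields[0].startswith("#"):
            ped_ids.add(fields[1])  # column 2 is the individual ID
    wanted = set(descriptions)
    return sorted(wanted - ped_ids), sorted(ped_ids - wanted)


DESCRIPTIONS = ["178F_CHILD", "178F_FATHER", "178F_MOTHER"]

# Individual IDs as numbers: nothing matches the sample descriptions.
NUMERIC_PED = """\
178F 1 0 0 1 1
178F 2 0 0 2 1
178F 3 1 2 2 2
"""

# Individual, paternal and maternal IDs rewritten to use the descriptions.
RENAMED_PED = """\
178F 178F_FATHER 0 0 1 1
178F 178F_MOTHER 0 0 2 1
178F 178F_CHILD 178F_FATHER 178F_MOTHER 2 2
"""

print(compare_ids(DESCRIPTIONS, NUMERIC_PED))
print(compare_ids(DESCRIPTIONS, RENAMED_PED))
```

Note that the paternal and maternal columns have to be renamed together with the individual IDs, since they refer to the parents by those same IDs.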

kokyriakidis commented 5 years ago

Thank you so much for the clarification!

kokyriakidis commented 5 years ago

Any thoughts on why octopus stalls in trio analysis?

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 315, in variantcall_sample
    return genotype.variantcall_sample(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 377, in variantcall_sample
    out_file = caller_fn(align_bams, items, ref_file, assoc_files, region, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 28, in run
    return _run_germline(align_bams, items, ref_file, target, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 99, in _run_germline
    _produce_compatible_vcf(tx_out_file, items[0])
TypeError: _produce_compatible_vcf() missing 1 required positional argument: 'is_somatic'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 154, in variant2pipeline
    samples = genotype.parallel_variantcall_region(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region
    "vrn_file", ["region", "sam_ref", "config"]))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine
    final_output = parallel_fn(parallel_name, split_args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
TypeError: _produce_compatible_vcf() missing 1 required positional argument: 'is_somatic'

Certain chromosomes finished correctly (19/25). After that, this error appears. I tried to rerun but got the same result.

chapmanb commented 5 years ago

Thanks much for the report and apologies about the problem. This was a bug in finalizing the octopus VCF files, which is now fixed in the latest development version. If you update with (bcbio_nextgen.py upgrade -u development) and re-run in place, it should post-process these correctly and hopefully finish cleanly.
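For readers unfamiliar with this class of error, the traceback above shows a function being called with two arguments while its signature requires a third, `is_somatic`. The sketch below reproduces only that mechanism. `produce_compatible_vcf` here is a hypothetical stand-in with an invented body and is not the bcbio function. Only the argument count mirrors the traceback.

```python
def produce_compatible_vcf(out_file, data, is_somatic):
    """Hypothetical stand-in; only the parameter list matters here."""
    return (out_file, data, is_somatic)


def call_like_traceback():
    """Call with two arguments, as the failing frame did, and return the message."""
    try:
        produce_compatible_vcf("calls.vcf.gz", {"description": "178F_CHILD"})
    except TypeError as exc:
        return str(exc)
    return None


print(call_like_traceback())

# Supplying the third argument is enough for the call to succeed.
print(produce_compatible_vcf("calls.vcf.gz", {"description": "178F_CHILD"}, False))
```

Because the failure is in the caller rather than in the input data, re-running the same version reproduces it every time, which matches the behaviour reported above.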

kokyriakidis commented 5 years ago

@chapmanb @naumenko-sa When I ran the trio pipeline I got this error. Any thoughts?

[2019-10-02T04:56Z] =============================================
[2019-10-02T04:56Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-02T04:56Z] see: https://github.com/brentp/vcfanno
[2019-10-02T04:56Z] =============================================
[2019-10-02T04:56Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-02T04:56Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:57Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:57Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T05:17Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-02T05:17Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-02T05:17Z] vcfanno.go:248: annotated 162216 variants in 1277.45 seconds (127.0 / second)
[2019-10-02T05:17Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-02T05:17Z] GEMINI: create database with vcf2db
[2019-10-02T05:17Z] skipping 'DP4' because it has Number=4
[2019-10-02T05:17Z] skipping 'MMQ' because it has Number=R
[2019-10-02T05:17Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-02T05:17Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-02T05:17Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-02T05:17Z] Ethnicity None
[2019-10-02T05:17Z] Traceback (most recent call last):
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-02T05:17Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-02T05:17Z]     self.samples = self.create_samples()
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-02T05:17Z]     vals = [r[i] for r in rows]
[2019-10-02T05:17Z] IndexError: list index out of range
[2019-10-02T05:17Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpygmu3evp/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble.db
skipping 'DP4' because it has Number=4
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 401, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 48, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 140, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpygmu3evp/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble.db
skipping 'DP4' because it has Number=4
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
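The innermost frame, `vals = [r[i] for r in rows]`, can only raise `IndexError` when at least one row has fewer entries than the index being read. The PED file that was passed to vcf2db is not shown in this thread, so the actual cause is unknown. One thing that can be ruled in or out quickly is rows with differing column counts. The sketch below is a hypothetical check that assumes whitespace-separated fields, and the sample text is made up for illustration.

```python
def find_short_rows(ped_text):
    """Return (line_number, n_fields) for rows narrower than the widest row."""
    rows = [
        (number, line.split())
        for number, line in enumerate(ped_text.splitlines(), start=1)
        if line.strip()
    ]
    if not rows:
        return []
    widest = max(len(fields) for _, fields in rows)
    return [(number, len(fields)) for number, fields in rows if len(fields) < widest]


# Toy input: the last row is missing its final column.
EXAMPLE = """\
178F 178F_FATHER 0 0 1 1
178F 178F_CHILD 178F_FATHER 178F_MOTHER 2 2
178F 178F_MOTHER 0 0 2
"""

print(find_short_rows(EXAMPLE))
```

An empty result means every row has the same number of fields, in which case the cause lies elsewhere.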

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
fc_name: 178F
upload:
  dir: ../final

The vcfanno configuration file I used is the following, from @naumenko-sa's cre project:

# using prefix vcfanno to discriminate data from vcfanno and vep
[[annotation]]
file="variation/gnomad_exome.vcf.gz"
fields=["AC","nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_es","vcfanno_gnomad_hom_es","vcfanno_gnomad_af_es","vcfanno_gnomad_an_es"]
ops=["first","first","first","first"]

[[annotation]]
file="variation/gnomad_genome.vcf.gz"
fields=["AC", "nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_gs", "vcfanno_gnomad_hom_gs","vcfanno_gnomad_af_gs","vcfanno_gnomad_an_gs"]
ops=["self","self","self","self"]

[[postannotation]]
fields=["vcfanno_gnomad_ac_es","vcfanno_gnomad_ac_gs"]
op="sum"
name="vcfanno_gnomad_ac"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_hom_es","vcfanno_gnomad_hom_gs"]
op="sum"
name="vcfanno_gnomad_hom"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_af_es","vcfanno_gnomad_af_gs"]
op="max"
name="vcfanno_gnomad_af_popmax"
type="Float"

[[postannotation]]
fields=["vcfanno_gnomad_an_es","vcfanno_gnomad_an_gs"]
op="sum"
name="vcfanno_gnomad_an"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_ac","vcfanno_gnomad_an"]
op="div2"
name="vcfanno_gnomad_af"
type="Float"

[[annotation]]
file="variation/dbsnp-151.vcf.gz"
fields=["ID"]
names=["rs_ids"]
ops=["concat"]

[[annotation]]
file="variation/clinvar.vcf.gz"
fields=["CLNSIG"]
names=["clinvar_pathogenic"]
ops=["concat"]

#dbNSFP v3.4
[[annotation]]
file = "variation/dbNSFP.txt.gz"
names = ["CADD_phred","phyloP20way_mammalian","phastCons20way_mammalian","Vest3_score","Revel_score","Gerp_score"]
columns = [79,111,115,58,70,107]
ops = ["first","first","first","first","first","first"]
kokyriakidis commented 5 years ago

@chapmanb I get another error from octopus during its run:

[2019-10-02T13:17Z] /bin/bash: -c: line 0: syntax error near unexpected token `('
[2019-10-02T13:17Z] /bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
[2019-10-02T13:17Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.

Octopus then continues processing the remaining regions and, after about an hour, ends with this:

[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Finished calling 958,606bp, total runtime 1h 6m
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Calls have been written to "/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpyt5thvtf/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.vcf.gz"
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Legacy VCF file written to "/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpyt5thvtf/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.legacy.vcf.gz"
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Removed 2 temporary files
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> ------------------------------------------------------------------------
[2019-10-02T14:14Z] Produce compatible VCF output file from octopus
[2019-10-02T14:14Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.vcf.gz
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 315, in variantcall_sample
    return genotype.variantcall_sample(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 377, in variantcall_sample
    out_file = caller_fn(align_bams, items, ref_file, assoc_files, region, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 28, in run
    return _run_germline(align_bams, items, ref_file, target, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 98, in _run_germline
    do.run(cmd.format(**locals()), "Octopus germline calling")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 154, in variant2pipeline
    samples = genotype.parallel_variantcall_region(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region
    "vrn_file", ["region", "sam_ref", "config"]))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine
    final_output = parallel_fn(parallel_name, split_args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.
chapmanb commented 5 years ago

Konstantinos; Thanks for the additional testing and the detailed bug reports. Sorry you're hitting so many issues here. So far we'd mostly used octopus in somatic calling tests, so it hadn't had extensive germline testing, and we've primarily moved to a single variant caller for germline calling rather than ensemble methods. We appreciate you helping work through these issues.

The latest development has two fixes for the two issues:

Please let us know if you hit any other problems, and thanks again for your patience with the debugging.
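
For readers following the octopus failure reported earlier in the thread: bash stopped at `('chr15', 0, 17010044)` after `--regions-file`, and that text has the shape of a Python tuple's printed form. The sketch below shows how a region tuple formatted directly into a command string produces exactly that text, and one way to render it as a plain `chrom:start-end` string instead. It is only an illustration and not the actual bcbio fix; `region_to_str` is a hypothetical helper, the use of octopus's `--regions` option is an assumption about how one might pass such a string, and coordinate conventions (0-based vs 1-based) are deliberately not handled.

```python
import shlex


def region_to_str(region):
    """Render a (chrom, start, end) tuple as 'chrom:start-end'; pass other values through.

    Hypothetical helper for illustration; coordinate conventions are not adjusted.
    """
    if isinstance(region, (list, tuple)):
        chrom, start, end = region
        return f"{chrom}:{start}-{end}"
    return str(region)


region = ("chr15", 0, 17010044)

# Formatting the tuple directly embeds its repr, parentheses and quotes included,
# which bash then rejects with "syntax error near unexpected token `('".
broken = "octopus --regions-file {target} -o out.vcf.gz".format(target=region)

# Converting to a plain string first, and quoting it, keeps the command shell-safe.
fixed = "octopus --regions {target} -o out.vcf.gz".format(
    target=shlex.quote(region_to_str(region))
)

print(broken)
print(fixed)
```

The general point is that any non-string value substituted into a shell command should be converted to its intended textual form explicitly rather than relying on its default string representation.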

kokyriakidis commented 5 years ago

@chapmanb Using just gatk-haplotype, I now get these error messages:

[2019-10-03T17:42Z] =============================================
[2019-10-03T17:42Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-03T17:42Z] see: https://github.com/brentp/vcfanno
[2019-10-03T17:42Z] =============================================
[2019-10-03T17:42Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-03T17:42Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:43Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T18:02Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-03T18:02Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-03T18:02Z] vcfanno.go:248: annotated 148115 variants in 1244.76 seconds (119.0 / second)
[2019-10-03T18:02Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-03T18:02Z] GEMINI: create database with vcf2db
[2019-10-03T18:02Z] skipping 'MMQ' because it has Number=R
[2019-10-03T18:02Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-03T18:02Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-03T18:02Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-03T18:02Z] Ethnicity None
[2019-10-03T18:02Z] Traceback (most recent call last):
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-03T18:02Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-03T18:02Z]     self.samples = self.create_samples()
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-03T18:02Z]     vals = [r[i] for r in rows]
[2019-10-03T18:02Z] IndexError: list index out of range
[2019-10-03T18:02Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpwtu0e31l/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpwtu0e31l/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
fc_name: 178F
upload:
  dir: ../final

My vcfanno annotation file is:

# using prefix vcfanno to discriminate data from vcfanno and vep
[[annotation]]
file="variation/gnomad_exome.vcf.gz"
fields=["AC","nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_es","vcfanno_gnomad_hom_es","vcfanno_gnomad_af_es","vcfanno_gnomad_an_es"]
ops=["first","first","first","first"]

[[annotation]]
file="variation/gnomad_genome.vcf.gz"
fields=["AC", "nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_gs", "vcfanno_gnomad_hom_gs","vcfanno_gnomad_af_gs","vcfanno_gnomad_an_gs"]
ops=["self","self","self","self"]

[[postannotation]]
fields=["vcfanno_gnomad_ac_es","vcfanno_gnomad_ac_gs"]
op="sum"
name="vcfanno_gnomad_ac"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_hom_es","vcfanno_gnomad_hom_gs"]
op="sum"
name="vcfanno_gnomad_hom"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_af_es","vcfanno_gnomad_af_gs"]
op="max"
name="vcfanno_gnomad_af_popmax"
type="Float"

[[postannotation]]
fields=["vcfanno_gnomad_an_es","vcfanno_gnomad_an_gs"]
op="sum"
name="vcfanno_gnomad_an"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_ac","vcfanno_gnomad_an"]
op="div2"
name="vcfanno_gnomad_af"
type="Float"

[[annotation]]
file="variation/dbsnp-151.vcf.gz"
fields=["ID"]
names=["rs_ids"]
ops=["concat"]

[[annotation]]
file="variation/clinvar.vcf.gz"
fields=["CLNSIG"]
names=["clinvar_pathogenic"]
ops=["concat"]

#dbNSFP v3.4
[[annotation]]
file = "variation/dbNSFP.txt.gz"
names = ["CADD_phred","phyloP20way_mammalian","phastCons20way_mammalian","Vest3_score","Revel_score","Gerp_score"]
columns = [79,111,115,58,70,107]
ops = ["first","first","first","first","first","first"]
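One quick sanity check on a vcfanno config like the one above is to confirm that each `[[annotation]]` block lists the same number of `fields` (or `columns`), `names` and `ops`, and to see which ops each source uses. Here the exome block uses `first` while the genome block uses `self`. The sketch below is only an illustration: `parse_vcfanno_blocks` and `lengths_match` are hypothetical helpers, the parser is a simplification that handles only `key = value` lines like those in this file (it is not a real TOML parser), and whether the differing ops play any part in the failure is not established at this point in the thread.

```python
import ast

# Two blocks copied from the config above, used as sample input.
CONFIG = '''
[[annotation]]
file="variation/gnomad_exome.vcf.gz"
fields=["AC","nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_es","vcfanno_gnomad_hom_es","vcfanno_gnomad_af_es","vcfanno_gnomad_an_es"]
ops=["first","first","first","first"]

[[annotation]]
file="variation/gnomad_genome.vcf.gz"
fields=["AC", "nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_gs", "vcfanno_gnomad_hom_gs","vcfanno_gnomad_af_gs","vcfanno_gnomad_an_gs"]
ops=["self","self","self","self"]
'''


def parse_vcfanno_blocks(text):
    """Collect [[...]] blocks as dicts (hypothetical helper, simplified parsing).

    Only `key = value` lines whose values are also valid Python literals
    (quoted strings, integers, lists of those) are supported.
    """
    blocks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[[") and line.endswith("]]"):
            blocks.append({"kind": line[2:-2]})
        elif "=" in line and blocks:
            key, value = line.split("=", 1)
            blocks[-1][key.strip()] = ast.literal_eval(value.strip())
    return blocks


def lengths_match(block):
    """True if fields/columns, names and ops all have the same length."""
    sources = block.get("fields", block.get("columns", []))
    sizes = {len(sources), len(block.get("names", [])), len(block.get("ops", []))}
    return len(sizes) == 1


blocks = parse_vcfanno_blocks(CONFIG)
for block in blocks:
    if block["kind"] == "annotation":
        print(block["file"], lengths_match(block), sorted(set(block["ops"])))
```

Running the same loop over the full config file would list every source next to the ops it uses, which makes a mismatch between otherwise parallel blocks easy to spot.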
roryk commented 5 years ago

I'm guessing your PED file is malformed somehow. Can you pass on the PED file you are using?

kokyriakidis commented 5 years ago

@roryk

My PED file:

178F    178F_FATHER 0   0   1   1
178F    178F_MOTHER 0   0   2   1
178F    178F_CHILD  178F_FATHER 178F_MOTHER 2   2
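Since the `IndexError` is raised while vcf2db builds its sample table, one thing worth ruling out is a PED row with too few fields or with an unexpected delimiter. The sketch below is a minimal check under stated assumptions: `check_ped` is a hypothetical helper, the six-column layout is the conventional PED one, and whether vcf2db actually requires tabs rather than spaces is an assumption that this thread does not confirm. The sample names come from the `description` values in the template.

```python
PED_COLUMNS = ("family_id", "sample_id", "paternal_id", "maternal_id", "sex", "phenotype")


def check_ped(text, expected_samples):
    """Return a list of problems found in PED-formatted text (empty if none)."""
    problems = []
    seen = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        by_tab = line.split("\t")
        by_space = line.split()
        if len(by_space) < len(PED_COLUMNS):
            problems.append(f"line {lineno}: only {len(by_space)} fields")
            continue
        if len(by_tab) != len(by_space):
            # Fields are separated by something other than single tabs.
            problems.append(f"line {lineno}: not cleanly tab-delimited")
        seen.add(by_space[1])
    for name in sorted(set(expected_samples) - seen):
        problems.append(f"sample {name} not found in PED")
    return problems


samples = ["178F_CHILD", "178F_FATHER", "178F_MOTHER"]

# The same three rows as above, once tab-delimited and once space-delimited.
tabbed = "\n".join([
    "178F\t178F_FATHER\t0\t0\t1\t1",
    "178F\t178F_MOTHER\t0\t0\t2\t1",
    "178F\t178F_CHILD\t178F_FATHER\t178F_MOTHER\t2\t2",
])
spaced = tabbed.replace("\t", "    ")

print(check_ped(tabbed, samples))
print(check_ped(spaced, samples))
```

To run it on the real file, pass the file contents, for example `check_ped(open(path).read(), samples)` with `path` pointing at the PED file.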
kokyriakidis commented 5 years ago

I got the same error when I did not specify a vcfanno annotation file, so this file is not causing the problem.

kokyriakidis commented 5 years ago

I tried running it WITHOUT a PED file and I got these errors:

[2019-10-05T12:04Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-decompose.vcf.gz
[2019-10-05T12:04Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno-combine.conf
[2019-10-05T12:04Z] =============================================
[2019-10-05T12:04Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T12:04Z] see: https://github.com/brentp/vcfanno
[2019-10-05T12:04Z] =============================================
[2019-10-05T12:04Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-05T12:04Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:05Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:16Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T12:16Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T12:16Z] vcfanno.go:248: annotated 148115 variants in 738.52 seconds (200.6 / second)
[2019-10-05T12:16Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-05T12:16Z] GEMINI: create database with vcf2db
[2019-10-05T12:16Z] skipping 'MMQ' because it has Number=R
[2019-10-05T12:16Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-05T12:16Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-05T12:16Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] bad record:
[2019-10-05T12:16Z] AC 3
[2019-10-05T12:16Z] AF 0.5
[2019-10-05T12:16Z] AN 6
[2019-10-05T12:16Z] BaseQRankSum 0.216999992728
[2019-10-05T12:16Z] CADD_phred 0.015
[2019-10-05T12:16Z] ClippingRankSum 0.799000024796
[2019-10-05T12:16Z] DB None
[2019-10-05T12:16Z] DECOMPOSED None
[2019-10-05T12:16Z] DP 47
[2019-10-05T12:16Z] ExcessHet 6.98969984055
[2019-10-05T12:16Z] FS 5.49700021744
[2019-10-05T12:16Z] Gerp_score -1.86
[2019-10-05T12:16Z] LEN None
[2019-10-05T12:16Z] MLEAC 3
[2019-10-05T12:16Z] MLEAF 0.5
[2019-10-05T12:16Z] MMQ (27, 27)
[2019-10-05T12:16Z] MQ 31.3999996185
[2019-10-05T12:16Z] MQ0 0
[2019-10-05T12:16Z] MQRankSum -1.1779999733
[2019-10-05T12:16Z] OLD_MULTIALLELIC None
[2019-10-05T12:16Z] OLD_VARIANT None
[2019-10-05T12:16Z] QD 7.78000020981
[2019-10-05T12:16Z] ReadPosRankSum -0.070000000298
[2019-10-05T12:16Z] Revel_score 0.012
[2019-10-05T12:16Z] SOR 2.3900001049
[2019-10-05T12:16Z] TYPE None
[2019-10-05T12:16Z] Vest3_score 0.125,0.052,0.121,0.06
[2019-10-05T12:16Z] aa_change None
[2019-10-05T12:16Z] aa_length None
[2019-10-05T12:16Z] aaf 0.5
[2019-10-05T12:16Z] ac 3
[2019-10-05T12:16Z] ac_adj_exac_afr None
[2019-10-05T12:16Z] ac_adj_exac_amr None
[2019-10-05T12:16Z] ac_adj_exac_eas None
[2019-10-05T12:16Z] ac_adj_exac_fin None
[2019-10-05T12:16Z] ac_adj_exac_nfe None
[2019-10-05T12:16Z] ac_adj_exac_oth None
[2019-10-05T12:16Z] ac_adj_exac_sas None
[2019-10-05T12:16Z] ac_exac_all None
[2019-10-05T12:16Z] af 0.5
[2019-10-05T12:16Z] af_adj_exac_afr -1.0
[2019-10-05T12:16Z] af_adj_exac_amr -1.0
[2019-10-05T12:16Z] af_adj_exac_eas -1.0
[2019-10-05T12:16Z] af_adj_exac_fin -1.0
[2019-10-05T12:16Z] af_adj_exac_nfe -1.0
[2019-10-05T12:16Z] af_adj_exac_oth -1.0
[2019-10-05T12:16Z] af_adj_exac_sas -1.0
[2019-10-05T12:16Z] af_esp_aa -1.0
[2019-10-05T12:16Z] af_esp_all -1.0
[2019-10-05T12:16Z] af_esp_ea -1.0
[2019-10-05T12:16Z] af_exac_all -1.0
[2019-10-05T12:16Z] alt A
[2019-10-05T12:16Z] an 6
[2019-10-05T12:16Z] an_adj_exac_afr -1.0
[2019-10-05T12:16Z] an_adj_exac_amr -1.0
[2019-10-05T12:16Z] an_adj_exac_eas -1.0
[2019-10-05T12:16Z] an_adj_exac_fin -1.0
[2019-10-05T12:16Z] an_adj_exac_nfe -1.0
[2019-10-05T12:16Z] an_adj_exac_oth -1.0
[2019-10-05T12:16Z] an_adj_exac_sas -1.0
[2019-10-05T12:16Z] an_exac_all -1.0
[2019-10-05T12:16Z] baseqranksum 0.216999992728
[2019-10-05T12:16Z] biotype None
[2019-10-05T12:16Z] cadd_phred 0.015
[2019-10-05T12:16Z] call_rate 1.0
[2019-10-05T12:16Z] chrom chr1
[2019-10-05T12:16Z] clinvar_pathogenic None
[2019-10-05T12:16Z] clippingranksum 0.799000024796
[2019-10-05T12:16Z] codon_change None
[2019-10-05T12:16Z] common_pathogenic False
[2019-10-05T12:16Z] db False
[2019-10-05T12:16Z] decomposed False
[2019-10-05T12:16Z] dp 47
[2019-10-05T12:16Z] ds False
[2019-10-05T12:16Z] effect_severity None
[2019-10-05T12:16Z] end 13263050
[2019-10-05T12:16Z] ensembl_gene_id None
[2019-10-05T12:16Z] excesshet 6.98969984055
[2019-10-05T12:16Z] exon None
[2019-10-05T12:16Z] filter None
[2019-10-05T12:16Z] fs 5.49700021744
[2019-10-05T12:16Z] gene None
[2019-10-05T12:16Z] gerp_score -1.86
[2019-10-05T12:16Z] gnomad_af -1.0
[2019-10-05T12:16Z] gnomad_af_afr -1.0
[2019-10-05T12:16Z] gnomad_af_amr -1.0
[2019-10-05T12:16Z] gnomad_af_asj -1.0
[2019-10-05T12:16Z] gnomad_af_eas -1.0
[2019-10-05T12:16Z] gnomad_af_fin -1.0
[2019-10-05T12:16Z] gnomad_af_nfe -1.0
[2019-10-05T12:16Z] gnomad_af_oth -1.0
[2019-10-05T12:16Z] gnomad_af_popmax -1.0
[2019-10-05T12:16Z] gnomad_af_sas -1.0
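[2019-10-05T12:16Z] gt_alt_depths (binary data omitted)
[2019-10-05T12:16Z] gt_alt_freqs (binary data omitted)
[2019-10-05T12:16Z] gt_depths (binary data omitted)
[2019-10-05T12:16Z] gt_phases (binary data omitted)
[2019-10-05T12:16Z] gt_quals (binary data omitted)
[2019-10-05T12:16Z] gt_ref_depths (binary data omitted)
[2019-10-05T12:16Z] gt_types (binary data omitted)
[2019-10-05T12:16Z] gts (binary-encoded array; readable content: G/A, G/A, G/A)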
[2019-10-05T12:16Z] impact None
[2019-10-05T12:16Z] impact_severity None
[2019-10-05T12:16Z] impact_so None
[2019-10-05T12:16Z] is_canonical False
[2019-10-05T12:16Z] is_coding False
[2019-10-05T12:16Z] is_exonic False
[2019-10-05T12:16Z] is_lof False
[2019-10-05T12:16Z] is_splicing False
[2019-10-05T12:16Z] len None
[2019-10-05T12:16Z] max_aaf_all -1.0
[2019-10-05T12:16Z] mleac 3
[2019-10-05T12:16Z] mleaf 0.5
[2019-10-05T12:16Z] mmq (27, 27)
[2019-10-05T12:16Z] mq 31.3999996185
[2019-10-05T12:16Z] mq0 0
[2019-10-05T12:16Z] mqranksum -1.1779999733
[2019-10-05T12:16Z] num_exac_Het None
[2019-10-05T12:16Z] num_exac_Hom None
[2019-10-05T12:16Z] num_exac_het None
[2019-10-05T12:16Z] num_exac_hom None
[2019-10-05T12:16Z] num_het 3
[2019-10-05T12:16Z] num_hom_alt 0
[2019-10-05T12:16Z] num_hom_ref 0
[2019-10-05T12:16Z] num_unknown 0
[2019-10-05T12:16Z] old_multiallelic None
[2019-10-05T12:16Z] old_variant None
[2019-10-05T12:16Z] phastCons20way_mammalian 0.000000
[2019-10-05T12:16Z] phastcons20way_mammalian 0.000000
[2019-10-05T12:16Z] phyloP20way_mammalian -2.018000
[2019-10-05T12:16Z] phylop20way_mammalian -2.018000
[2019-10-05T12:16Z] polyphen_pred None
[2019-10-05T12:16Z] polyphen_score None
[2019-10-05T12:16Z] qd 7.78000020981
[2019-10-05T12:16Z] qual 357.899993896
[2019-10-05T12:16Z] readposranksum -0.070000000298
[2019-10-05T12:16Z] ref G
[2019-10-05T12:16Z] revel_score 0.012
[2019-10-05T12:16Z] rs_ids rs2359265
[2019-10-05T12:16Z] sift_pred None
[2019-10-05T12:16Z] sift_score None
[2019-10-05T12:16Z] so None
[2019-10-05T12:16Z] sor 2.3900001049
[2019-10-05T12:16Z] start 13263049
[2019-10-05T12:16Z] sub_type ts
[2019-10-05T12:16Z] top_consequence None
[2019-10-05T12:16Z] transcript None
[2019-10-05T12:16Z] type snp
[2019-10-05T12:16Z] variant_id 1558
[2019-10-05T12:16Z] vcf_id rs2359265
[2019-10-05T12:16Z] vcfanno_gnomad_ac 550
[2019-10-05T12:16Z] vcfanno_gnomad_ac_es 55
[2019-10-05T12:16Z] vcfanno_gnomad_ac_gs 495
[2019-10-05T12:16Z] vcfanno_gnomad_af 0.0150499995798
[2019-10-05T12:16Z] vcfanno_gnomad_af_es 0.00169539998751
[2019-10-05T12:16Z] vcfanno_gnomad_af_gs 0.188199996948
[2019-10-05T12:16Z] vcfanno_gnomad_af_popmax 0.188199996948
[2019-10-05T12:16Z] vcfanno_gnomad_an 36544
[2019-10-05T12:16Z] vcfanno_gnomad_an_es 36544
[2019-10-05T12:16Z] vcfanno_gnomad_an_gs (4698, 3254)
[2019-10-05T12:16Z] vcfanno_gnomad_hom 29
[2019-10-05T12:16Z] vcfanno_gnomad_hom_es 23
[2019-10-05T12:16Z] vcfanno_gnomad_hom_gs 6
[2019-10-05T12:16Z] vest3_score 0.125,0.052,0.121,0.06
[2019-10-05T12:16Z] Traceback (most recent call last):
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T12:16Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T12:16Z]     self.load()
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T12:16Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T12:16Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T12:16Z]     vilengths, variant_impacts)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T12:16Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T12:16Z]     raise e
[2019-10-05T12:16Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 117 - probably unsupported type.
[2019-10-05T12:16Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, cadd_phred, clippingranksum, db, decomposed, dp, ds, excesshet, fs, gerp_score, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, revel_score, sor, vest3_score, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_pathogenic, common_pathogenic, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, max_aaf_all, num_exac_het, num_exac_hom, phastcons20way_mammalian, phylop20way_mammalian, rs_ids, vcfanno_gnomad_ac, vcfanno_gnomad_ac_es, vcfanno_gnomad_ac_gs, vcfanno_gnomad_af, vcfanno_gnomad_af_es, vcfanno_gnomad_af_gs, vcfanno_gnomad_af_popmax, vcfanno_gnomad_an, vcfanno_gnomad_an_es, vcfanno_gnomad_an_gs, vcfanno_gnomad_hom, vcfanno_gnomad_hom_es, vcfanno_gnomad_hom_gs, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T12:16Z] [parameters: (1558, u'chr1', 13263049, 13263050, u'rs2359265', u'G', u'A', 357.8999938964844, None, 'snp', 'ts', 1.0, 0, 3, 0, 0, 0.5, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 3, 0.5, 6, 0.21699999272823334, u'0.015', 0.7990000247955322, 0, 0, 47, 0, 6.989699840545654, 5.497000217437744, u'-1.86', None, 3, 0.5, 31.399999618530273, 0, -1.1779999732971191, None, 'None', 7.78000020980835, -0.07000000029802322, u'0.012', 2.390000104904175, u'0.125,0.052,0.121,0.06', None, None, None, None, None, None, None, None, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, 0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, None, u'0.000000', u'-2.018000', u'rs2359265', 550, 55, 495, 0.015049999579787254, 0.0016953999875113368, 0.1881999969482422, 0.1881999969482422, 36544, 36544, (4698, 3254), 29, 23, 6, <read-only buffer for 0x7f1e05aa7110, size -1, offset 0 at 0x7f1e035287f0>, <read-only buffer for 0x7f1e05aa70d8, size -1, offset 0 at 0x7f1e035287b0>, <read-only buffer for 0x7f1e05b25c30, size -1, offset 0 at 0x7f1e03528770>, <read-only buffer for 0x7f1e05aa7148, size -1, offset 0 at 0x7f1e03528670>, <read-only buffer for 0x7f1e05aa7180, size -1, offset 0 at 0x7f1e03528730>, <read-only buffer for 0x7f1e05aa71b8, size -1, offset 0 at 0x7f1e035286f0>, <read-only buffer for 0x7f1e05aa71f0, size -1, offset 0 at 0x7f1e035286b0>, <read-only buffer for 0x7f1e05b249b0, size -1, offset 0 at 0x7f1e03528630>)]
[2019-10-05T12:16Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T12:16Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpty6jb_71/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
gnomad_af -1.0
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gt_alt_depths i [binary data]
gt_alt_freqs d [binary data]
gt_depths i [binary data]
gt_phases ? [binary data]
gt_quals f [binary data]
gt_ref_depths i [binary data]
gt_types i [binary data]
gts S [binary data: G/A G/A G/A]
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all -1.0
mleac 3
mleaf 0.5
mmq (27, 27)
mq 31.3999996185
mq0 0
mqranksum -1.1779999733
num_exac_Het None
num_exac_Hom None
num_exac_het None
num_exac_hom None
num_het 3
num_hom_alt 0
num_hom_ref 0
num_unknown 0
old_multiallelic None
old_variant None
phastCons20way_mammalian 0.000000
phastcons20way_mammalian 0.000000
phyloP20way_mammalian -2.018000
phylop20way_mammalian -2.018000
polyphen_pred None
polyphen_score None
qd 7.78000020981
qual 357.899993896
readposranksum -0.070000000298
ref G
revel_score 0.012
rs_ids rs2359265
sift_pred None
sift_score None
so None
sor 2.3900001049
start 13263049
sub_type ts
top_consequence None
transcript None
type snp
variant_id 1558
vcf_id rs2359265
vcfanno_gnomad_ac 550
vcfanno_gnomad_ac_es 55
vcfanno_gnomad_ac_gs 495
vcfanno_gnomad_af 0.0150499995798
vcfanno_gnomad_af_es 0.00169539998751
vcfanno_gnomad_af_gs 0.188199996948
vcfanno_gnomad_af_popmax 0.188199996948
vcfanno_gnomad_an 36544
vcfanno_gnomad_an_es 36544
vcfanno_gnomad_an_gs (4698, 3254)
vcfanno_gnomad_hom 29
vcfanno_gnomad_hom_es 23
vcfanno_gnomad_hom_gs 6
vest3_score 0.125,0.052,0.121,0.06
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 117 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, cadd_phred, clippingranksum, db, decomposed, dp, ds, excesshet, fs, gerp_score, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, revel_score, sor, vest3_score, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_pathogenic, common_pathogenic, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, max_aaf_all, num_exac_het, num_exac_hom, phastcons20way_mammalian, phylop20way_mammalian, rs_ids, vcfanno_gnomad_ac, vcfanno_gnomad_ac_es, vcfanno_gnomad_ac_gs, vcfanno_gnomad_af, vcfanno_gnomad_af_es, vcfanno_gnomad_af_gs, vcfanno_gnomad_af_popmax, vcfanno_gnomad_an, vcfanno_gnomad_an_es, vcfanno_gnomad_an_gs, vcfanno_gnomad_hom, vcfanno_gnomad_hom_es, vcfanno_gnomad_hom_gs, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (1558, u'chr1', 13263049, 13263050, u'rs2359265', u'G', u'A', 357.8999938964844, None, 'snp', 'ts', 1.0, 0, 3, 0, 0, 0.5, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 3, 0.5, 6, 0.21699999272823334, u'0.015', 0.7990000247955322, 0, 0, 47, 0, 6.989699840545654, 5.497000217437744, u'-1.86', None, 3, 0.5, 31.399999618530273, 0, -1.1779999732971191, None, 'None', 7.78000020980835, -0.07000000029802322, u'0.012', 2.390000104904175, u'0.125,0.052,0.121,0.06', None, None, None, None, None, None, None, None, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, 0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, None, u'0.000000', u'-2.018000', u'rs2359265', 550, 55, 495, 0.015049999579787254, 0.0016953999875113368, 0.1881999969482422, 0.1881999969482422, 36544, 36544, (4698, 3254), 29, 23, 6, <read-only buffer for 0x7f1e05aa7110, size -1, offset 0 at 0x7f1e035287f0>, <read-only buffer for 0x7f1e05aa70d8, size -1, offset 0 at 0x7f1e035287b0>, <read-only buffer for 0x7f1e05b25c30, size -1, offset 0 at 0x7f1e03528770>, <read-only buffer for 0x7f1e05aa7148, size -1, offset 0 at 0x7f1e03528670>, <read-only buffer for 0x7f1e05aa7180, size -1, offset 0 at 0x7f1e03528730>, <read-only buffer for 0x7f1e05aa71b8, size -1, offset 0 at 0x7f1e035286f0>, <read-only buffer for 0x7f1e05aa71f0, size -1, offset 0 at 0x7f1e035286b0>, <read-only buffer for 0x7f1e05b249b0, size -1, offset 0 at 0x7f1e03528630>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpty6jb_71/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
' returned non-zero exit status 1.
roryk commented 5 years ago

Thanks for investigating, and sorry for being slow to get back to you.

I think the original problem is with the format of your PED file. I'm guessing that the sample names in the run don't match what is in the PED file. Do your sample names get an X prepended during the bcbio run? We try to keep everything compatible with R downstream, since lots of folks use it for analysis, and R won't allow column names that start with a number. I'm guessing that is the problem with the PED file.
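To make the suspected mismatch concrete, here is a minimal sketch in Python. It assumes the renaming is a plain "X" prefix on names that start with a digit; `r_safe_name`, `ped_sample_ids` and `ped_mismatches` are hypothetical helpers written for illustration, not bcbio functions, and the PED rows (family ID, phenotype codes) are made up around sample names like 178FCHILD.

```python
def r_safe_name(name):
    """Hypothetical stand-in for the renaming described above: prefix names
    that start with a digit with "X" so they are valid R column names."""
    return "X" + name if name[:1].isdigit() else name


def ped_sample_ids(ped_text):
    """Return the individual IDs (second column) from PED-formatted text."""
    ids = []
    for line in ped_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        ids.append(line.split()[1])
    return ids


def ped_mismatches(ped_text, vcf_samples):
    """List PED individual IDs that are absent from the VCF sample names."""
    known = set(vcf_samples)
    return [s for s in ped_sample_ids(ped_text) if s not in known]


# Illustrative PED content; family ID and phenotype codes are assumptions.
ped = """#family_id sample_id paternal_id maternal_id sex phenotype
178F 178FCHILD 178FFATHER 178FMOTHER 2 2
178F 178FFATHER 0 0 1 1
178F 178FMOTHER 0 0 2 1
"""

# Sample names as they might look after the assumed renaming.
renamed = [r_safe_name(s) for s in ["178FCHILD", "178FFATHER", "178FMOTHER"]]
print(renamed)
print(ped_mismatches(ped, renamed))
```

If the names in the VCF header really do carry the prefix, every ID in the PED file would show up as a mismatch, and the fix would be to use the same (prefixed) names in the PED file.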

roryk commented 5 years ago

Is the last run from your vcfanno file without a PED file? If so, I think your vcfanno file is probably not doing the right thing. Do you need to use a custom file? Can you just use the annotations bcbio provides?
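For the `Error binding parameter 117` message in the log above: counting the INSERT columns from zero, position 117 appears to line up with `vcfanno_gnomad_an_gs`, whose logged value is the tuple `(4698, 3254)` rather than a single number. The sketch below is a simplified check, not part of vcf2db; `find_unbindable` is a hypothetical helper, and the list of bindable types ignores registered adapters and other buffer objects. It scans a row for values sqlite3 cannot bind and then reproduces the failure on an in-memory table.

```python
import sqlite3

# Types the sqlite3 module binds without a registered adapter (simplified).
BINDABLE = (type(None), int, float, str, bytes)


def find_unbindable(columns, params):
    """Return (index, column, value) for each value sqlite3 cannot bind."""
    return [(i, col, val)
            for i, (col, val) in enumerate(zip(columns, params))
            if not isinstance(val, BINDABLE)]


# A three-column excerpt of the failing row from the log.
columns = ["vcfanno_gnomad_an", "vcfanno_gnomad_an_es", "vcfanno_gnomad_an_gs"]
params = [36544, 36544, (4698, 3254)]
print(find_unbindable(columns, params))

# Reproduce the failure on an in-memory table with the same three columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (%s)" % ", ".join(columns))
try:
    conn.execute("INSERT INTO variants VALUES (?, ?, ?)", params)
except sqlite3.Error as exc:
    # The exact exception class and message differ between Python versions.
    print(type(exc).__name__, exc)
```

A tuple in that field suggests the annotation returned more than one value for the variant, which would be consistent with the custom vcfanno configuration being the thing to check.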

kokyriakidis commented 5 years ago

The previous run was WITH a VCFANNO file and WITHOUT a PED file.

I ran it again WITHOUT a PED or VCFANNO file, using the following template:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FCHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FFATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FMOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: female
fc_name: 178F
upload:
  dir: ../final

I get these errors:

[2019-10-05T15:00Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-decompose.vcf.gz
[2019-10-05T15:00Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-05T15:00Z] =============================================
[2019-10-05T15:00Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T15:00Z] see: https://github.com/brentp/vcfanno
[2019-10-05T15:00Z] =============================================
[2019-10-05T15:00Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-05T15:00Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:05Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T15:05Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T15:05Z] vcfanno.go:248: annotated 148115 variants in 274.66 seconds (539.3 / second)
[2019-10-05T15:05Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-05T15:05Z] GEMINI: create database with vcf2db
[2019-10-05T15:05Z] skipping 'MMQ' because it has Number=R
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] bad record:
[2019-10-05T15:05Z] AC 2
[2019-10-05T15:05Z] AF 0.333000004292
[2019-10-05T15:05Z] AN 6
[2019-10-05T15:05Z] BaseQRankSum -1.42299997807
[2019-10-05T15:05Z] ClippingRankSum 0.736999988556
[2019-10-05T15:05Z] DB None
[2019-10-05T15:05Z] DECOMPOSED None
[2019-10-05T15:05Z] DP 49
[2019-10-05T15:05Z] ExcessHet 3.97939991951
[2019-10-05T15:05Z] FS 0.0
[2019-10-05T15:05Z] LEN None
[2019-10-05T15:05Z] MLEAC 2
[2019-10-05T15:05Z] MLEAF 0.333000004292
[2019-10-05T15:05Z] MMQ (40, 40)
[2019-10-05T15:05Z] MQ 37.5800018311
[2019-10-05T15:05Z] MQ0 0
[2019-10-05T15:05Z] MQRankSum -2.37800002098
[2019-10-05T15:05Z] OLD_MULTIALLELIC None
[2019-10-05T15:05Z] OLD_VARIANT None
[2019-10-05T15:05Z] QD 7.92000007629
[2019-10-05T15:05Z] ReadPosRankSum -1.04200005531
[2019-10-05T15:05Z] SOR 0.675000011921
[2019-10-05T15:05Z] TYPE None
[2019-10-05T15:05Z] aa_change None
[2019-10-05T15:05Z] aa_length None
[2019-10-05T15:05Z] aaf 0.333333333333
[2019-10-05T15:05Z] ac 2
[2019-10-05T15:05Z] ac_adj_exac_afr 0.0
[2019-10-05T15:05Z] ac_adj_exac_amr 2.0
[2019-10-05T15:05Z] ac_adj_exac_eas 0.0
[2019-10-05T15:05Z] ac_adj_exac_fin 0.0
[2019-10-05T15:05Z] ac_adj_exac_nfe 0.0
[2019-10-05T15:05Z] ac_adj_exac_oth 0.0
[2019-10-05T15:05Z] ac_adj_exac_sas 0.0
[2019-10-05T15:05Z] ac_exac_all 2.0
[2019-10-05T15:05Z] af 0.333000004292
[2019-10-05T15:05Z] af_adj_exac_afr 0.0
[2019-10-05T15:05Z] af_adj_exac_amr 0.000295420002658
[2019-10-05T15:05Z] af_adj_exac_eas 0.0
[2019-10-05T15:05Z] af_adj_exac_fin 0.0
[2019-10-05T15:05Z] af_adj_exac_nfe 0.0
[2019-10-05T15:05Z] af_adj_exac_oth 0.0
[2019-10-05T15:05Z] af_adj_exac_sas 0.0
[2019-10-05T15:05Z] af_esp_aa -1.0
[2019-10-05T15:05Z] af_esp_all -1.0
[2019-10-05T15:05Z] af_esp_ea -1.0
[2019-10-05T15:05Z] af_exac_all 5.88199982303e-05
[2019-10-05T15:05Z] alt A
[2019-10-05T15:05Z] an 6
[2019-10-05T15:05Z] an_adj_exac_afr 1262.0
[2019-10-05T15:05Z] an_adj_exac_amr 6770.0
[2019-10-05T15:05Z] an_adj_exac_eas 2486.0
[2019-10-05T15:05Z] an_adj_exac_fin 1828.0
[2019-10-05T15:05Z] an_adj_exac_nfe 16896.0
[2019-10-05T15:05Z] an_adj_exac_oth 240.0
[2019-10-05T15:05Z] an_adj_exac_sas 4520.0
[2019-10-05T15:05Z] an_exac_all 34002.0
[2019-10-05T15:05Z] baseqranksum -1.42299997807
[2019-10-05T15:05Z] biotype None
[2019-10-05T15:05Z] call_rate 1.0
[2019-10-05T15:05Z] chrom chr1
[2019-10-05T15:05Z] clinvar_disease_name None
[2019-10-05T15:05Z] clinvar_sig None
[2019-10-05T15:05Z] clippingranksum 0.736999988556
[2019-10-05T15:05Z] codon_change None
[2019-10-05T15:05Z] common_pathogenic False
[2019-10-05T15:05Z] db False
[2019-10-05T15:05Z] decomposed False
[2019-10-05T15:05Z] dp 49
[2019-10-05T15:05Z] ds False
[2019-10-05T15:05Z] effect_severity None
[2019-10-05T15:05Z] end 146984765
[2019-10-05T15:05Z] ensembl_gene_id None
[2019-10-05T15:05Z] excesshet 3.97939991951
[2019-10-05T15:05Z] exon None
[2019-10-05T15:05Z] filter None
[2019-10-05T15:05Z] fs 0.0
[2019-10-05T15:05Z] gene None
[2019-10-05T15:05Z] gnomAD_AC 8
[2019-10-05T15:05Z] gnomAD_AF 0.000136870003189
[2019-10-05T15:05Z] gnomAD_AN (79506, 58450)
[2019-10-05T15:05Z] gnomad_ac 8
[2019-10-05T15:05Z] gnomad_af 0.000136870003189
[2019-10-05T15:05Z] gnomad_af_afr -1.0
[2019-10-05T15:05Z] gnomad_af_amr -1.0
[2019-10-05T15:05Z] gnomad_af_asj -1.0
[2019-10-05T15:05Z] gnomad_af_eas -1.0
[2019-10-05T15:05Z] gnomad_af_fin -1.0
[2019-10-05T15:05Z] gnomad_af_nfe -1.0
[2019-10-05T15:05Z] gnomad_af_oth -1.0
[2019-10-05T15:05Z] gnomad_af_popmax -1.0
[2019-10-05T15:05Z] gnomad_af_sas -1.0
[2019-10-05T15:05Z] gnomad_an (79506, 58450)
[2019-10-05T15:05Z] gt_alt_depths <binary blob>
[2019-10-05T15:05Z] gt_alt_freqs <binary blob>
[2019-10-05T15:05Z] gt_depths <binary blob>
[2019-10-05T15:05Z] gt_phases <binary blob>
[2019-10-05T15:05Z] gt_quals <binary blob>
[2019-10-05T15:05Z] gt_ref_depths <binary blob>
[2019-10-05T15:05Z] gt_types <binary blob>
[2019-10-05T15:05Z] gts <binary blob; readable fragment: G/AG/GG/A>
[2019-10-05T15:05Z] impact None
[2019-10-05T15:05Z] impact_severity None
[2019-10-05T15:05Z] impact_so None
[2019-10-05T15:05Z] is_canonical False
[2019-10-05T15:05Z] is_coding False
[2019-10-05T15:05Z] is_exonic False
[2019-10-05T15:05Z] is_lof False
[2019-10-05T15:05Z] is_splicing False
[2019-10-05T15:05Z] len None
[2019-10-05T15:05Z] max_aaf_all 0.000295420002658
[2019-10-05T15:05Z] mleac 2
[2019-10-05T15:05Z] mleaf 0.333000004292
[2019-10-05T15:05Z] mmq (40, 40)
[2019-10-05T15:05Z] mq 37.5800018311
[2019-10-05T15:05Z] mq0 0
[2019-10-05T15:05Z] mqranksum -2.37800002098
[2019-10-05T15:05Z] num_exac_Het 2.0
[2019-10-05T15:05Z] num_exac_Hom 0.0
[2019-10-05T15:05Z] num_exac_het 2.0
[2019-10-05T15:05Z] num_exac_hom 0.0
[2019-10-05T15:05Z] num_het 2
[2019-10-05T15:05Z] num_hom_alt 0
[2019-10-05T15:05Z] num_hom_ref 1
[2019-10-05T15:05Z] num_unknown 0
[2019-10-05T15:05Z] old_multiallelic None
[2019-10-05T15:05Z] old_variant None
[2019-10-05T15:05Z] polyphen_pred None
[2019-10-05T15:05Z] polyphen_score None
[2019-10-05T15:05Z] qd 7.92000007629
[2019-10-05T15:05Z] qual 285.200012207
[2019-10-05T15:05Z] readposranksum -1.04200005531
[2019-10-05T15:05Z] ref G
[2019-10-05T15:05Z] rs_ids rs1437329596
[2019-10-05T15:05Z] sift_pred None
[2019-10-05T15:05Z] sift_score None
[2019-10-05T15:05Z] so None
[2019-10-05T15:05Z] sor 0.675000011921
[2019-10-05T15:05Z] start 146984764
[2019-10-05T15:05Z] sub_type ts
[2019-10-05T15:05Z] top_consequence None
[2019-10-05T15:05Z] transcript None
[2019-10-05T15:05Z] type snp
[2019-10-05T15:05Z] variant_id 7671
[2019-10-05T15:05Z] vcf_id rs1437329596
[2019-10-05T15:05Z] Traceback (most recent call last):
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T15:05Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T15:05Z]     self.load()
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T15:05Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T15:05Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T15:05Z]     vilengths, variant_impacts)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T15:05Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T15:05Z]     raise e
[2019-10-05T15:05Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[2019-10-05T15:05Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T15:05Z] [parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f95d163f148, size -1, offset 0 at 0x7f95d0b01530>, <read-only buffer for 0x7f95d163f110, size -1, offset 0 at 0x7f95d0b014f0>, <read-only buffer for 0x7f95d163de70, size -1, offset 0 at 0x7f95d0b01130>, <read-only buffer for 0x7f95d163f180, size -1, offset 0 at 0x7f95d0b010f0>, <read-only buffer for 0x7f95d163f1b8, size -1, offset 0 at 0x7f95d0b010b0>, <read-only buffer for 0x7f95d163f1f0, size -1, offset 0 at 0x7f95d0b01070>, <read-only buffer for 0x7f95d163f228, size -1, offset 0 at 0x7f95d0b01030>, <read-only buffer for 0x7f95d163e4b0, size -1, offset 0 at 0x7f95d0b01770>)]
[2019-10-05T15:05Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T15:05Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpv18n_zur/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpv18n_zur/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
' returned non-zero exit status 1.
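
Reading the failing statement above: counting from zero, parameter 100 in the INSERT column list is `gnomad_an`, and its value in the parameter dump is the tuple `(79506, 58450)` rather than a single number. That appears to be the value sqlite3 refuses to bind. The sketch below reproduces only that binding behaviour in isolation; the one-column table and the `try_insert` helper are made up for illustration and are not part of vcf2db.

```python
import sqlite3

# Hypothetical one-column table; only the column name is taken from the log.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (gnomad_an INTEGER)")


def try_insert(value):
    """Return None if the value binds, or the sqlite3 error raised otherwise."""
    try:
        conn.execute("INSERT INTO variants (gnomad_an) VALUES (?)", (value,))
        return None
    except sqlite3.Error as exc:
        return exc


scalar_error = try_insert(79506)          # a plain int is a supported type
tuple_error = try_insert((79506, 58450))  # a tuple is not a supported type

print("scalar:", scalar_error)
print("tuple:", type(tuple_error).__name__, tuple_error)
```

The exact exception class and message wording differ between Python versions, so the sketch catches the common base class `sqlite3.Error`.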
roryk commented 5 years ago

Thanks, sorry for all the back and forth. Could you send me a snippet of the VCF file that has this error so I can take a look? It looks like you might be able to subset it down to just rs1437329596 to make the error reproducible. If this wasn't the first variant in the file, it would also help to include a few variants that don't fail.

/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/test.db

is the command that is failing. If you replace

 /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz

with your snippet, we should be able to see if the snippet fails, which would work as a test case.
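
One way to cut such a snippet is sketched below, using only the Python standard library. It copies the VCF header, the record whose ID column matches, and a few neighbouring records that presumably load fine. The function name `subset_vcf`, its `context` parameter, and the output being an uncompressed `.vcf` are all choices made for this illustration, not part of bcbio or vcf2db.

```python
import gzip
from collections import deque


def subset_vcf(in_path, out_path, target_id, context=3):
    """Write the header, the record with ID == target_id, and up to `context`
    records on either side of it. Returns the number of records written."""
    before = deque(maxlen=context)  # rolling window of records before the target
    after_left = 0                  # records still to copy after the target
    found = False
    written = 0
    with gzip.open(in_path, "rt") as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep every header line
                continue
            if not found:
                fields = line.split("\t", 3)
                # The ID column may hold several ids separated by ';'.
                if len(fields) > 2 and target_id in fields[2].split(";"):
                    found = True
                    after_left = context
                    for prev in before:
                        dst.write(prev)
                    dst.write(line)
                    written += len(before) + 1
                else:
                    before.append(line)
            elif after_left > 0:
                dst.write(line)
                written += 1
                after_left -= 1
            else:
                break  # everything requested has been copied
    return written
```

A call would look like `subset_vcf("annotated-gemini.vcf.gz", "snippet.vcf", "rs1437329596")`, where both file names are placeholders. The result is plain text, so it would need to be compressed and indexed again if the downstream tool expects a bgzipped, tabix-indexed file.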

kokyriakidis commented 5 years ago

First I will try running it again with no vcfanno file or PED file, but with sample names and descriptions that do not start with a number, and I will update you.

roryk commented 5 years ago

Ok!

kokyriakidis commented 5 years ago

I got the same errors again when running with NO VCFANNO AND NO PED FILE AND SAMPLE AND DESCRIPTION NAMES NOT STARTING WITH NUMBERS:

[2019-10-05T20:07Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-05T20:07Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-05T20:07Z] =============================================
[2019-10-05T20:07Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T20:07Z] see: https://github.com/brentp/vcfanno
[2019-10-05T20:07Z] =============================================
[2019-10-05T20:07Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-05T20:07Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:11Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T20:11Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T20:11Z] vcfanno.go:248: annotated 148115 variants in 275.74 seconds (537.2 / second)
[2019-10-05T20:11Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-05T20:11Z] GEMINI: create database with vcf2db
[2019-10-05T20:11Z] skipping 'MMQ' because it has Number=R
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] bad record:
[2019-10-05T20:11Z] AC 2
[2019-10-05T20:11Z] AF 0.333000004292
[2019-10-05T20:11Z] AN 6
[2019-10-05T20:11Z] BaseQRankSum -1.42299997807
[2019-10-05T20:11Z] ClippingRankSum 0.736999988556
[2019-10-05T20:11Z] DB None
[2019-10-05T20:11Z] DECOMPOSED None
[2019-10-05T20:11Z] DP 49
[2019-10-05T20:11Z] ExcessHet 3.97939991951
[2019-10-05T20:11Z] FS 0.0
[2019-10-05T20:11Z] LEN None
[2019-10-05T20:11Z] MLEAC 2
[2019-10-05T20:11Z] MLEAF 0.333000004292
[2019-10-05T20:11Z] MMQ (40, 40)
[2019-10-05T20:11Z] MQ 37.5800018311
[2019-10-05T20:11Z] MQ0 0
[2019-10-05T20:11Z] MQRankSum -2.37800002098
[2019-10-05T20:11Z] OLD_MULTIALLELIC None
[2019-10-05T20:11Z] OLD_VARIANT None
[2019-10-05T20:11Z] QD 7.92000007629
[2019-10-05T20:11Z] ReadPosRankSum -1.04200005531
[2019-10-05T20:11Z] SOR 0.675000011921
[2019-10-05T20:11Z] TYPE None
[2019-10-05T20:11Z] aa_change None
[2019-10-05T20:11Z] aa_length None
[2019-10-05T20:11Z] aaf 0.333333333333
[2019-10-05T20:11Z] ac 2
[2019-10-05T20:11Z] ac_adj_exac_afr 0.0
[2019-10-05T20:11Z] ac_adj_exac_amr 2.0
[2019-10-05T20:11Z] ac_adj_exac_eas 0.0
[2019-10-05T20:11Z] ac_adj_exac_fin 0.0
[2019-10-05T20:11Z] ac_adj_exac_nfe 0.0
[2019-10-05T20:11Z] ac_adj_exac_oth 0.0
[2019-10-05T20:11Z] ac_adj_exac_sas 0.0
[2019-10-05T20:11Z] ac_exac_all 2.0
[2019-10-05T20:11Z] af 0.333000004292
[2019-10-05T20:11Z] af_adj_exac_afr 0.0
[2019-10-05T20:11Z] af_adj_exac_amr 0.000295420002658
[2019-10-05T20:11Z] af_adj_exac_eas 0.0
[2019-10-05T20:11Z] af_adj_exac_fin 0.0
[2019-10-05T20:11Z] af_adj_exac_nfe 0.0
[2019-10-05T20:11Z] af_adj_exac_oth 0.0
[2019-10-05T20:11Z] af_adj_exac_sas 0.0
[2019-10-05T20:11Z] af_esp_aa -1.0
[2019-10-05T20:11Z] af_esp_all -1.0
[2019-10-05T20:11Z] af_esp_ea -1.0
[2019-10-05T20:11Z] af_exac_all 5.88199982303e-05
[2019-10-05T20:11Z] alt A
[2019-10-05T20:11Z] an 6
[2019-10-05T20:11Z] an_adj_exac_afr 1262.0
[2019-10-05T20:11Z] an_adj_exac_amr 6770.0
[2019-10-05T20:11Z] an_adj_exac_eas 2486.0
[2019-10-05T20:11Z] an_adj_exac_fin 1828.0
[2019-10-05T20:11Z] an_adj_exac_nfe 16896.0
[2019-10-05T20:11Z] an_adj_exac_oth 240.0
[2019-10-05T20:11Z] an_adj_exac_sas 4520.0
[2019-10-05T20:11Z] an_exac_all 34002.0
[2019-10-05T20:11Z] baseqranksum -1.42299997807
[2019-10-05T20:11Z] biotype None
[2019-10-05T20:11Z] call_rate 1.0
[2019-10-05T20:11Z] chrom chr1
[2019-10-05T20:11Z] clinvar_disease_name None
[2019-10-05T20:11Z] clinvar_sig None
[2019-10-05T20:11Z] clippingranksum 0.736999988556
[2019-10-05T20:11Z] codon_change None
[2019-10-05T20:11Z] common_pathogenic False
[2019-10-05T20:11Z] db False
[2019-10-05T20:11Z] decomposed False
[2019-10-05T20:11Z] dp 49
[2019-10-05T20:11Z] ds False
[2019-10-05T20:11Z] effect_severity None
[2019-10-05T20:11Z] end 146984765
[2019-10-05T20:11Z] ensembl_gene_id None
[2019-10-05T20:11Z] excesshet 3.97939991951
[2019-10-05T20:11Z] exon None
[2019-10-05T20:11Z] filter None
[2019-10-05T20:11Z] fs 0.0
[2019-10-05T20:11Z] gene None
[2019-10-05T20:11Z] gnomAD_AC 8
[2019-10-05T20:11Z] gnomAD_AF 0.000136870003189
[2019-10-05T20:11Z] gnomAD_AN (79506, 58450)
[2019-10-05T20:11Z] gnomad_ac 8
[2019-10-05T20:11Z] gnomad_af 0.000136870003189
[2019-10-05T20:11Z] gnomad_af_afr -1.0
[2019-10-05T20:11Z] gnomad_af_amr -1.0
[2019-10-05T20:11Z] gnomad_af_asj -1.0
[2019-10-05T20:11Z] gnomad_af_eas -1.0
[2019-10-05T20:11Z] gnomad_af_fin -1.0
[2019-10-05T20:11Z] gnomad_af_nfe -1.0
[2019-10-05T20:11Z] gnomad_af_oth -1.0
[2019-10-05T20:11Z] gnomad_af_popmax -1.0
[2019-10-05T20:11Z] gnomad_af_sas -1.0
[2019-10-05T20:11Z] gnomad_an (79506, 58450)
[2019-10-05T20:11Z] gt_alt_depths <binary blob>
[2019-10-05T20:11Z] gt_alt_freqs <binary blob>
[2019-10-05T20:11Z] gt_depths <binary blob>
[2019-10-05T20:11Z] gt_phases <binary blob>
[2019-10-05T20:11Z] gt_quals <binary blob>
[2019-10-05T20:11Z] gt_ref_depths <binary blob>
[2019-10-05T20:11Z] gt_types <binary blob>
[2019-10-05T20:11Z] gts <binary blob>
[2019-10-05T20:11Z] impact None
[2019-10-05T20:11Z] impact_severity None
[2019-10-05T20:11Z] impact_so None
[2019-10-05T20:11Z] is_canonical False
[2019-10-05T20:11Z] is_coding False
[2019-10-05T20:11Z] is_exonic False
[2019-10-05T20:11Z] is_lof False
[2019-10-05T20:11Z] is_splicing False
[2019-10-05T20:11Z] len None
[2019-10-05T20:11Z] max_aaf_all 0.000295420002658
[2019-10-05T20:11Z] mleac 2
[2019-10-05T20:11Z] mleaf 0.333000004292
[2019-10-05T20:11Z] mmq (40, 40)
[2019-10-05T20:11Z] mq 37.5800018311
[2019-10-05T20:11Z] mq0 0
[2019-10-05T20:11Z] mqranksum -2.37800002098
[2019-10-05T20:11Z] num_exac_Het 2.0
[2019-10-05T20:11Z] num_exac_Hom 0.0
[2019-10-05T20:11Z] num_exac_het 2.0
[2019-10-05T20:11Z] num_exac_hom 0.0
[2019-10-05T20:11Z] num_het 2
[2019-10-05T20:11Z] num_hom_alt 0
[2019-10-05T20:11Z] num_hom_ref 1
[2019-10-05T20:11Z] num_unknown 0
[2019-10-05T20:11Z] old_multiallelic None
[2019-10-05T20:11Z] old_variant None
[2019-10-05T20:11Z] polyphen_pred None
[2019-10-05T20:11Z] polyphen_score None
[2019-10-05T20:11Z] qd 7.92000007629
[2019-10-05T20:11Z] qual 285.200012207
[2019-10-05T20:11Z] readposranksum -1.04200005531
[2019-10-05T20:11Z] ref G
[2019-10-05T20:11Z] rs_ids rs1437329596
[2019-10-05T20:11Z] sift_pred None
[2019-10-05T20:11Z] sift_score None
[2019-10-05T20:11Z] so None
[2019-10-05T20:11Z] sor 0.675000011921
[2019-10-05T20:11Z] start 146984764
[2019-10-05T20:11Z] sub_type ts
[2019-10-05T20:11Z] top_consequence None
[2019-10-05T20:11Z] transcript None
[2019-10-05T20:11Z] type snp
[2019-10-05T20:11Z] variant_id 7671
[2019-10-05T20:11Z] vcf_id rs1437329596
[2019-10-05T20:11Z] Traceback (most recent call last):
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T20:11Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T20:11Z]     self.load()
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T20:11Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T20:11Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T20:11Z]     vilengths, variant_impacts)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T20:11Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T20:11Z]     raise e
[2019-10-05T20:11Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[2019-10-05T20:11Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T20:11Z] [parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f452b0f9148, size -1, offset 0 at 0x7f452a506270>, <read-only buffer for 0x7f452b0f9110, size -1, offset 0 at 0x7f452a506230>, <read-only buffer for 0x7f452b0f7ea0, size -1, offset 0 at 0x7f452a5061f0>, <read-only buffer for 0x7f452b0f9180, size -1, offset 0 at 0x7f452a5061b0>, <read-only buffer for 0x7f452b0f91b8, size -1, offset 0 at 0x7f452a506170>, <read-only buffer for 0x7f452b0f91f0, size -1, offset 0 at 0x7f452a506130>, <read-only buffer for 0x7f452b0f9228, size -1, offset 0 at 0x7f452a5060f0>, <read-only buffer for 0x7f452b0f84b0, size -1, offset 0 at 0x7f452a5060b0>)]
[2019-10-05T20:11Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T20:11Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpna8xq85x/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpna8xq85x/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
codon_change None
common_pathogenic False
db False
decomposed False
dp 49
ds False
effect_severity None
end 146984765
ensembl_gene_id None
excesshet 3.97939991951
exon None
filter None
fs 0.0
gene None
gnomAD_AC 8
gnomAD_AF 0.000136870003189
gnomAD_AN (79506, 58450)
gnomad_ac 8
gnomad_af 0.000136870003189
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gnomad_an (79506, 58450)
gt_alt_depths <binary data>
gt_alt_freqs <binary data>
gt_depths <binary data>
gt_phases <binary data>
gt_quals <binary data>
gt_ref_depths <binary data>
gt_types <binary data>
gts <binary data>
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all 0.000295420002658
mleac 2
mleaf 0.333000004292
mmq (40, 40)
mq 37.5800018311
mq0 0
mqranksum -2.37800002098
num_exac_Het 2.0
num_exac_Hom 0.0
num_exac_het 2.0
num_exac_hom 0.0
num_het 2
num_hom_alt 0
num_hom_ref 1
num_unknown 0
old_multiallelic None
old_variant None
polyphen_pred None
polyphen_score None
qd 7.92000007629
qual 285.200012207
readposranksum -1.04200005531
ref G
rs_ids rs1437329596
sift_pred None
sift_score None
so None
sor 0.675000011921
start 146984764
sub_type ts
top_consequence None
transcript None
type snp
variant_id 7671
vcf_id rs1437329596
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f452b0f9148, size -1, offset 0 at 0x7f452a506270>, <read-only buffer for 0x7f452b0f9110, size -1, offset 0 at 0x7f452a506230>, <read-only buffer for 0x7f452b0f7ea0, size -1, offset 0 at 0x7f452a5061f0>, <read-only buffer for 0x7f452b0f9180, size -1, offset 0 at 0x7f452a5061b0>, <read-only buffer for 0x7f452b0f91b8, size -1, offset 0 at 0x7f452a506170>, <read-only buffer for 0x7f452b0f91f0, size -1, offset 0 at 0x7f452a506130>, <read-only buffer for 0x7f452b0f9228, size -1, offset 0 at 0x7f452a5060f0>, <read-only buffer for 0x7f452b0f84b0, size -1, offset 0 at 0x7f452a5060b0>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.

I am very frustrated. Neither the PED nor the vcfanno file causes the problem. Do you want me to upload the FASTQ files so you can have a look?

kokyriakidis commented 5 years ago

I have uploaded the whole project folder after the failed run so you can have a look!

https://drive.google.com/drive/folders/1BStwCZFMcFL7-6AruM723mbpSWH0XozS?usp=sharing

naumenko-sa commented 5 years ago

Hi @kokyriakidis and everyone!

I missed a huge chunk of this discussion :)

Also, maybe using batch: 178F is safer than the huge string you are using.

SN

kokyriakidis commented 5 years ago

Hi Sergey!

I would like to ask you about the custom VCF annotation file in cre and the warnings it throws when I run bcbio with it. Did something change in the annotation fields? Can you suggest a fix so that it works?

naumenko-sa commented 5 years ago

Sure, let us first finish with the failure.

The last command that fails is:

vcf2db.py \
_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz \
_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped \
_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db

It fails for me as well, complaining about parameter 100 in the SQL query for variant rs1437329596. It is indeed a bad one: gnomAD_AN=79506,58450

It should be just one value. Now let us figure out where it came from. In your config you don't have vcfanno, so does it come from VEP? But you don't have a CSQ field, so it is not VEP. Probably you mixed several bcbio runs with different YAML settings, and this variant comes from gnomad_exome.
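To illustrate why a two-value gnomAD_AN breaks the insert, here is a minimal sketch using only Python's standard `sqlite3` module. The four-column table and the helper names `unsupported_columns` and `try_insert` are hypothetical stand-ins, not vcf2db.py code. Counting from zero in the column list of the failing INSERT, parameter 100 appears to be `gnomad_an`, which matches the tuple `(79506, 58450)` in the logged parameters.

```python
import sqlite3

# Hypothetical, shortened stand-in for the `variants` table in the failing INSERT.
columns = ["variant_id", "chrom", "gnomad_af", "gnomad_an"]
# gnomad_an is a tuple here, as in the logged parameters.
row = (7671, "chr1", 0.00013687, (79506, 58450))

# Types the sqlite3 module can bind without a custom adapter.
SUPPORTED = (type(None), int, float, str, bytes, bytearray, memoryview)


def unsupported_columns(columns, row):
    """Return (0-based index, column name, value) for values sqlite3 cannot bind."""
    return [(i, name, value)
            for i, (name, value) in enumerate(zip(columns, row))
            if not isinstance(value, SUPPORTED)]


def try_insert(row):
    """Attempt the insert on an in-memory database; return the error, or None on success."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE variants (%s)" % ", ".join(columns))
    placeholders = ", ".join("?" for _ in columns)
    try:
        conn.execute("INSERT INTO variants VALUES (%s)" % placeholders, row)
    except sqlite3.Error as exc:
        # The exact subclass (InterfaceError or ProgrammingError) depends on the Python version.
        return exc
    finally:
        conn.close()
    return None


print(unsupported_columns(columns, row))
print(try_insert(row))
```

The same row with a single integer for `gnomad_an` inserts without error, which is why removing the duplicated annotation source should be enough.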

I've checked this variant in gnomad exome:

chr1    146984765   G   T   7794.38 PASS
chr1    146984765   G   A   7794.38 PASS
chr1    146984765   G   A   8470.69 PASS

They report two records with the same coordinate and alleles (G/A), and your sample happened to hit this variant.

So it is not a vcf2db.py or bcbio issue; it is a gnomad_exome issue.
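A quick way to look for such duplicated records is to count each (chrom, pos, ref, alt) combination. The sketch below is only an illustration: the `vcf_lines` excerpt mirrors the three records quoted above, with the ID and INFO columns filled with `.` as an assumption, and `duplicated_sites` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical excerpt in VCF column order; ID and INFO are placeholders ('.').
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t146984765\t.\tG\tT\t7794.38\tPASS\t.",
    "chr1\t146984765\t.\tG\tA\t7794.38\tPASS\t.",
    "chr1\t146984765\t.\tG\tA\t8470.69\tPASS\t.",
]


def duplicated_sites(lines):
    """Return {(chrom, pos, ref, alt): count} for records seen more than once."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        counts[(chrom, int(pos), ref, alt)] += 1
    return {key: n for key, n in counts.items() if n > 1}


print(duplicated_sites(vcf_lines))
```

For a real file, the same function can be fed `gzip.open(path, "rt")`, since bgzip-compressed VCFs are readable as gzip; for a file the size of gnomAD exomes this keeps every site in memory, so it is a check rather than a production tool.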

@kokyriakidis, could you please:

  1. delete the work directory of your bcbio project;
  2. rerun bcbio without the vcfanno annotation in the config; it should finish fine and confirm that it was a gnomad exome issue;
  3. process your gnomad exome VCF file (in hg38/variation) with bcftools norm -d to delete duplicated records;
  4. rerun your project with the vcfanno annotations.

If bcftools norm helps to resolve this, we can include it in the gnomad processing script for the bcbio installation.

I emailed gnomad team as well.

Sergey

kokyriakidis commented 5 years ago

@naumenko-sa

1) Deleted.

2) Reran without the vcfanno file and still get these errors:

[2019-10-06T10:22Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-06T10:22Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-06T10:22Z] =============================================
[2019-10-06T10:22Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-06T10:22Z] see: https://github.com/brentp/vcfanno
[2019-10-06T10:22Z] =============================================
[2019-10-06T10:22Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-06T10:22Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-06T10:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-06T10:26Z] vcfanno.go:248: annotated 148115 variants in 275.46 seconds (537.7 / second)
[2019-10-06T10:27Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-06T10:27Z] GEMINI: create database with vcf2db
[2019-10-06T10:27Z] skipping 'MMQ' because it has Number=R
[2019-10-06T10:27Z] Ethnicity None
[2019-10-06T10:27Z] Traceback (most recent call last):
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-06T10:27Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-06T10:27Z]     self.samples = self.create_samples()
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-06T10:27Z]     vals = [r[i] for r in rows]
[2019-10-06T10:27Z] IndexError: list index out of range
[2019-10-06T10:27Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp71dgss81/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp71dgss81/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.

3) I tried norm but I get

bcftools norm -d '/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' 

The argument to -d not recognised: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz

I had to specify

bcftools norm -d all <vcf_file.gz>

When the process ended, the last line of output was:

Lines   total/split/realigned/skipped:  14961513/0/0/0

Is this normal?
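The summary line quoted above lists total/split/realigned/skipped counts and, as shown, does not say how many duplicate records were dropped. Also, as far as I know, `bcftools norm` writes to standard output unless `-o` is given, so the command as quoted would not change the file in place. One way to check whether deduplication had any effect is to compare record counts before and after; the sketch below assumes two compressed VCFs on disk and uses made-up records and hypothetical helper names for the demo.

```python
import gzip
import os
import tempfile


def count_records(path):
    """Count non-header lines in a gzip- or bgzip-compressed VCF."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if not line.startswith("#"))


def removed_by_dedup(before_path, after_path):
    """Number of records present in the original file but absent after deduplication."""
    return count_records(before_path) - count_records(after_path)


# Tiny self-contained demo with made-up records; real use would pass the
# gnomad_exome.vcf.gz paths from before and after `bcftools norm -d all`.
tmpdir = tempfile.mkdtemp()
before = os.path.join(tmpdir, "before.vcf.gz")
after = os.path.join(tmpdir, "after.vcf.gz")
header = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
record = "chr1\t146984765\t.\tG\tA\t7794.38\tPASS\t.\n"
with gzip.open(before, "wt") as handle:
    handle.write(header + record + record)
with gzip.open(after, "wt") as handle:
    handle.write(header + record)

removed = removed_by_dedup(before, after)
print(removed)
```

A result of zero on the real files would suggest the annotation source used by vcfanno still contains the duplicated record.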

4) Reran, but with no luck:

[2019-10-06T12:21Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-06T12:21Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-06T12:21Z] =============================================
[2019-10-06T12:21Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-06T12:21Z] see: https://github.com/brentp/vcfanno
[2019-10-06T12:21Z] =============================================
[2019-10-06T12:21Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-06T12:21Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-06T12:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-06T12:26Z] vcfanno.go:248: annotated 148115 variants in 275.68 seconds (537.3 / second)
[2019-10-06T12:26Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-06T12:26Z] GEMINI: create database with vcf2db
[2019-10-06T12:26Z] skipping 'MMQ' because it has Number=R
[2019-10-06T12:26Z] Ethnicity None
[2019-10-06T12:26Z] Traceback (most recent call last):
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-06T12:26Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-06T12:26Z]     self.samples = self.create_samples()
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-06T12:26Z]     vals = [r[i] for r in rows]
[2019-10-06T12:26Z] IndexError: list index out of range
[2019-10-06T12:26Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpopsnbp30/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpopsnbp30/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
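
The IndexError above is raised while vcf2db indexes into rows inside create_samples, so one thing worth ruling out is a PED file whose rows do not all have the same number of fields. The following is only a minimal sketch under that assumption, not a confirmed diagnosis: the function name find_ragged_ped_rows and the example PED content are hypothetical, and the file is assumed to be whitespace-delimited.

```python
def find_ragged_ped_rows(ped_lines):
    """Return (line_number, field_count) for rows whose field count
    differs from the first non-empty line (assumed to be the header)."""
    # Assumption: fields are whitespace-delimited and no field is empty.
    rows = [line.split() for line in ped_lines if line.strip()]
    if not rows:
        return []
    expected = len(rows[0])
    return [
        (number, len(row))
        for number, row in enumerate(rows, start=1)
        if len(row) != expected
    ]


# Hypothetical PED content: the last row lacks the final column.
example_ped = [
    "#family_id\tname\tpaternal_id\tmaternal_id\tsex\tphenotype\tethnicity",
    "F1\tchild\tfather\tmother\t1\t2\t-9",
    "F1\tfather\t0\t0\t1\t1",
]

ragged = find_ragged_ped_rows(example_ped)
for number, count in ragged:
    print(f"row {number} has {count} fields")
```

To check a real file, the lines can be read with open(path) and passed to the same function; an empty result would suggest looking elsewhere for the cause.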
kokyriakidis commented 5 years ago

@naumenko-sa Do we have any news?

naumenko-sa commented 5 years ago

Hi @kokyriakidis !

  1. You have removed vcfanno from the config, but the error is still there. I think it is because you have tools_on: gemini. That causes vcfanno to annotate with genomes/Hsapiens/hg38/config/vcfanno/gemini.conf, which includes the same variation/gnomad_exome.vcf.gz file. Could you please try removing tools_on: gemini to confirm that the failing step is indeed vcfanno?

  2. I've removed duplicates from variation/gnomad_exome.vcf.gz. Before removing duplicates, there were 3 records at chr1:146984765:

    tabix gnomad_exome.vcf.gz chr1:146984765-146984765
    chr1    146984765   3   G   T   7794.38
    chr1    146984765   3   G   A   7794.38
    chr1    146984765   3   G   A   8470.69

    after removing duplicates with

    bcftools norm -dall -Oz gnomad_exome.vcf.gz > gnomad_exome.no_duplicates.vcf.gz
    tabix gnomad_exome.no_duplicates.vcf.gz

    the duplicate record is gone, but the allele A is gone as well

    #CHROM  POS ID  REF ALT QUAL    FILTER
    chr1    146984765   rs781933389 G   T   7794.38 PASS

    I've submitted a bug to https://github.com/samtools/bcftools/issues/1089

  3. I used a rival tool, vt, to remove duplicates: https://github.com/naumenko-sa/bioscripts/blob/master/variation/vcf.remove_duplicates.sh

    This time the allele A is kept:

    chr1    146984765   3   G   A   8470.69
    chr1    146984765   3   G   T   7794.38

  4. I've replaced gnomad_exome.vcf.gz with gnomad_exome.no_duplicates.vcf.gz in the bcbio installation:

    cd genomes/Hsapiens/hg38/variation
    mv gnomad_exome.vcf.gz gnomad_exome.with_duplicates.vcf.gz
    ln -s gnomad_exome.no_duplicates.vcf.gz gnomad_exome.vcf.gz

  5. I am running your WES trio project as a test.

  6. I've submitted a fixed gnomad_exome recipe to remove duplicates in cloudbiolinux: https://github.com/chapmanb/cloudbiolinux/pull/325
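
To make the expected behaviour of the duplicate removal concrete, here is a minimal Python sketch. It is not the logic of vt or bcftools; it only shows the intended outcome for the three records at chr1:146984765, where the repeated G>A record is dropped but both alleles survive. The function name drop_exact_duplicates is hypothetical, records are assumed to be tab-separated, and the sketch keeps the first record for each key, which may differ from the record a real tool retains.

```python
def drop_exact_duplicates(vcf_lines):
    """Keep the first record seen for each (CHROM, POS, REF, ALT) key.

    Header lines (starting with '#') are passed through unchanged.
    """
    seen = set()
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # VCF columns: CHROM, POS, ID, REF, ALT, ...
        key = (fields[0], fields[1], fields[3], fields[4])
        if key in seen:
            continue  # same site and same allele: a true duplicate
        seen.add(key)
        kept.append(line)
    return kept


# The three records reported at chr1:146984765 (first six columns only).
records = [
    "chr1\t146984765\t3\tG\tT\t7794.38",
    "chr1\t146984765\t3\tG\tA\t7794.38",
    "chr1\t146984765\t3\tG\tA\t8470.69",
]

deduplicated = drop_exact_duplicates(records)
for record in deduplicated:
    print(record)
```

Keying on REF and ALT as well as the position is the point of the sketch: a filter keyed on position alone would collapse G>T and G>A into one record, which is the loss of allele A described in item 2.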

SN

kokyriakidis commented 5 years ago

Thanks so much for all your help Sergey!

So, to get it to work, do I just update bcbio after the cloudbiolinux PR is merged, or do I have to repeat the steps you did?

naumenko-sa commented 5 years ago

Yes, you'd need to run bcbio_nextgen.py upgrade -u skip --data, and let me know if you have any issues during installation or with your trio. I have not finished a trio run on my side (it is still running), so there might still be issues. SN

kokyriakidis commented 5 years ago

Ok! I am waiting for your confirmation! Thanks again for helping and testing!

kokyriakidis commented 5 years ago

It seems that I get this error now when running with GEMINI:

[2019-10-12T08:23Z] Timing: joint squaring off/backfilling
[2019-10-12T08:23Z] Timing: variant post-processing
[2019-10-12T08:23Z] multiprocessing: postprocess_variants
[2019-10-12T08:23Z] Finalizing variant calls: F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Calculating variation effects for F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Annotate VCF file: F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Annotate with dbSNP
[2019-10-12T08:23Z] =============================================
[2019-10-12T08:23Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-12T08:23Z] see: https://github.com/brentp/vcfanno
[2019-10-12T08:23Z] =============================================
[2019-10-12T08:23Z] vcfanno.go:115: found 1 sources from 1 files
[2019-10-12T08:23Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-12T08:23Z] vcfanno.go:194: Info Error: rs_ids not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:248: annotated 158712 variants in 275.09 seconds (576.9 / second)
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated.vcf.gz
[2019-10-12T08:28Z] Filtering for F178CHILD, gatk-haplotype
[2019-10-12T08:28Z] Removing variants with missing alts from /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated.vcf.gz.
[2019-10-12T08:28Z] bgzip _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf.gz
[2019-10-12T08:28Z] Cutoff-based soft filtering /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf.gz with TYPE="snp" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || (QD < 10.0 && AD[0:1] / (AD[0:1] + AD[0:0]) < 0.25 && ReadPosRankSum < 0.0) || MQ < 30.0) : F178CHILD
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP.vcf.gz
[2019-10-12T08:28Z] Cutoff-based soft filtering /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP.vcf.gz with TYPE="indel" && (ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 || (QD < 10.0 && AD[0:1] / (AD[0:1] + AD[0:0]) < 0.25 && ReadPosRankSum < 0.0)) : F178CHILD
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP-filterINDEL.vcf.gz
[2019-10-12T08:28Z] Prioritization for F178CHILD, gatk-haplotype
[2019-10-12T08:28Z] multiprocessing: split_variants_by_sample
[2019-10-12T08:28Z] Timing: prepped BAM merging
[2019-10-12T08:28Z] Timing: validation
[2019-10-12T08:28Z] multiprocessing: compare_to_rm
[2019-10-12T08:28Z] Timing: ensemble calling
[2019-10-12T08:28Z] Timing: validation summary
[2019-10-12T08:28Z] Timing: structural variation
[2019-10-12T08:28Z] Timing: structural variation
[2019-10-12T08:28Z] Timing: structural variation ensemble
[2019-10-12T08:28Z] Timing: structural variation validation
[2019-10-12T08:28Z] multiprocessing: validate_sv
[2019-10-12T08:28Z] Timing: heterogeneity
[2019-10-12T08:28Z] Timing: population database
[2019-10-12T08:28Z] multiprocessing: prep_gemini_db
[2019-10-12T08:28Z] Multi-allelic to single allele
[2019-10-12T08:28Z] normalize v0.5
[2019-10-12T08:28Z] options:     input VCF file                                  -
[2019-10-12T08:28Z]          [o] output VCF file                                 -
[2019-10-12T08:28Z]          [w] sorting window size                             10000
[2019-10-12T08:28Z]          [n] no fail on reference inconsistency for non SNPs true
[2019-10-12T08:28Z]          [q] quiet                                           false
[2019-10-12T08:28Z]          [d] debug                                           false
[2019-10-12T08:28Z]          [r] reference FASTA file                            /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa
[2019-10-12T08:28Z] decompose v0.5
[2019-10-12T08:28Z] options:     input VCF file        -
[2019-10-12T08:28Z]          [s] smart decomposition   true (experimental)
[2019-10-12T08:28Z]          [o] output VCF file       -
[2019-10-12T08:28Z] stats: no. variants                 : 147535
[2019-10-12T08:28Z]        no. biallelic variants       : 146972
[2019-10-12T08:28Z]        no. multiallelic variants    : 563
[2019-10-12T08:28Z]        no. additional biallelics    : 580
[2019-10-12T08:28Z]        total no. of biallelics      : 148115
[2019-10-12T08:28Z] Time elapsed: 3.03s
[2019-10-12T08:28Z] stats: biallelic
[2019-10-12T08:28Z]           no. left trimmed                      : 0
[2019-10-12T08:28Z]           no. right trimmed                     : 218
[2019-10-12T08:28Z]           no. left and right trimmed            : 0
[2019-10-12T08:28Z]           no. right trimmed and left aligned    : 0
[2019-10-12T08:28Z]           no. left aligned                      : 2
[2019-10-12T08:28Z]        total no. biallelic normalized           : 220
[2019-10-12T08:28Z]        multiallelic
[2019-10-12T08:28Z]           no. left trimmed                      : 0
[2019-10-12T08:28Z]           no. right trimmed                     : 0
[2019-10-12T08:28Z]           no. left and right trimmed            : 0
[2019-10-12T08:28Z]           no. right trimmed and left aligned    : 0
[2019-10-12T08:28Z]           no. left aligned                      : 0
[2019-10-12T08:28Z]        total no. multiallelic normalized        : 0
[2019-10-12T08:28Z]        total no. variants normalized            : 220
[2019-10-12T08:28Z]        total no. variants observed              : 148115
[2019-10-12T08:28Z]        total no. reference observed             : 0
[2019-10-12T08:28Z] Time elapsed: 3.25s
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-12T08:28Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-12T08:28Z] =============================================
[2019-10-12T08:28Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-12T08:28Z] see: https://github.com/brentp/vcfanno
[2019-10-12T08:28Z] =============================================
[2019-10-12T08:28Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-12T08:28Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:32Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-12T08:32Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-12T08:32Z] vcfanno.go:248: annotated 148115 variants in 277.93 seconds (532.9 / second)
[2019-10-12T08:32Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-12T08:32Z] GEMINI: create database with vcf2db
[2019-10-12T08:32Z] skipping 'MMQ' because it has Number=R
[2019-10-12T08:32Z] Ethnicity None
[2019-10-12T08:32Z] Traceback (most recent call last):
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-12T08:32Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-12T08:32Z]     self.samples = self.create_samples()
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-12T08:32Z]     vals = [r[i] for r in rows]
[2019-10-12T08:32Z] IndexError: list index out of range
[2019-10-12T08:32Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp6wl69_05/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp6wl69_05/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.

Without GEMINI, everything worked fine.

kokyriakidis commented 5 years ago

Hi Sergey and the rest!

Will sticking to gnomad 2.1 help with this issue?

roryk commented 5 years ago

This is still due to a PED file problem. Could you send along the PED file you are using?

kokyriakidis commented 5 years ago

178F_PED.zip

roryk commented 5 years ago

Thanks, could you also pass along the YAML file you are using?

kokyriakidis commented 5 years ago

F178.zip

roryk commented 5 years ago

Thanks, how about this file? /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped

kokyriakidis commented 5 years ago

_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.zip

kokyriakidis commented 5 years ago

@roryk The PED file is fine, right?

roryk commented 5 years ago

Hi @kokyriakidis,

Yes, it looks fine to me. Could you also send along the VCF file or a snippet from it so I can test locally?

kokyriakidis commented 5 years ago

The whole project is here:

https://drive.google.com/drive/folders/1BStwCZFMcFL7-6AruM723mbpSWH0XozS?usp=sharing

roryk commented 5 years ago

Thanks, the PED file and the VCF file don't have the same names. The PED file has the names as F178CHILD but the VCF file has them named F178_CHILD.
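
A quick way to catch this kind of mismatch before running vcf2db is to compare the sample names in the two files. The sketch below is only an illustration, not part of bcbio or vcf2db: the function names are hypothetical, and it assumes the sample name is the second whitespace-separated column of the PED file and that header lines start with `#`.

```python
import gzip


def ped_sample_ids(ped_path):
    """Return sample names from a PED file (assumed: second column, '#' marks header lines)."""
    ids = []
    with open(ped_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            ids.append(line.split()[1])
    return ids


def vcf_sample_ids(vcf_path):
    """Return sample names from the #CHROM header line of a plain or gzip/bgzip VCF."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields; samples follow.
                return line.rstrip("\n").split("\t")[9:]
    return []


def name_mismatches(ped_path, vcf_path):
    """Return (names only in the PED, names only in the VCF), each sorted."""
    ped = set(ped_sample_ids(ped_path))
    vcf = set(vcf_sample_ids(vcf_path))
    return sorted(ped - vcf), sorted(vcf - ped)
```

Calling `name_mismatches(ped_path, vcf_path)` on the files from the failing run would list `F178CHILD` on the PED side and `F178_CHILD` on the VCF side if the names differ as described above.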

kokyriakidis commented 5 years ago

@roryk Sorry, that was from another run. Please check the link above again. I have uploaded the updated run (which also fails).

roryk commented 5 years ago

Thanks, it looks like the PED file has the ethnicity column in its header but no entry for it in the sample rows. If you add an ethnicity value to each row of the PED file and populate it with -9, it should fix this problem.
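
The fix can also be scripted. This is a minimal sketch under stated assumptions, not code from bcbio or vcf2db: `pad_ped_rows` is a hypothetical helper, the header is assumed to be a single line starting with `#`, and the column names and parent IDs in the usage note are made up for illustration. It pads every data row that is shorter than the header with -9, which matches the `IndexError: list index out of range` seen in `create_samples` when a row has fewer fields than the header.

```python
def pad_ped_rows(lines, missing="-9"):
    """Pad PED data rows with `missing` so each has as many fields as the '#' header."""
    header_width = None
    fixed = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Count header columns, ignoring the leading '#'.
            header_width = len(line.lstrip("#").split())
            fixed.append(line)
            continue
        if not line.strip():
            continue  # drop blank lines
        fields = line.split()
        if header_width is not None and len(fields) < header_width:
            fields += [missing] * (header_width - len(fields))
        fixed.append("\t".join(fields))
    return fixed
```

For example, given a header with seven columns ending in `ethnicity` and a six-field row for `F178_CHILD`, the helper returns the row with a trailing `-9`. Rows that are already complete are only re-joined with tabs.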

kokyriakidis commented 5 years ago

Hmm, thanks! Maybe update the README so that others know about this.