KarchinLab / open-cravat

A modular annotation tool for genomic variants
MIT License

silent failures from multiple annotators #205

Closed cariaso closed 7 months ago

cariaso commented 8 months ago

I ran this command:

running ['oc', 'run', '/tmp/opencravat-open-cravat-1705148005uika61i0/d2c77d62-1e83-4bfb-908e-42230e123649-1705110875/DanteLabs.HG38.bamlane_1.strelka.variants.vcf.gz', '--liftover', 'hg38', '--mp', '1', '--skip', 'reporter', '-d', '/tmp/opencravat-open-cravat-1705148005uika61i0/outdir', '--md', '/mnt/modules', '-a', 'clinvar', 'clingen', 'loftool', 'dbsnp', 'gnomad3', 'mutation_assessor', 'omim', 'pharmgkb', 'phdsnpg', 'sift', 'thousandgenomes', 'clingen_allele_registry']

Input file(s): /tmp/opencravat-open-cravat-1705148005uika61i0/d2c77d62-1e83-4bfb-908e-42230e123649-1705110875/DanteLabs.HG38.bamlane_1.strelka.variants.vcf.gz
Genome assembly: hg38
Running converter...
    Converter (converter)           finished in 576.979s
Running gene mapper...                  finished in 1397.280s
Running annotators...
    annotator(s) finished in 6289.804s
Running aggregator...
    Variants                        finished in 338.064s
    Genes                           finished in 0.320s
    Samples                         finished in 99.768s
    Tags                            finished in 150.135s
Indexing
    variant base__coding    finished in 4.804s
    variant base__chrom finished in 3.630s
    variant sift__prediction    finished in 5.043s
    variant base__so    finished in 5.435s
    gene clingen__mondo finished in 0.039s
    variant clingen_allele_registry__allele_registry_id finished in 4.323s
Running postaggregators...
    Tag Sampler (tagsampler)        finished in 499.833s
    VCF Info (vcfinfo)              finished in 555.312s
Finished normally. Runtime: 9835.803s
oc command has completed

The resulting sqlite contains these columns:

✅ base__chrom ✅ dbsnp__rsid

and the expected others from those annotators, but these columns are never present:

❌ clinvar__id ❌ gnomad3__af ❌ omim__omim_id ❌ pharmgkb__id ❌ phdsnpg__score

nor are any others from those annotators. Nothing in the logs or the exit code seems to indicate an error.

This happens to me quite often, and reruns sometimes fix it, but some sources have the problem more often than they work successfully. My launching is scripted, so the input is extremely consistent.

  1. Can I detect that this is happening without manually inspecting the sqlite?
  2. Can I prevent it from happening?
  3. What is actually happening?

kmoad commented 8 months ago

This may be related to https://github.com/KarchinLab/open-cravat/issues/180, which was fixed in dev but hasn't been merged to master yet: https://github.com/KarchinLab/open-cravat/commit/d2b6b59400fac6fdd09373d09166ac84db7858c6

Does the issue persist on the dev branch?

cariaso commented 8 months ago

Confirming that a run from the dev branch resolved my issue. That fix is well worth merging to master and including in the next release.
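
On question 1 above (detecting the failure without opening the sqlite by hand), a rough sketch of an automated check is below. It is not part of open-cravat. It assumes the result database has `variant` and `gene` tables whose columns are named `<module>__<column>`, as the indexing lines in the log suggest (`variant base__chrom`, `gene clingen__mondo`). The function names `annotator_prefixes` and `missing_annotators` are made up for illustration.

```python
import sqlite3


def annotator_prefixes(db_path, tables=("variant", "gene")):
    """Collect the module prefixes seen in the column names of the given tables.

    Assumes columns are named '<module>__<column>', e.g. 'base__chrom'.
    """
    prefixes = set()
    conn = sqlite3.connect(db_path)
    try:
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the name.
            # A table that does not exist simply yields no rows.
            for row in conn.execute(f"PRAGMA table_info({table})"):
                name = row[1]
                if "__" in name:
                    prefixes.add(name.split("__", 1)[0])
    finally:
        conn.close()
    return prefixes


def missing_annotators(db_path, requested, tables=("variant", "gene")):
    """Return the requested annotators that contributed no column at all."""
    found = annotator_prefixes(db_path, tables)
    return sorted(name for name in requested if name not in found)
```

A wrapper script could call `missing_annotators(path_to_sqlite, ["clinvar", "gnomad3", "omim", "pharmgkb", "phdsnpg"])` after `oc run` returns and treat a non-empty result as a failed run. This only checks that columns exist, not that they contain values.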