Closed chapmanb closed 7 years ago
How frequently do these occur? I have an idea about what to change. If I send a binary, could you test it and get a good sense of whether it's been resolved?
It's pretty infrequent and only under high load but we have a couple of ongoing projects where it happens more regularly (100s of samples on AWS EBS volumes). I can also just pull in a new version for bioconda and push to see if we still see them intermittently. Sorry to not have a reproducible case or anything useful. Thanks for thinking about this.
Can you show the conf file you're using?
I added some decoration to the error messages. I'll make a new release ASAP, and then you should have more context on the error so I can dig further.
Brent; Brilliant, thanks for doing that. The configuration file where we've seen this most often is not very complex, just annotating with dbSNP:
[[annotation]]
file="/path/to/dbsnp.vcf.gz"
fields=["ID"]
names=["rs_ids"]
ops=["concat"]
Thanks again.
Hi Brad, I made a new release that has better error messages. Can you give it a try? That will help me to narrow it down.
https://github.com/brentp/vcfanno/releases/tag/v0.2.5
Also, if you can get a reasonably reproducible error, you could try the vcfanno_linux64_race binary, which would give more info. I'm assuming there's some sort of race condition going on but haven't been able to track it down. Running under the race binary will be more than 10x slower, so don't use it in production.
Scratch that. I just found the race. I'll remove that release and fix it.
Brad, that is fixed in this release: https://github.com/brentp/vcfanno/releases/tag/v0.2.6
I'll leave this issue open as we should be able to have parallel decompression, but I couldn't track down the cause, so for now each file is decompressed by a single thread (vcfanno will still run chunks in parallel).
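For readers following along, the arrangement described above (one thread decompressing each file, with chunks still processed in parallel) can be sketched roughly as follows. This is only an illustration in Python, not vcfanno's actual Go code: `read_chunks`, `annotate`, and `process` are hypothetical names, and `annotate` is a stand-in for real annotation work.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def read_chunks(path, chunk_lines=1000):
    """Yield lists of lines from a gzip file, decompressing sequentially."""
    with gzip.open(path, "rt") as fh:
        chunk = []
        for line in fh:
            chunk.append(line.rstrip("\n"))
            if len(chunk) == chunk_lines:
                yield chunk
                chunk = []
        if chunk:
            yield chunk


def annotate(chunk):
    """Hypothetical per-chunk work; upper-casing stands in for annotation."""
    return [line.upper() for line in chunk]


def process(path, workers=4, chunk_lines=1000):
    """Decompress on one thread, then process the chunks on a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map drains the generator in the calling thread, so only
        # one thread ever touches the gzip reader. Results come back in
        # input order even though chunks are processed concurrently.
        results = pool.map(annotate, read_chunks(path, chunk_lines))
        return [line for chunk in results for line in chunk]
```

The point of the design is that the compressed stream is never shared between threads, which sidesteps races in the reader at the cost of decompression throughput. Note that this sketch collects all chunks in memory before processing finishes, so it is meant to show the threading layout rather than to handle large files.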
Brent; Thanks for identifying the underlying issue and saving the back and forth. I also bumped the bioconda package for this and will let you know if we spot anything else at all. Awesome work spotting the underlying issue so quickly.
This has also been fixed upstream in biogo/hts/bgzf, so the next release will restore the multiple decompression threads per annotation.
Brent; We've incorporated vcfanno into bcbio with a ton of success. It's been awesome to have general flexibility for annotation. Now that we're starting to test at scale we've been seeing intermittent issues with reading VCF files. These appear to be IO related issues from the error messages and aren't reproducible -- the files themselves are fine and just re-running the same command works.
I've been trying to collect error cases and the issue is reported after:
We then see errors and a failed command with these errors:
or
I know this is not a great report but I don't have much more to go on from my side. Do you know if there are ways we could make vcfanno more resilient to IO/read issues? Thanks for any pointers or ideas on how to tackle this.
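Since re-running the same command succeeds, one stopgap on the calling side is to wrap the command in a retry loop. The following is a minimal sketch of that workaround, not a feature of vcfanno: `run_with_retries` and its parameters are hypothetical, and it assumes the command writes its result to stdout and signals failure with a non-zero exit code.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, stdout_path, attempts=3, delay=1.0):
    """Run cmd with stdout sent to stdout_path, retrying on failure.

    Returns the number of the attempt that succeeded and raises
    RuntimeError if every attempt exits with a non-zero code.
    """
    for attempt in range(1, attempts + 1):
        # Re-opening with "wb" truncates any partial output left behind
        # by a previous failed attempt.
        with open(stdout_path, "wb") as out:
            proc = subprocess.run(cmd, stdout=out, stderr=subprocess.PIPE)
        if proc.returncode == 0:
            return attempt
        message = proc.stderr.decode(errors="replace").strip()
        print(f"attempt {attempt}/{attempts} failed: {message}", file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay * attempt)  # simple linear back-off
    raise RuntimeError(f"command failed after {attempts} attempts: {cmd}")
```

A call might look like `run_with_retries(["vcfanno", "conf.toml", "query.vcf.gz"], "annotated.vcf")`, where the file names are placeholders. Retrying only hides an intermittent failure, so logging the stderr of each failed attempt is worth keeping to help track down the underlying cause.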