Closed davidecarlson closed 3 years ago
I suspect it's the following block of code from vcf_reader.cpp that's throwing the error:

```cpp
if (file_type == ".vcf") {
    auto vcf_file_size = fs::file_size(file_path);
    if (vcf_file_size > 1e9) { // 1GB
        throw std::runtime_error {"VCF file " + file_path.string() + " is too big"};
    }
}
```
Is there a particular reason why this 1 GB VCF file size limit is in place?
Should I disable variant filtering within Octopus and do it manually with another tool (e.g., vcftools/bcftools)? Thanks! Dave
This is due to using uncompressed VCF as the output format; if you change the output to octopus_out/1_30.octopus.vcf.gz then you won't get this error.
The motivation behind this was to avoid situations where users might accidentally provide very large unindexed source VCF files, as these require linear (slow) searches for subregions. I would agree that this is not a very elegant way of achieving this, nor is the error very informative! I'll leave this issue open to remind me to do something better. Moreover, the VCF is streamed for filtering, so I should prevent this from triggering here.
A self-reminder to consider adding a warning when uncompressed VCF is selected as the output format.
Thanks, Dan! Writing compressed output fixed the issue, like you said. Best, Dave
Changed exception to warning in case of large uncompressed input (c611a2c9e727d8f77f48124d5f9d07972aa191e5). Also added warning for uncompressed output (94283bec5a562ad19ac809acc63338417ad606fc).
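The change described above can be sketched as follows. This is an illustrative reconstruction, not the code from those commits; the helper name and message text are assumptions. The idea is simply that a large uncompressed input now yields a warning message instead of throwing, while compressed input is exempt:

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: returns a warning message for large plain-text
// VCF inputs, or std::nullopt when no warning is needed. Compressed
// (.gz) input never triggers the warning, mirroring the fix above.
std::optional<std::string> large_vcf_warning(const std::string& filename,
                                             std::uintmax_t file_size)
{
    constexpr std::uintmax_t size_limit {1'000'000'000}; // 1 GB, as in the original check
    const bool is_compressed = filename.size() >= 3
        && filename.compare(filename.size() - 3, 3, ".gz") == 0;
    if (!is_compressed && file_size > size_limit) {
        return "VCF file " + filename
            + " is large and uncompressed; consider bgzipping and indexing it";
    }
    return std::nullopt; // no warning: compressed, or under the limit
}
```

Returning an optional message (rather than throwing) keeps the caller in control: it can log the warning and proceed, which matches the behaviour change described in the comment above.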
Describe the bug
The variant calling step appeared to work properly, but the filtering step fails. Here are the last several lines of the debug log:
Version

Command
Command line to install octopus:

Command line to run octopus:
Additional context This is a follow-up to #176. I added another 10 samples to the analysis and re-ran the joint calling with the source variants previously discovered in these samples. I'm not sure what specifically the error message is referring to when it says the unfiltered VCF is "too big". The bgzipped unfiltered VCF file is ~100 MB in size. I assume that's too big to attach, but if you want to take a look at it, I'm happy to send it via another route.
Any thoughts?
Thanks for all your help! Dave