I generated a bgzipped VCF that passes validation if it is decompressed before calling the validator (zcat file.vcf.gz | vcf_validator), but fails validation if the validator itself decompresses the file (vcf_validator -i file.vcf.gz). If the file is decompressed and recompressed with plain gzip, it also passes. It seems that, depending on the block size, the validator did not read all lines properly.
A test was added to process this file (test/input_files/v4.3/compressed_files/readable/passed/boost_55_can_not_decompress_last_line_of_this_bgzipped.vcf.gz).
Confirmed with the bgzip people at https://github.com/samtools/htslib/issues/953 that bgzip files should be read correctly by any gzip reader. We found that newer versions of Boost decompress the offending file correctly.
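For reviewers unfamiliar with the format: a bgzip file is a series of independently compressed gzip members (blocks) concatenated together, which is why any compliant gzip reader should handle it. The sketch below is only an illustration of that failure mode, not the validator's code and not a confirmed diagnosis of the Boost behaviour. It builds a multi-member gzip stream with plain `gzip.compress` (real BGZF blocks also carry an extra header field, omitted here), then compares a reader that stops after the first member with one that keeps going. The helper names are hypothetical.

```python
import gzip
import zlib


def make_multi_member_gzip(chunks):
    """Concatenate independently compressed gzip members, similar to how
    bgzip concatenates its blocks (assumption: BGZF extra fields omitted)."""
    return b"".join(gzip.compress(chunk) for chunk in chunks)


def read_first_member_only(data):
    """Decode only the first gzip member, mimicking a reader that stops early."""
    decoder = zlib.decompressobj(31)  # 31 = expect a gzip header and trailer
    return decoder.decompress(data)


def read_all_members(data):
    """Decode every gzip member in turn, as a compliant reader should."""
    out = []
    while data:
        decoder = zlib.decompressobj(31)
        out.append(decoder.decompress(data))
        data = decoder.unused_data  # bytes belonging to the next member
    return b"".join(out)


# Three tiny "blocks", one VCF line each (made-up example data).
lines = [
    b"##fileformat=VCFv4.3\n",
    b"#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    b"1\t100\t.\tA\tT\t.\t.\t.\n",
]
blob = make_multi_member_gzip(lines)

truncated = read_first_member_only(blob)
complete = read_all_members(blob)
```

Here `truncated` contains only the first line, while `complete` and Python's own `gzip.decompress(blob)` both return all three. A reader with the first behaviour would silently drop trailing lines, which matches the symptom seen with the test file.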
Changed the build on Linux to download and compile Boost, instead of installing it from distro packages.
- [x] Changed the install_dependencies.sh script for manual installation.
- [x] Changed the Travis Linux build.
- [ ] Changed the Docker build. NOT DONE: it seems this failed even in master, so it's out of the scope of this ticket. The Docker build is removed for now.
- [x] Changed README to reflect the new (and easier) installation steps.
The OSX validator passes the test without modification, so the build was not changed for OSX.
Decompression for Windows has never worked and is low priority, so no changes there.
The release binaries from Travis will be compiled with gcc 5 (the xenial default). The Travis builds for gcc 4.8 and gcc 6 were removed because they were not working, possibly due to ABI incompatibilities with ODB. Compiling all dependencies with the same ABI might fix this, but that's out of the scope of this ticket.
The locale problem https://github.com/EBIvariation/vcf-validator/issues/184 prevents building on Ubuntu 18, but the precompiled binary (built on Ubuntu 16) runs without problems on Ubuntu 18, so that issue is only partially resolved.
The cause of https://github.com/EBIvariation/vcf-validator/issues/190 wasn't identified, but this PR may fix it, as the validator now reads bgzip files correctly.