Open bioinfornatics opened 3 years ago
@bioinfornatics Oh - can you run od -c -N 30 <file>
and include the results here. You may be hitting this embarrassing bug which we've never fixed https://github.com/samtools/htsjdk/issues/946.
@cmnbroad thanks I was trying to get the magic number as it is done into BCFVersion.java with hexdump -C
but I had any idea to do next :-)
So the result of od -c -N 30
0000000 037 213 \b 004 \0 \0 \0 \0 \0 377 006 \0 B C 002 \0
0000020 250 003 255 224 313 213 023 I 034 307 333 L 226 245
Yes, that looks like a gzip header. Sadly, htsjdk lacks support for gzipped BCF, as mentioned above.
@bioinfornatics you can try to use a VCFIteratorBuilder instead of a VCFFileReader . As far as I remember, the GZ compression was first tested. (But you cannot do random-access)
thanks @lindenb and @cmnbroad
Arf … I though using the index file (or/and the gzip block) to read the bcf file in // by region.
Dear,
Firstly thanks for your amazing work on this library.
I try to read a bcffile through
VCFFileReader
but I have this error:This file is valid and I am able to read it with
pysam
in pythonI use htsjdk:
2.23.0
java:8+
I need to increase the speed process due to slowness of python and is GIL limitation which do not allow the use of multithreading to read the file through a random access way
Thanks for your help