samtools / htscodecs

Custom compression for CRAM and others.

Compress cannot handle blocks larger than 1.25 MiB #128

Open mrietveld opened 1 week ago

mrietveld commented 1 week ago

While using the rANS codec here (THANK YOU!), I've noticed that rans_compress_to_4x16 dumps or does not complete on input blocks larger than 1.25 * 2^20 bytes (1.25 MiB).

If this is not something that you plan to support, or are interested in supporting, please feel free to close the issue.

Otherwise, if this is a valid bug, I can add a short test case very easily. Thanks!

I'm not completely sure which RANS variant (order, pack, RLE, stripe) fails, but when I looked into it, it seemed like it affected several of them. A test case will of course clarify this.
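As a starting point for such a test case, the input side could be sketched as below. This is only a minimal sketch: `reported_limit_bytes` and `make_test_block` are hypothetical helper names, not part of htscodecs, and the small-alphabet content is an assumption chosen so that the block is reproducible and compressible. The call into rans_compress_to_4x16 itself is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Threshold from the report: 1.25 * 2^20 bytes = 1310720 bytes (1.25 MiB). */
static size_t reported_limit_bytes(void) {
    return (size_t)5 * 1024 * 1024 / 4;
}

/* Hypothetical helper: allocate `len` bytes and fill them with a
 * deterministic sequence over a four-symbol alphabet, so the same block
 * can be regenerated on any machine from the same seed.
 * The caller frees the result; returns NULL if allocation fails. */
static unsigned char *make_test_block(size_t len, uint32_t seed) {
    static const char alphabet[4] = { 'A', 'C', 'G', 'T' };
    unsigned char *buf = malloc(len ? len : 1);
    if (buf == NULL)
        return NULL;

    uint32_t x = seed ? seed : 1;   /* xorshift32 state must be non-zero */
    for (size_t i = 0; i < len; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        buf[i] = (unsigned char)alphabet[x & 3];
    }
    return buf;
}
```

Calling `make_test_block(reported_limit_bytes() + 1, 42)` gives a block one byte over the reported threshold, which could then be passed to the compressor under each order/pack/RLE/stripe combination to see which ones fail.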

jkbonfield commented 5 days ago

I cannot reproduce this using ./tests/rans4x16pr. I've tried:

for c in 0x0000 0x0101 0x0202 0x0404; do echo === $c ===; ./tests/rans4x16pr -c $c -o196 -t -b `expr 10 "*" 1024 "*" 1024` ~/scratch/data/enwik8; done

This tests the scalar, SSE4, AVX2 and AVX512 implementations. I've also done it with order-0, order-1, order-4 (SIMD) and order-5 (SIMD). There's nothing obvious in there that would limit this to 1 MiB blocks. It's possible that once we get close to 2 GiB, various bits of code may overflow, but that's extreme and not what the library was designed for. It's generally much faster to serialise smaller blocks as they fit in the cache.
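As a rough illustration of the smaller-blocks approach, a caller could split its input and handle one block at a time. The sketch below is not htscodecs code: `block_fn`, `for_each_block` and `count_bytes` are hypothetical names, and the callback stands in for whatever per-block codec call the application makes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: handles one block, returns 0 on success.
 * In real use it would wrap a codec call; it is not an htscodecs type. */
typedef int (*block_fn)(const unsigned char *block, size_t len, void *user);

/* Split `in` into consecutive blocks of at most `block_size` bytes and
 * pass each one to `fn`. Returns the number of blocks processed, or -1
 * as soon as the callback reports a failure. */
static long for_each_block(const unsigned char *in, size_t in_size,
                           size_t block_size, block_fn fn, void *user) {
    assert(block_size > 0);

    long count = 0;
    size_t remaining = in_size;
    while (remaining > 0) {
        size_t len = remaining < block_size ? remaining : block_size;
        if (fn(in, len, user) != 0)
            return -1;
        in += len;
        remaining -= len;
        count++;
    }
    return count;
}

/* Example callback: adds up the block sizes it is given. */
static int count_bytes(const unsigned char *block, size_t len, void *user) {
    (void)block;
    *(size_t *)user += len;
    return 0;
}
```

With a 10-byte input and a block size of 4, `for_each_block` invokes the callback on blocks of 4, 4 and 2 bytes. The best block size for cache behaviour depends on the machine and is not specified here.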

Can you provide a concrete example of it failing please?