Open mrietveld opened 1 week ago
I cannot reproduce this using ./tests/rans4x16pr. I've tried:
for c in 0x0000 0x0101 0x0202 0x0404;do echo === $c ===;./tests/rans4x16pr -c $c -o196 -t -b `expr 10 "*" 1024 "*" 1024` ~/scratch/data/enwik8;done
This tests scalar, SSE4, AVX2 and AVX512 implementations. I've also done it with order-0, order-1, order-4 (SIMD) and order-5 (SIMD) as well. There's nothing obvious in there that would limit this to 1MiB blocks. It's possible once we get close to 2GiB we may be overflowing things with various bits of code, but that's extreme and not what the library was designed for. It's generally much faster to serialise smaller blocks as they fit in the cache.
Can you provide a concrete example of it failing please?
In using the RANS codec here (THANK YOU!), I've noticed that rans_compress_to_4x16 dumps/does not complete on blocks of input larger than 1.25 * 2^20 (about 1 MiB). If this is not something that you plan to support, or are interested in supporting, please feel free to close the issue.
Otherwise, if this is a valid bug, I can add a short test case very easily. Thanks!
I'm not completely sure which RANS variant (order, pack, RLE, stripe) fails, but when I looked into it, it seemed like it affected several of them. A test case will of course clarify this.
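As a starting point, here is a minimal sketch of what such a test case might look like. It is a generic round-trip harness, not code from the library: the `codec_fn` interface, `copy_codec`, `fill_input` and `round_trip` are all hypothetical names made up for illustration, and the output buffer bound is a generous guess rather than the library's real bound.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical codec interface (not part of the library): returns 0 on
 * success and stores the number of bytes written in *out_len. */
typedef int (*codec_fn)(const unsigned char *in, size_t in_len,
                        unsigned char *out, size_t out_cap, size_t *out_len);

/* Stand-in "codec" that only copies its input, so the harness can be
 * built and run without the real library. */
static int copy_codec(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap, size_t *out_len) {
    if (in_len > out_cap) return -1;
    memcpy(out, in, in_len);
    *out_len = in_len;
    return 0;
}

/* Fill buf with deterministic pseudo-random bytes (simple LCG). */
static void fill_input(unsigned char *buf, size_t len, uint32_t seed) {
    for (size_t i = 0; i < len; i++) {
        seed = seed * 1664525u + 1013904223u;
        buf[i] = (unsigned char)(seed >> 24);
    }
}

/* Encode then decode one block of len bytes.
 * Returns 0 if the decoded data matches the input, -1 otherwise. */
static int round_trip(codec_fn enc, codec_fn dec, size_t len) {
    size_t cap = len + len / 4 + 1024 * 1024; /* assumed generous bound */
    unsigned char *in = malloc(len ? len : 1);
    unsigned char *comp = malloc(cap);
    unsigned char *out = malloc(len ? len : 1);
    size_t comp_len = 0, out_len = 0;
    int rc = -1;

    if (in && comp && out) {
        fill_input(in, len, 12345u);
        if (enc(in, len, comp, cap, &comp_len) == 0 &&
            dec(comp, comp_len, out, len, &out_len) == 0 &&
            out_len == len && memcmp(in, out, len) == 0)
            rc = 0;
    }
    free(in);
    free(comp);
    free(out);
    return rc;
}
```

To exercise the real code, one would wrap rans_compress_to_4x16 and its matching decompression call in functions of the `codec_fn` shape and call `round_trip` with block sizes on either side of 1.25 * 2^20. The exact signatures are not shown in this thread, so they would need to be taken from the library header.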