Open esahione opened 4 months ago
I'm sorry for your trouble.
First off, I'm surprised that you are running in debug mode with asserts in the first place as it is multiple times slower in my experience than release mode.
As for how you can help me debug this:
Thanks! Will do the above. By the way, if I set the chunk size to (1ull<<24) I don't get the issue as frequently. At (1ull<<16) I get it as frequently or maybe slightly more. Maybe there are empty chunks being sent to the pool?
I'll let you know what I can do re the above and put it here.
1 << 16 is 64 KiB. I think the chunk size will be forced to a minimum of 128 KiB, i.e., setting it to anything lower will always result in a chunk size of 128 KiB. If the chunk size becomes too small, the number of false positives increases because a single chunk does not contain enough data to check for redundancy. That's why I limited the chunk size like this. Furthermore, the overhead of searching for deflate block starts becomes relatively larger for small chunk sizes. This is also the reason for the relatively large default of 4 MiB: it simply yields the best performance in my benchmarks. I don't think empty chunks have much to do with this, although I'm not quite sure what you mean by empty.
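To make the clamping concrete, here is a minimal sketch of the behavior described above. The names `MIN_CHUNK_SIZE`, `effectiveChunkSize`, and `chunkCount` are hypothetical and only illustrate the idea; they are not taken from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical constant mirroring the 128 KiB lower bound mentioned in this thread.
constexpr std::size_t MIN_CHUNK_SIZE = 128 * 1024;

// Chunk size that would effectively be used for a requested size:
// anything below the minimum is raised to the minimum.
constexpr std::size_t effectiveChunkSize( std::size_t requested )
{
    return std::max( requested, MIN_CHUNK_SIZE );
}

// Number of chunks needed to cover a file of the given size (rounded up).
inline std::size_t chunkCount( std::size_t fileSize, std::size_t requestedChunkSize )
{
    const auto chunkSize = effectiveChunkSize( requestedChunkSize );
    assert( chunkSize >= MIN_CHUNK_SIZE );  // the clamp guarantees a non-zero divisor
    return ( fileSize + chunkSize - 1 ) / chunkSize;
}
```

Under these assumptions, requesting 1 << 16 behaves the same as requesting 128 KiB, while 1 << 24 (16 MiB) is passed through unchanged and produces far fewer chunks per file.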
The triggered assert is of questionable importance. It checks that the caller requests 1 or more bits. There should be no code trying to read 0 bits. That is good and required behavior because I removed a branch for performance reasons. That branch would be necessary because of undefined behavior when doing a shift such as uint64_t(1) << ( 64 - 0 ), i.e., a shift as large as the bit width or larger. Normally, I would expect the result to be 0, but because the behavior is undefined, the result will probably be random, or the call may be optimized out completely. Then again, if 0 bits are requested, it may not even matter much to the caller what kind of value is received... Therefore, it would be really good to know which code actually triggers this. It might even expose a bug at some other place. Otherwise, I could simply add an if ( bitsWanted == 0 ) { return 0; } check, remove the assert, and be done with it.
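For illustration, the trade-off between the assert and the extra branch could be sketched as follows. This is not the actual BitReader code; `lowBitsMaskUnchecked` and `lowBitsMask` are hypothetical helpers, and only the name `bitsWanted` comes from the discussion above.

```cpp
#include <cassert>
#include <cstdint>

// Branch-free variant: only valid for 1 <= bitsWanted <= 64.
// For bitsWanted == 0 the shift amount would be 64, which is undefined
// behavior for a 64-bit operand, so the assert documents the precondition.
inline std::uint64_t lowBitsMaskUnchecked( unsigned bitsWanted )
{
    assert( ( bitsWanted >= 1 ) && ( bitsWanted <= 64 ) );
    return ~std::uint64_t( 0 ) >> ( 64U - bitsWanted );
}

// Guarded variant: costs one extra branch but is well defined for 0 bits.
inline std::uint64_t lowBitsMask( unsigned bitsWanted )
{
    if ( bitsWanted == 0 ) {
        return 0;
    }
    return lowBitsMaskUnchecked( bitsWanted );
}
```

In a release build the assert compiles away, so the unchecked variant silently hits the undefined shift when 0 bits are requested. The guarded variant trades a little speed for a defined result.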
Did you find time to generate a backtrace? That would help me the most. The other two debugging suggestions are only if the backtrace doesn't clear up anything. I may even be able to reconstruct a synthetic reproducer from the backtrace.
Apologies, busy with a newborn. Will get to it soon.
Hi,
I'm iterating over a few thousand gzipped files and I get the following SIGABRT at random on line 716 of BitReader:
I'm using the latest master version and calling it from C++. It happens at random, without a particular pattern. But it happens consistently enough that after about 130-150 files it aborts.
Here's how I'm calling it:
I'm on Linux and using the C++ library (as per the code above). The zlib version is the packaged version (I switched to it from my system version to see whether that would stop the error, to no avail). Each gzipped file is anywhere from 300 MB to 5 GB.
Any help would be appreciated - I have no idea where to begin since the backtrace is quite large and I'm not familiar with the internals of the library.
Thanks!