Open BorisMansencal opened 7 years ago
Hi Boris - I'll check into this ASAP. I'm guessing it's a bug in lzham_compress_memory(), because the main (low-level) API has been tested on extremely large files.
On Fri, Nov 4, 2016 at 10:27 AM, Boris Mansencal notifications@github.com wrote:
Hi,
First of all, thank you for this excellent open source codec.
I am using LZHAM to compress a rather large buffer (6140582460 bytes). I compress in chunks of at most UINT32_MAX (= 4294967295) bytes, but it fails on one particular buffer.
Is UINT32_MAX allowed for lzham_compress_memory(), or is it too large?
In Release mode I get a core dump. When compiled in Debug mode, I get the following backtrace:
#0  0x00007fffda8329c8 in raise () at /lib64/libc.so.6
#1  0x00007fffda83465a in abort () at /lib64/libc.so.6
#2  0x00007fffda82b187 in __assert_fail_base () at /lib64/libc.so.6
#3  0x00007fffda82b232 in () at /lib64/libc.so.6
#4  0x00007fffd9f4c238 in lzham_debug_break() () at /net/Tools/lzham_codec/lzhamdecomp/lzham_platform.cpp:67
#5  0x00007fffd9f2ff7a in lzham_assert(char const*, char const*, unsigned int) (pExp=0x7fffd9f538c8 "new_capacity && (new_capacity > m_capacity)", pFile=0x7fffd9f53840 "/net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp", line=28) at /net/Tools/lzham_codec/lzhamdecomp/lzham_assert.cpp:24
#6  0x00007fffd9f525d1 in lzham::elemental_vector::increase_capacity(unsigned int, bool, unsigned int, void (*)(void*, void*, unsigned int), bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, element_size=1, pMover=0x0, nofail=true) at /net/Tools/lzham_codec/lzhamdecomp/lzham_vector.cpp:28
#7  0x00007fffda16909c in lzham::vector<unsigned char>::increase_capacity(unsigned int, bool, bool) (this=0x7ffe3e6155f8, min_new_capacity=2147603813, grow_hint=true, nofail=true) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:573
#8  0x00007fffda168877 in lzham::vector<unsigned char>::try_resize(unsigned int, bool) (this=0x7ffe3e6155f8, new_size=2147603813, grow_hint=true) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:183
#9  0x00007fffda170238 in lzham::vector<unsigned char>::insert(unsigned int, unsigned char const*, unsigned int) (this=0x7ffe3e6155f8, index=2147331299, p=0xa6e3e0 "A9\222<\317\205\266\241"..., n=272514) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:251
#10 0x00007fffda16f8e4 in lzham::vector<unsigned char>::append(lzham::vector<unsigned char> const&) (this=0x7ffe3e6155f8, other=...) at /net/Tools/lzham_codec/lzhamcomp/../lzhamdecomp/lzham_vector.h:297
#11 0x00007fffda16e93d in lzham::lzcompressor::compress_block_internal(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1951
#12 0x00007fffda16d31c in lzham::lzcompressor::compress_block(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7fff5226f010, buf_len=524288) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:1459
#13 0x00007fffda16a85a in lzham::lzcompressor::put_bytes(void const*, unsigned int) (this=0x7ffe3e612010, pBuf=0x7ffe5a46f010, buf_len=4294967295) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp_internal.cpp:515
#14 0x00007fffda167e3f in lzham::lzham_lib_compress_memory(lzham_compress_params const*, unsigned char*, unsigned long*, unsigned char const*, unsigned long, unsigned int*) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0) at /net/Tools/lzham_codec/lzhamcomp/lzham_lzcomp.cpp:353
#15 0x00007ffff6b2ee06 in lzham_compress_memory(lzham_compress_params const*, lzham_uint8*, size_t*, lzham_uint8 const*, size_t, lzham_uint32*) (pParams=0x7fffffffdbb0, pDst_buf=0x7ffce0992010 "", pDst_len=0x7fffffffdb98, pSrc_buf=0x7ffe5a46f010 "?", src_len=4294967295, pAdler32=0x0) at /net/Tools/lzham_codec/lzhamdll/lzham_api.cpp:81
It seems I have a vector of capacity 2147487728 whose capacity must grow to at least 2147603813, but next_pow2(2147603813) = 4294967296 > UINT32_MAX = 4294967295.
Is this a bug, or should I use smaller chunks?
This is on Linux x86_64 (Fedora 22, gcc 5.3.1), on an Intel Xeon CPU E5-2695 v3.
Thank you for your help.
Boris.
— Reply to this email directly or view it on GitHub: https://github.com/richgel999/lzham_codec/issues/19
Hi, there may be a bug in lzham_compress_memory(); I don't know.
However, I wonder whether elemental_vector should be changed to better handle large sizes. Currently, you have the following code:
bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
   [...]
   // new_capacity must be 64-bit when compiling on x64.
   size_t new_capacity = (size_t)min_new_capacity;
   if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity))))
      new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
   LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
   [...]
}
I think there should be a check before the assert (when grow_hint==true):
if (new_capacity < min_new_capacity)
   return false;
The code could be changed to something like this:
bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
   [...]
   // new_capacity must be 64-bit when compiling on x64.
   size_t new_capacity = (size_t)min_new_capacity;
   if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity))))
   {
      new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
      if (new_capacity < min_new_capacity)
      {
         new_capacity = UINT32_MAX;
         if (new_capacity < min_new_capacity)
            return false;
      }
   }
   LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
   [...]
}
Here, if grow_hint is true and the next power of two exceeds UINT32_MAX, new_capacity is clamped to UINT32_MAX (so it is no longer a power of two). With this change, the previous case (m_capacity=2147487728, min_new_capacity=2147603813, grow_hint=true) would no longer fail.
I have not tried to compress my buffer with this modification yet. I will ASAP.
Actually, setting new_capacity to UINT32_MAX does not work: when realloc is called later in increase_capacity, the actual_size it returns may be greater than desired_size = element_size * new_capacity, and then actual_size / element_size may no longer fit in an unsigned int.
I don't know how much bigger actual_size can be; it seems very implementation-dependent, according to the malloc_usable_size man page. On my platform, clamping new_capacity to UINT32_MAX - 4096 lets me compress my buffer.
So my code for elemental_vector looks like this:
bool elemental_vector::increase_capacity(uint min_new_capacity, bool grow_hint, uint element_size, object_mover pMover, bool nofail)
{
   [...]
   // new_capacity must be 64-bit when compiling on x64.
   size_t new_capacity = (size_t)min_new_capacity;
   if ((grow_hint) && (!math::is_power_of_2(static_cast<uint64>(new_capacity))))
   {
      new_capacity = static_cast<uint>(math::next_pow2(static_cast<uint64>(new_capacity)));
      if (new_capacity < min_new_capacity)
      {
         new_capacity = UINT32_MAX - 4096;
         if (new_capacity < min_new_capacity)
            return false;
      }
   }
   LZHAM_ASSERT(new_capacity && (new_capacity > m_capacity));
   [...]
}
Besides, I think the following two asserts could be added at the end of elemental_vector::increase_capacity(), just before the return true:
LZHAM_ASSERT(min_new_capacity <= m_capacity);
LZHAM_ASSERT(m_size <= m_capacity);