r-lyeh-archived / bundle

:package: Bundle, an embeddable compression library: DEFLATE, LZMA, LZIP, BZIP2, ZPAQ, LZ4, ZSTD, BROTLI, BSC, CSC, BCM, MCM, ZMOLLY, ZLING, TANGELO, SHRINKER, CRUSH, LZJB and SHOCO streams in a ZIP file (C++03)(C++11)
zlib License

Segmentation fault with v2.0.1 #17

Closed mavam closed 9 years ago

mavam commented 9 years ago

I'm getting a segfault in my compression benchmark after switching to the current master. I'm on OS X 10.10 with Clang 3.7. Here's a backtrace:

(lldb) process launch -i data/pcap/2009-M57-day11-18-10k.pcap
Process 45472 launched: '/Users/mavam/Dropbox/code/compbench/benchmark-debug' (x86_64)
Algorithm  Raw      Packed   Unpacked  Compression  Decompression
RAW        2772093  2772093  2772093   3743         1162
LZ4F       2772093  1643265  2772093   4821         837
MINIZ      2772093  1519520  2772093   150800       7096
LZIP       2772093  949119   2772093   526206       60906
LZMA20     2772093  952561   2772093   471057       58925
ZPAQ       2772093  943286   2772093   8675433      8734077
LZ4        2772093  1540706  2772093   59946        903
BROTLI9    2772093  963956   2772093   263584       6107
ZSTD       2772093  1003409  2772093   215553       2119
LZMA25     2772093  949055   2772093   571734       58700
BSC        2772093  1072946  2772093   347131       299024
BROTLI11   2772093  936971   2772093   5798909      10169
SHRINKER   2772093  1598614  2772093   6424         1650
CSC20      2772093  957509   2772093   454960       67649
ZSTDF      2772093  1223905  2772093   7042         2316
BCM        2772093  1220156  2772093   398050       530257
ZLING      2772093  1061986  2772093   72908        20161
MCM        2772093  928892   2772093   1263611      1157556
TANGELO    2772093  990425   2772093   1658184      1796580
benchmark-debug was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 45472 stopped
* thread #1: tid = 0x11f6df, 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f170) + 419 at bundle.cpp:108648, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f170) + 419 at bundle.cpp:108648 [opt]
   108645               }
   108646           } else {  // not found -- create new node for context
   108647               o4 = (sparse_model_t*)m_o4_allocator.alloc(sizeof(sparse_model_t));
-> 108648               memset(o4->m_symbols, 0, sizeof(o4->m_symbols));
   108649               o4->m_context = m_context;
   108650               o4->m_sum = 0;
   108651               o4->m_cnt = 0;
(lldb) bt
* thread #1: tid = 0x11f6df, 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f170) + 419 at bundle.cpp:108648, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f170) + 419 at bundle.cpp:108648 [opt]
    frame #1: 0x00000001000ced23 benchmark-debug`ppm_model_t::encode(this=0x00007fff5fb3f170, coder=0x00007fff5fb3efe0, c=212) + 35 at bundle.cpp:108665 [opt]
    frame #2: 0x0000000100094702 benchmark-debug`zmolly_encode(fdata=0x00007fff5fbbcbc0, fcomp0=0x00007fff5fbbcb20, block_size=<unavailable>) + 1602 at bundle.cpp:108871 [opt]
    frame #3: 0x0000000100097106 benchmark-debug`bundle::pack(q=<unavailable>, in=<unavailable>, inlen=2772093, out=<unavailable>, outlen=0x00007fff5fbfeac0) + 4854 at bundle.cpp:109853 [opt]
    frame #4: 0x0000000100001279 benchmark-debug`main + 222 at bundle.hpp:189 [opt]
    frame #5: 0x000000010000119b benchmark-debug`main [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > bundle::pack<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(q=20) + 18 at bundle.hpp:220 [opt]
    frame #6: 0x0000000100001189 benchmark-debug`main + 393 at benchmark.cpp:63 [opt]
    frame #7: 0x00007fff968ff5c9 libdyld.dylib`start + 1
(lldb) 

After switching to master, I only updated the existing libraries:

--- a/benchmark.cpp
+++ b/benchmark.cpp
@@ -49,9 +49,11 @@ auto main() -> int {
 #endif
   std::string buffer{std::istreambuf_iterator<char>{std::cin},
                       std::istreambuf_iterator<char>{}};
-  std::vector<unsigned> libs{RAW, SHOCO, LZ4F, MINIZ, LZIP, LZMA20, ZPAQ, LZ4,
-                             BROTLI9, ZSTD, LZMA25, BSC, BROTLI11, SHRINKER,
-                             CSC20};
+  std::vector<unsigned> libs{
+    RAW, SHOCO, LZ4F, MINIZ, LZIP, LZMA20, ZPAQ,
+    LZ4, BROTLI9, ZSTD, LZMA25, BSC, BROTLI11, SHRINKER,
+    CSC20, ZSTDF, BCM, ZLING, MCM, TANGELO, ZMOLLY
+  };
   auto to_mus = [](std::chrono::high_resolution_clock::duration d) {
     return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
   };

Since I get output up to TANGELO, this could be an issue with ZMOLLY.
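
The "last line printed" reasoning can be made more robust by announcing each codec before it runs and flushing the stream, so the final log line names the codec that was active when the process died. This is only a rough sketch: `codec_t` and `run_all` are hypothetical names, and the pack functions are stand-ins rather than bundle's real API.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical (name, pack function) pair; not part of bundle.
using codec_t =
    std::pair<std::string, std::function<std::string(const std::string&)>>;

// Runs every codec in order. The name is written and flushed *before* the
// call, so the last line in the log identifies the codec that was running
// if the process crashes. Returns the names of the codecs that completed.
std::vector<std::string> run_all(const std::vector<codec_t>& codecs,
                                 const std::string& input,
                                 std::ostream& log) {
    std::vector<std::string> done;
    for (const auto& codec : codecs) {
        log << "running " << codec.first << std::endl;  // std::endl flushes
        codec.second(input);
        done.push_back(codec.first);
    }
    return done;
}
```

Calling `run_all(codecs, buffer, std::cerr)` with one entry per library would then leave the name of the crashing codec as the last line on stderr.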

r-lyeh-archived commented 9 years ago

are you compiling this on 64-bit arch?

r-lyeh-archived commented 9 years ago

ah, it seems so. can you put zmolly in first place and give it a try? i am afraid the previous compressors might be leaking too. compiling with -O0 -g will also help with the trace

mavam commented 9 years ago

Putting ZMOLLY first and adding -O0 makes my benchmark program run through just fine. (I had already added -g to get the stack trace.) Then I switched back to -O3 to see whether more aggressive optimization triggers the issue again. Yep...but...why?! So I added -fsanitize=address -fno-omit-frame-pointer, hoping that ASan would trigger. ASan didn't fire, but with the frame pointer available, I get a better stack trace from LLDB:

(lldb) process launch -i data/pcap/2009-M57-day11-18-10k.pcap
Process 47681 launched: './benchmark-debug' (x86_64)
Algorithm   Raw Packed  Unpacked    Compression Decompression
benchmark-debug was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 47681 stopped
* thread #1: tid = 0x124bd9, 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f1e0) + 419 at bundle.cpp:108648, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f1e0) + 419 at bundle.cpp:108648 [opt]
   108645               }
   108646           } else {  // not found -- create new node for context
   108647               o4 = (sparse_model_t*)m_o4_allocator.alloc(sizeof(sparse_model_t));
-> 108648               memset(o4->m_symbols, 0, sizeof(o4->m_symbols));
   108649               o4->m_context = m_context;
   108650               o4->m_sum = 0;
   108651               o4->m_cnt = 0;
(lldb) bt
* thread #1: tid = 0x124bd9, 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f1e0) + 419 at bundle.cpp:108648, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001000e2553 benchmark-debug`ppm_model_t::current_o4(this=0x00007fff5fb3f1e0) + 419 at bundle.cpp:108648 [opt]
    frame #1: 0x00000001000ced23 benchmark-debug`ppm_model_t::encode(this=0x00007fff5fb3f1e0, coder=0x00007fff5fb3f050, c=212) + 35 at bundle.cpp:108665 [opt]
    frame #2: 0x0000000100094702 benchmark-debug`zmolly_encode(fdata=0x00007fff5fbbcc30, fcomp0=0x00007fff5fbbcb90, block_size=<unavailable>) + 1602 at bundle.cpp:108871 [opt]
    frame #3: 0x0000000100097106 benchmark-debug`bundle::pack(q=<unavailable>, in=<unavailable>, inlen=2772093, out=<unavailable>, outlen=0x00007fff5fbfeb30) + 4854 at bundle.cpp:109853 [opt]
    frame #4: 0x0000000100001279 benchmark-debug`main + 222 at bundle.hpp:189 [opt]
    frame #5: 0x000000010000119b benchmark-debug`main [inlined] std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > bundle::pack<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(q=20) + 18 at bundle.hpp:220 [opt]
    frame #6: 0x0000000100001189 benchmark-debug`main + 393 at benchmark.cpp:66 [opt]
    frame #7: 0x00007fff968ff5c9 libdyld.dylib`start + 1
    frame #8: 0x00007fff968ff5c9 libdyld.dylib`start + 1
(lldb) up 2
frame #2: 0x0000000100094702 benchmark-debug`zmolly_encode(fdata=0x00007fff5fbbcc30, fcomp0=0x00007fff5fbbcb90, block_size=<unavailable>) + 1602 at bundle.cpp:108871 [opt]
   108868                   ppm.m_sse_last_esc = 0;
   108869   
   108870               } else {  // encode a literal
-> 108871                   ppm.encode(&coder, ib[ibpos]);
   108872                   if (ib[ibpos] == escape) {
   108873                       ppm.update_context(escape);
   108874                       ppm.encode(&coder, 0); 

This is the full compiler command line:

c++ -std=c++11 -stdlib=libc++ -g -O3 -o -fsanitize=address -fno-omit-frame-pointer -o benchmark-debug benchmark.cpp bundle/bundle.cpp

r-lyeh-archived commented 9 years ago

can you put a check to verify that o4 is a valid pointer after the alloc() call? something that works with -O3, like: if(!o4) { fprintf(stderr,"invalid alloc: ptr %p\n", o4); exit(-1); }
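
A minimal sketch of that kind of check in C++. `node_t`, `pool_t`, and `make_node` are hypothetical stand-ins, not the actual ZMOLLY types, and the sketch returns `nullptr` instead of calling `exit(-1)` so the caller can decide what to do. A test like this only catches a null result; it cannot detect a non-null pointer that is already corrupted.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical node type; the real sparse_model_t layout is not shown here.
struct node_t {
    unsigned char symbols[64];
    unsigned context;
};

// Toy fixed-size pool: hands out nodes until it runs out, then returns nullptr.
struct pool_t {
    node_t slots[4];
    std::size_t used = 0;

    void* alloc(std::size_t n) {
        if (n != sizeof(node_t) || used == 4) return nullptr;
        return &slots[used++];
    }
};

// Returns a zero-initialized node, or nullptr (after logging to stderr)
// if the pool could not satisfy the request.
node_t* make_node(pool_t& pool, unsigned context) {
    node_t* n = static_cast<node_t*>(pool.alloc(sizeof(node_t)));
    if (!n) {
        std::fprintf(stderr, "invalid alloc: ptr %p\n", static_cast<void*>(n));
        return nullptr;
    }
    std::memset(n->symbols, 0, sizeof(n->symbols));
    n->context = context;
    return n;
}
```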

mavam commented 9 years ago

This doesn't make a difference, unfortunately. But valgrind found something:

==53299== Memcheck, a memory error detector
==53299== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==53299== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==53299== Command: ./benchmark-debug
==53299== 
Algorithm   Raw Packed  Unpacked    Compression Decompression
==53299== Thread 2:
==53299== Invalid read of size 8
==53299==    at 0x1000A6286: void* std::__1::__thread_proxy<std::__1::tuple<zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int)::$_1, int*> >(void*) (bundle.cpp:108753)
==53299==    by 0x100A53059: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A52FD6: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A503EC: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==  Address 0x104d7c038 is 8 bytes before a block of size 16,777,216 alloc'd
==53299==    at 0x100581EBB: malloc (in /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==53299==    by 0x1005C643D: operator new(unsigned long) (in /usr/lib/libc++.1.dylib)
==53299==    by 0x1000E7843: std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__append(unsigned long) (new:156)
==53299==    by 0x100094265: zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int) (vector:1998)
==53299==    by 0x1000970E5: bundle::pack(unsigned int, void const*, unsigned long, void*, unsigned long&) (bundle.cpp:109854)
==53299==    by 0x100001258: main (bundle.hpp:189)
==53299== 
==53299== Invalid read of size 4
==53299==    at 0x1000A62AA: void* std::__1::__thread_proxy<std::__1::tuple<zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int)::$_1, int*> >(void*) (bundle.cpp:108750)
==53299==    by 0x100A53059: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A52FD6: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A503EC: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==  Address 0x104d7c03b is 5 bytes before a block of size 16,777,216 alloc'd
==53299==    at 0x100581EBB: malloc (in /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==53299==    by 0x1005C643D: operator new(unsigned long) (in /usr/lib/libc++.1.dylib)
==53299==    by 0x1000E7843: std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__append(unsigned long) (new:156)
==53299==    by 0x100094265: zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int) (vector:1998)
==53299==    by 0x1000970E5: bundle::pack(unsigned int, void const*, unsigned long, void*, unsigned long&) (bundle.cpp:109854)
==53299==    by 0x100001258: main (bundle.hpp:189)
==53299== 
==53299== Invalid read of size 4
==53299==    at 0x1000A62C6: void* std::__1::__thread_proxy<std::__1::tuple<zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int)::$_1, int*> >(void*) (bundle.cpp:108761)
==53299==    by 0x100A53059: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A52FD6: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A503EC: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==  Address 0x104d7c03c is 4 bytes before a block of size 16,777,216 alloc'd
==53299==    at 0x100581EBB: malloc (in /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==53299==    by 0x1005C643D: operator new(unsigned long) (in /usr/lib/libc++.1.dylib)
==53299==    by 0x1000E7843: std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__append(unsigned long) (new:156)
==53299==    by 0x100094265: zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int) (vector:1998)
==53299==    by 0x1000970E5: bundle::pack(unsigned int, void const*, unsigned long, void*, unsigned long&) (bundle.cpp:109854)
==53299==    by 0x100001258: main (bundle.hpp:189)
==53299== 
==53299== Invalid read of size 2
==53299==    at 0x1000A62EF: void* std::__1::__thread_proxy<std::__1::tuple<zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int)::$_1, int*> >(void*) (bundle.cpp:108763)
==53299==    by 0x100A53059: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A52FD6: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==    by 0x100A503EC: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
==53299==  Address 0x104d7c03e is 2 bytes before a block of size 16,777,216 alloc'd
==53299==    at 0x100581EBB: malloc (in /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==53299==    by 0x1005C643D: operator new(unsigned long) (in /usr/lib/libc++.1.dylib)
==53299==    by 0x1000E7843: std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__append(unsigned long) (new:156)
==53299==    by 0x100094265: zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int) (vector:1998)
==53299==    by 0x1000970E5: bundle::pack(unsigned int, void const*, unsigned long, void*, unsigned long&) (bundle.cpp:109854)
==53299==    by 0x100001258: main (bundle.hpp:189)
==53299== 
==53299== 
==53299== Process terminating with default action of signal 11 (SIGSEGV)
==53299==  General Protection Fault
==53299==    at 0x1000E2564: ppm_model_t::current_o4() (bundle.cpp:108649)
==53299==    by 0x1000CED02: ppm_model_t::encode(rc_encoder_t*, int) (bundle.cpp:108666)
==53299==    by 0x1000946E1: zmolly_encode(std::__1::basic_istream<char, std::__1::char_traits<char> >&, std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int) (bundle.cpp:108872)
==53299==    by 0x1000970E5: bundle::pack(unsigned int, void const*, unsigned long, void*, unsigned long&) (bundle.cpp:109854)
==53299==    by 0x100001258: main (bundle.hpp:189)
==53299== 
==53299== HEAP SUMMARY:
==53299==     in use at exit: 71,650,009 bytes in 451 blocks
==53299==   total heap usage: 536 allocs, 85 frees, 74,801,601 bytes allocated
==53299== 
==53299== LEAK SUMMARY:
==53299==    definitely lost: 15,337,602 bytes in 9 blocks
==53299==    indirectly lost: 56,269,088 bytes in 3 blocks
==53299==      possibly lost: 0 bytes in 0 blocks
==53299==    still reachable: 8,264 bytes in 6 blocks
==53299==         suppressed: 35,055 bytes in 433 blocks
==53299== Rerun with --leak-check=full to see details of leaked memory
==53299== 
==53299== For counts of detected and suppressed errors, rerun with: -v
==53299== ERROR SUMMARY: 19 errors from 4 contexts (suppressed: 0 from 0)
[1]    53299 killed     valgrind ./benchmark-debug < data/pcap/2009-M57-day11-18-10k.pcap
valgrind ./benchmark-debug < data/pcap/2009-M57-day11-18-10k.pcap  49.96s user 0.27s system 99% cpu 50.371 total

Smells like memory corruption to me. Can you reproduce the issue, by the way?
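
Valgrind's "N bytes before a block" wording means the read started below the first byte of the heap buffer, which is what an expression like `pos - back` produces when `back > pos`. The following is only an illustrative sketch of a guarded read. `read_u32_behind` and its parameters are made-up names, and nothing here is taken from the ZMOLLY source.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads the 4-byte value that starts `back` bytes before `pos`, but only if
// the whole 4-byte window lies inside `buf`. Returns false instead of
// touching memory outside the buffer.
bool read_u32_behind(const std::vector<unsigned char>& buf, std::size_t pos,
                     std::size_t back, std::uint32_t& out) {
    if (pos > buf.size()) return false;   // position itself is out of range
    if (back > pos) return false;         // window would start before buf[0]
    const std::size_t start = pos - back;
    if (buf.size() - start < sizeof(out)) return false;  // would run past end
    std::memcpy(&out, buf.data() + start, sizeof(out));
    return true;
}
```

With a check like this, a lookup at `pos = 0, back = 4` is rejected rather than reading 4 bytes before the allocation.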

r-lyeh-archived commented 9 years ago

Nope, I can't (for now). I am going with two solutions for this: a) I'm installing OSX on a virtual computer. Hope it works. b) I've appended a "valgrind" lib to my leak detector to detect corrupt memory allocations too. Hope it works as well :D

r-lyeh-archived commented 9 years ago

I can finally reproduce it

r-lyeh-archived commented 9 years ago

hey there @mavam, can you take a look at the v2.0.2 branch please?

mavam commented 9 years ago

Works for me, no more segfaults!

r-lyeh-archived commented 9 years ago

:+1: !

mavam commented 9 years ago

JFYI: I regenerated my benchmark plots at https://github.com/mavam/compbench. ZLING looks like a fast alternative to LZ4 with a better compression ratio.