inikep / lzbench

lzbench is an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors

LZMAT library could crash on compressing large files. #7

Closed xcrh closed 8 years ago

xcrh commented 8 years ago

Configuration:
lzbench: commit 5937923
OS: Xubuntu 15.10, 64-bit
Compiler: gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2)
Flags: defaults from the makefile, with only BUILD_SYSTEM=linux uncommented.

Description: Another day, another crash... To reproduce:
1) Build lzbench with the default compile flags on a similar configuration.
2) Try to compress a reasonably large file.
3) See what happens, preferably under a debugger.

Result: The lzmat compressor can crash if the file is large enough. Files smaller than a few hundred KB or so do not trigger the crash, but an attempt to compress a file larger than about a megabyte crashes in lzmat_encode. For example, an attempt to compress lzbench itself :) fails.

Note:
1) This algorithm is included in -eall, so for almost any reasonably large file lzbench just dies partway through the run, which is rather unfortunate.
2) The author's page seems to be down, and I could not find a reasonable way to file a bug upstream.

Program received signal SIGSEGV, Segmentation fault.
0x000000000056e6da in lzmat_encode ()
(gdb) bt full
#0  0x000000000056e6da in lzmat_encode ()
No symbol table info available.
#1  0x00000000005ce28d in lzbench_lzmat_compress(char*, unsigned long, char*, unsigned long, unsigned long, unsigned long, unsigned long)
    ()
No symbol table info available.
#2  0x00000000005bfdff in lzbench_compress(long (*)(char*, unsigned long, char*, unsigned long, unsigned long, unsigned long, unsigned long), unsigned long, std::vector<unsigned long, std::allocator<unsigned long> >&, unsigned char*, unsigned long, unsigned char*, unsigned long, unsigned long, unsigned long, unsigned long) ()
No symbol table info available.
#3  0x00000000005c1661 in lzbench_test(compressor_desc_t const*, int, int, unsigned long, int, unsigned char*, unsigned long, unsigned char*, unsigned long, unsigned char*, timespec, unsigned long, unsigned long, unsigned long) [clone .constprop.142] ()
No symbol table info available.
#4  0x00000000005c203d in lzbench_test_with_params(char*, int, unsigned long, int, unsigned char*, unsigned long, unsigned char*, unsigned long, unsigned char*, timespec) ()
No symbol table info available.
#5  0x00000000005c1e0f in lzbench_test_with_params(char*, int, unsigned long, int, unsigned char*, unsigned long, unsigned char*, unsigned long, unsigned char*, timespec) ()
No symbol table info available.
#6  0x00000000005c2422 in lzbenchmark(_IO_FILE*, char*, int, unsigned int, int) ()
No symbol table info available.
#7  0x0000000000401267 in main ()
No symbol table info available.
inikep commented 8 years ago

If you are using a debugger, you can compile lzbench with "BUILD_TYPE=debug" (sometimes errors disappear after this). I couldn't reproduce your SEGFAULT. In my experiments (Ubuntu 15.04 with gcc 4.9.2 on VirtualBox) I only found a decompression error in lzmat. For now I have moved "lzmat" to the end of the compressor list, and it is now compiled with the -O2 option. I have committed all changes to GitHub, so you can check whether it helps: https://github.com/inikep/lzbench/tree/dev

xcrh commented 8 years ago

(sometimes errors disappear after this).

Unfortunately, that was the case: it works perfectly with BUILD_TYPE=debug. Trying to build with -O3 and debug info...

...done. The crash requires -O3 and the compiler version mentioned above. It is not triggered in debug builds, and -O2 is also fine. So lzmat just does not coexist well with -O3.

The backtrace seems to end somewhere in a copy loop.

Program received signal SIGSEGV, Segmentation fault.
lzmat_encode (pbOut=pbOut@entry=0x7ffff5413010 "\177", pcbOut=pcbOut@entry=0x7fffffffd8b4, 
    pbIn=pbIn@entry=0x7ffff6bb6010 "\177ELF\002\001\001\003", cbIn=cbIn@entry=21229040) at lzmat/lzmat_enc.c:376
376                     *pdwOut++ = *pdwIn++;
(gdb) bt full
#0  lzmat_encode (pbOut=pbOut@entry=0x7ffff5413010 "\177", pcbOut=pcbOut@entry=0x7fffffffd8b4, 
    pbIn=pbIn@entry=0x7ffff6bb6010 "\177ELF\002\001\001\003", cbIn=cbIn@entry=21229040) at lzmat/lzmat_enc.c:376
        pdwIn = <optimized out>
        pdwOut = <optimized out>
        cbCopy = 346
        pITmp = <optimized out>
        store_dist = <optimized out>
        hash_Idx = <optimized out>
        i = <optimized out>
        match_cnt = <optimized out>
        inPtr = 3301433
        cpy_tag = <optimized out>
        cbUCData = 2774
        pOut = <optimized out>
        pTag = 0x7ffff557fd45 "\037\250\257M˟ŏ\245\005\333\017\203x\002"
        pEndOut = 0x7ffff6bb5adc ""
        pInp = <optimized out>
        Gamma_dist = 2262
        bit_msk = 0 '\000'
        ThisTag = 1 '\001'
        cur_nib = <optimized out>
        tag_nib = 0 '\000'
        uc_nib = <optimized out>
        pUC_Tag = 0x7ffff557f11d "a\016\356\272Q\t\231\031H\257\006W\003P:\226>ZI\346)\003\200\270\355@\212˝\347\001\220^\r\216\230-}\271\002\300d\233\320\313\027\353w"
        processed_data = <optimized out>
        lzh = {ptr = {3300705, 3289840, 3289844, 3289848, 3290060, 3289868, 3289860, 3300666, 3289037, 3270325, 3278625, 3269844, 3284539, 
            3269181, 3269220, 3284322, 3280617, 3257328, 3246698, 3231049, 3272471, 3263523, 3273130, 3234486, 3280044, 3245276, 3298625, 
            3267337, 3280220, 3281649, 3268472, 3243452, 3279566, 3281203, 3271188, 3233296, 3269146, 3280566, 3279713, 3299214, 3280771, 
            3240274, 3279012, 3240882, 3259938, 3299679, 3256607, 3245682, 3298611, 3266938, 3290068, 3239538, 3247729, 3256618, 3265603, 
            3298900, 3232626, 3260049, 3278723, 3240624, 3236770, 3259848, 3278528, 3234237, 3290114, 3298602, 3260549, 3242998, 3288701, 
            3265645, 3259476, 3274057, 3281694, 3283689, 3273770, 3266526, 3262675, 3261233, 3243073, 3273565, 3234373, 3273716, 3284626, 
            3279519, 3257379, 3241860, 3252075, 3232738, 3240160, 3240056, 3233090, 3299119, 3232362, 3244535, 3238538, 3288378, 3298691, 
            3288906, 3299227, 3235439, 3280284, 3244421, 3236695, 3236461, 3268892, 3285682, 3234769, 3277587, 3211179, 3244365, 3299144, 
            3268424, 3232555, 3195819, 3233997, 3256339, 3279834, 3182372, 3231835, 3232332, 3270554, 3231724, 3280833, 3186940, 3279847, 
            3299787, 3281702, 3278529, 3289838, 3290271, 3283341, 3261179, 3288730, 3273415, 3233035, 3281728, 3279830, 3300075, 3288952, 
            3298770, 3280229, 3195305, 3288446, 3287697, 3288918, 3229839, 3281877, 3300703, 3301439, 3281393, 3279918, 3274929, 3280201, 
            3209363, 3225439, 3259315, 3287767, 3264719, 3265118, 3212597, 3298733, 3267394, 3257675, 3281021, 3267208, 3250933, 3278080, 
            3218166, 3284347, 3171667, 3261442, 3300838, 3300647, 3275969, 3170639, 3273261, 3284314, 3279970, 3244815, 3258109, 3279856, 
            3232246, 3261793, 3260528, 3284307, 3286510, 3228548, 3284632, 3246472, 3258738, 3269551, 3272856, 3272149, 3288395, 3277825, 
            3289104, 3278001, 3278095, 3269464, 3288941...}, idx = {3268122, 3273853, 3274467, 3273636, 3264697, 3222933, 3244654, 
            3228462, 3276584, 3276779, 3225736, 3274335, 3275108, 3229470, 3245578, 3154331, 3245736, 3238067, 3275581, 3221267, 3251000, 
            2617131, 3038485, 3062468, 3271427, 3275036, 3257341, 3244329, 3276698, 3273796, 3265121, 3269261, 3217870, 3274194, 3276790, 
            3276791, 3252930, 3209849, 3263247, 3276795, 3276796, 3266676, 3168382, 3206220, 3269391, 3276815, 3276816, 2575693, 3158449, 
            3258147, 3247346, 3272528, 3273278, 2960049, 3266696, 3214163, 3267106, 3252931, 3269196, 3165062, 2746336, 3182919, 3240196, 
            3231531, 3204897, 3270194, 3276828, 3276840, 3273895, 3273896, 3257201, 3020143, 3022626, 2516729, 3276708, 3276238, 3276780, 
            3252501, 2871625, 3248959, 3184525, 3206057, 2594624, 3207316, 3264453, 3273914, 3272953, 3270523, 3269886, 3275619, 3275398, 
            3276379, 3267852, 3276869, 3267226, 3245052, 3276177, 3256929, 3276874, 3276709, 3179780, 3122399, 3201409, 3254410, 3272460, 
            3276493, 3276600, 3180823, 3276416, 2623787, 3172537, 3265390, 3264737, 3237456, 3276498, 3276499, 3276500, 3275127, 3275662, 
            3268508, 3268509, 3177592, 3202627, 3190897, 3276402, 3276403, 3263186, 3263187, 3217815, 3244318, 3266230, 3227563, 3270386, 
            3276881, 3206058, 3206059, 3276886, 3276887, 3266466, 3274642, 3274643, 3273171, 3264810, 3192975, 3227898, 3247543, 3276478, 
            3271529, 3274831, 3274287, 1314977, 3237212, 3214051, 3263687, 3276839, 3276866, 3276809, 3276876, 3275246, 3227919, 3265498, 
            3272919, 3276219, 3276220, 3167185, 3200783, 3269763, 3274633, 3274542, 3163540, 3245308, 3201989, 3209800, 3276898, 3274697, 
            3267274, 3272976, 3224181, 3244008, 3123026, 3274011, 3276261, 3276549, 3276173, 3276174, 3229072, 3229073, 3266918, 3276176, 
            3276896, 3276178, 3276179, 3276180, 3276181, 3276182, 3276183, 3276184, 3276185, 3276186, 3276187...}}
#1  0x00000000005cf1dd in lzbench_lzmat_compress (inbuf=inbuf@entry=0x7ffff6bb6010 "\177ELF\002\001\001\003", 
    insize=insize@entry=21229040, outbuf=outbuf@entry=0x7ffff5413010 "\177", outsize=outsize@entry=24783597, level=level@entry=0)
    at _lzbench/compressors.cpp:549
        complen = 24783597
#2  0x00000000005c0d4f in lzbench_compress (
    compress=0x5cf1b0 <lzbench_lzmat_compress(char*, unsigned long, char*, unsigned long, unsigned long, unsigned long, unsigned long)>, 
    chunk_size=chunk_size@entry=21229040, compr_lens=..., inbuf=inbuf@entry=0x7ffff6bb6010 "\177ELF\002\001\001\003", insize=21229040, 
    outbuf=outbuf@entry=0x7ffff5413010 "\177", outsize=24783597, param1=0, param2=0, param3=0) at _lzbench/lzbench.cpp:188
        clen = <optimized out>
        part = 21229040
        sum = 0
        start = 0x7ffff6bb6010 "\177ELF\002\001\001\003"
#3  0x00000000005c25b1 in lzbench_test (desc=desc@entry=0xaaad90 <comp_desc+720>, level=level@entry=0, cspeed=cspeed@entry=0, 
    chunk_size=chunk_size@entry=21229040, iters=iters@entry=5, inbuf=inbuf@entry=0x7ffff6bb6010 "\177ELF\002\001\001\003", 
    insize=21229040, compbuf=0x7ffff5413010 "\177", comprsize=24783597, decomp=0x7ffff3fd0010 "\177ELF\002\001\001\003", param1=0, 
    ticksPerSecond=..., param3=0, param2=0) at _lzbench/lzbench.cpp:273
        milisec = <optimized out>
        ii = 0
        start_ticks = {tv_sec = 1447707084, tv_nsec = 404777552}
        mid_ticks = {tv_sec = 0, tv_nsec = 0}
        end_ticks = {tv_sec = 0, tv_nsec = 4}
        complen = 0
        ctime = {<std::_Vector_base<unsigned int, std::allocator<unsigned int> >> = {
            _M_impl = {<std::allocator<unsigned int>> = {<__gnu_cxx::new_allocator<unsigned int>> = {<No data fields>}, <No data fields>}, 
              _M_start = 0x0, _M_finish = 0x0, _M_end_of_storage = 0x0}}, <No data fields>}
        dtime = {<std::_Vector_base<unsigned int, std::allocator<unsigned int> >> = {
            _M_impl = {<std::allocator<unsigned int>> = {<__gnu_cxx::new_allocator<unsigned int>> = {<No data fields>}, <No data fields>}, 
              _M_start = 0x0, _M_finish = 0x0, _M_end_of_storage = 0x0}}, <No data fields>}
        compr_lens = {<std::_Vector_base<unsigned long, std::allocator<unsigned long> >> = {
            _M_impl = {<std::allocator<unsigned long>> = {<__gnu_cxx::new_allocator<unsigned long>> = {<No data fields>}, <No data fields>}, _M_start = 0x0, _M_finish = 0x0, _M_end_of_storage = 0x0}}, <No data fields>}
        decomp_error = false
#4  0x00000000005c2f8d in lzbench_test_with_params (namesWithParams=<optimized out>, cspeed=0, chunk_size=21229040, iters=5, 
    inbuf=0x7ffff6bb6010 "\177ELF\002\001\001\003", insize=21229040, compbuf=0x7ffff5413010 "\177", comprsize=24783597, 
    decomp=0x7ffff3fd0010 "\177ELF\002\001\001\003", ticksPerSecond=...) at _lzbench/lzbench.cpp:377
        level = 0
        i = <optimized out>
        found = true
        delimiters = "/"
        delimiters2 = ","
        copy = 0x5d5faa0 "lzmat"
        copy2 = 0x5d5fa80 "lzmat"
        token = <optimized out>
        token2 = <optimized out>
        token3 = <optimized out>
        save_ptr = 0x5d5faa5 ""
        save_ptr2 = 0x5d5fa85 ""
#5  0x00000000005c3372 in lzbenchmark (in=in@entry=0x5d5f830, encoder_list=encoder_list@entry=0x5d5f810 "lzmat", iters=iters@entry=5, 
    chunk_size=21229040, chunk_size@entry=2147483648, cspeed=cspeed@entry=0) at _lzbench/lzbench.cpp:440
        ctime = {<std::_Vector_base<unsigned int, std::allocator<unsigned int> >> = {
            _M_impl = {<std::allocator<unsigned int>> = {<__gnu_cxx::new_allocator<unsigned int>> = {<No data fields>}, <No data fields>}, 
              _M_start = 0x5d5fac0, _M_finish = 0x5d5fac0, _M_end_of_storage = 0x5d5fae0}}, <No data fields>}
        dtime = {<std::_Vector_base<unsigned int, std::allocator<unsigned int> >> = {
            _M_impl = {<std::allocator<unsigned int>> = {<__gnu_cxx::new_allocator<unsigned int>> = {<No data fields>}, <No data fields>}, 
              _M_start = 0x5d5faf0, _M_finish = 0x5d5faf0, _M_end_of_storage = 0x5d5fb10}}, <No data fields>}
        start_ticks = {tv_sec = 1447707084, tv_nsec = 380808023}
        mid_ticks = {tv_sec = 1447707084, tv_nsec = 392723226}
        end_ticks = {tv_sec = 1447707084, tv_nsec = 404648485}
        comprsize = <optimized out>
        insize = <optimized out>
        inbuf = 0x7ffff6bb6010 "\177ELF\002\001\001\003"
        compbuf = 0x7ffff5413010 "\177"
        decomp = 0x7ffff3fd0010 "\177ELF\002\001\001\003"
#6  0x00000000004012c7 in main (argc=<optimized out>, argv=0x7fffffffdf40) at _lzbench/lzbench.cpp:578
        in = 0x5d5f830
        iterations = <optimized out>
        chunk_size = <optimized out>
        cspeed = <optimized out>
        encoder_list = <optimized out>
        sort_col = <optimized out>
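
A possible explanation, not confirmed anywhere in this thread: line 376 copies through 32-bit pointers (pdwIn/pdwOut) that were presumably cast from byte pointers and so may not be 4-byte aligned. At -O3 gcc may vectorize such a loop on the assumption that the pointers are properly aligned, which could fault on unaligned data. Below is a minimal sketch of an alignment-safe copy, assuming that is the cause. copy_dwords_unaligned is a hypothetical helper for illustration, not part of lzmat.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from lzmat): copy n_dwords 32-bit words between
 * non-overlapping byte buffers that may sit at any alignment. Going through
 * memcpy avoids dereferencing a misaligned uint32_t pointer, which is
 * undefined behavior in C. */
static void copy_dwords_unaligned(uint8_t *dst, const uint8_t *src,
                                  size_t n_dwords)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n_dwords; i++) {
        uint32_t tmp;
        memcpy(&tmp, src + 4 * i, sizeof tmp);  /* unaligned-safe load  */
        memcpy(dst + 4 * i, &tmp, sizeof tmp);  /* unaligned-safe store */
    }
}
```

With a fixed 4-byte memcpy, gcc typically emits a plain load/store pair, so this should cost little compared to the pointer-cast version.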
inikep commented 8 years ago

Thanks for testing. I will keep -O2 for lzmat and wflz. They will also stay at the end of the compressor list, just in case.
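
For anyone who cannot change the makefile, a rough alternative to per-file -O2 flags is gcc's optimize pragma, which caps the optimization level for a region of one source file. This is only a sketch: sum_bytes is a hypothetical stand-in for the affected code, and the gcc manual describes this mechanism as intended for debugging rather than production builds.

```c
#include <stddef.h>
#include <stdint.h>

/* Apply only under real gcc; other compilers simply skip the pragmas. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O2")   /* functions below are compiled as with -O2 */
#endif

/* Hypothetical stand-in for code that misbehaves at -O3. */
uint32_t sum_bytes(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options       /* restore the command-line optimization level */
#endif
```

Lowering the level only hides the symptom. If the underlying code relies on undefined behavior, the crash may come back with another compiler version.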