RoaringBitmap / CBitmapCompetition

A comparison between different integer set techniques
Apache License 2.0

Defining DRECORD_MALLOCS can cause a Segmentation Fault on malloced_roaring_benchmarks (Sun Solaris) #3

Open ghost opened 6 years ago

ghost commented 6 years ago

Disclaimer: I run an "old" gcc compiler, so maybe that's the problem, but I still thought I would share.

It seems that -DRECORD_MALLOCS makes "malloced_roaring_benchmarks" fail on my system when it is compiled as per the default in the Makefile:

malloced_roaring_benchmarks : src/roaring.c src/roaring_benchmarks.c
       $(CC) $(CFLAGS) -o malloced_roaring_benchmarks src/roaring_benchmarks.c -DRECORD_MALLOCS

If I remove the -DRECORD_MALLOCS flag from the Makefile, everything works:

malloced_roaring_benchmarks : src/roaring.c src/roaring_benchmarks.c
        $(CC) $(CFLAGS) -o malloced_roaring_benchmarks src/roaring_benchmarks.c

The following is what happens if I leave -DRECORD_MALLOCS active, i.e. the failure mode. On another note, should we actually be doing benchmarks while recording allocated memory by default?

We get a "Segmentation Fault" on the call to libc_malloc(...) on line 15 of the header file "src/cmemcounter.h", which is used to track memory usage. Perhaps this is not entirely safe:

void* malloc(size_t sz) {
    void *(*libc_malloc)(size_t) = dlsym(RTLD_NEXT, "malloc");
    void * answerplus =  libc_malloc(sz + sizeof(size_t) + sizeof(myalloc_cookie) );

My Server config (recent SmartOS VM with a zone limited to just 4GB, gcc 4.7.4):

# uname -a
SunOS gcc01 5.11 joyent_20180118T013028Z i86pc i386 i86pc Solaris

# gcc -v
Using built-in specs.
COLLECT_GCC=/opt/local/gcc47/bin/gcc
COLLECT_LTO_WRAPPER=/opt/local/gcc47/libexec/gcc/i486-sun-solaris2.11/4.7.4/lto-wrapper
Target: i486-sun-solaris2.11
Configured with: ../gcc-4.7.4/configure --enable-languages='c obj-c++ objc go fortran c++' --enable-shared --enable-long-long --with-local-prefix=/opt/local --enable-libssp --enable-threads=posix --with-boot-ldflags='-static-libstdc++ -static-libgcc -Wl,-R/opt/local/lib ' --disable-nls --with-gxx-include-dir=/opt/local/gcc47/include/c++/ --without-gnu-ld --with-ld=/usr/bin/ld --with-gnu-as --with-as=/opt/local/bin/gas --prefix=/opt/local/gcc47 --build=i486-sun-solaris2.11 --host=i486-sun-solaris2.11 --infodir=/opt/local/gcc47/info --mandir=/opt/local/gcc47/man
Thread model: posix
gcc version 4.7.4 (GCC)

# prtconf | head -3 | grep Mem
prtconf: devinfo facility not available
Memory size: 4096 Megabytes

In order to get things to compile (old gcc/g++ stack) I changed the following three (3) files:

vi ./Makefile 
alter both CFLAGS and CXXFLAGS
remove: 
-Wno-deprecated-register
add:    
-m64
vi ./src/benchmark.h
add:
#include <getopt.h>
#ifdef __cplusplus
#include <stdexcept>
#endif
vi ./synthetic/anh_moffat_clustered.h
add:
#include <stdexcept>

All runs of ./malloced_roaring_benchmarks seem to fail with a Segmentation Fault except one, census1881_srt, which hangs (or seems to hang). I will only show the more typical Segmentation Fault case.

# ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-income
Segmentation Fault (core dumped)

Running the same command under truss:

# truss ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-income |& tail -20
fstat(3, 0xFFFFFD7FFFDFF6C0)                    = 0
ioctl(3, TCGETA, 0xFFFFFD7FFFDFF740)            Err#25 ENOTTY
read(3, " 2 , 1 8 , 7 9 , 1 4 2 ,".., 44544)    = 44363
lseek(3, 0, SEEK_CUR)                           = 44363
close(3)                                        = 0
open("CRoaring/benchmarks/realdata/census-income/census-income.csv99.txt", O_RDONLY) = 3
lseek(3, 0, SEEK_END)                           = 64307
lseek(3, 0, SEEK_CUR)                           = 64307
lseek(3, 0, SEEK_CUR)                           = 64307
lseek(3, 0, SEEK_SET)                           = 0
fstat(3, 0xFFFFFD7FFFDFF790)                    = 0
fstat(3, 0xFFFFFD7FFFDFF6C0)                    = 0
ioctl(3, TCGETA, 0xFFFFFD7FFFDFF740)            Err#25 ENOTTY
read(3, " 2 6 , 4 0 , 4 7 , 5 1 ,".., 64512)    = 64307
lseek(3, 0, SEEK_CUR)                           = 64307
close(3)                                        = 0
    Incurred fault #6, FLTBOUNDS  %pc = 0xFFFFFD7FEF229C41
      siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFD7FEF229C41
    Received signal #11, SIGSEGV [default]
      siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFFD7FEF229C41
[root@gcc01 /opt/jon/wrk/CBitmapCompetition]#

Looking at the generated core file, in this case "core.malloced_roaring.89085":

# file core.malloced_roaring.89085
core.malloced_roaring.89085:    ELF 64-bit LSB core file AMD64 Version 1, from 'malloced_roarin'
# ls -ltr core.malloced_roaring.89085
-rw------- 1 root root 32209219 Feb 15 20:15 core.malloced_roaring.89085
# pargs core.malloced_roaring.89085
core 'core.malloced_roaring.89085' of 89085:    ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-income
argv[0]: ./malloced_roaring_benchmarks
argv[1]: -r
argv[2]: CRoaring/benchmarks/realdata/census-income
# pstack core.malloced_roaring.89085
core 'core.malloced_roaring.89085' of 89085:    ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-i
 fffffd7fef229c41 t_splay (7c9390) + 11
 fffffd7fef229ad6 t_delete (7c9390) + 26
 fffffd7fef229801 realfree (7ad220) + 141
 fffffd7fef229f3a cleanfree (0) + 3a
 fffffd7fef229090 _malloc_unlocked (12) + 60
 fffffd7fef228ffb malloc (12) + 3b
 000000000040bcc5 malloc () + 25
 0000000000414f5a array_container_grow () + 6a
 000000000042014a roaring_bitmap_add_many () + e2a
 0000000000420300 roaring_bitmap_of_ptr () + 30
 000000000042d714 main () + 1374
 000000000040b933 _start_crt () + 83
 000000000040b898 _start () + 18
# pflags core.malloced_roaring.89085
core 'core.malloced_roaring.89085' of 89085:    ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-i
        data model = _LP64  flags = MSACCT|MSFORK
 /1:    flags = 0
        sigmask = 0xffffbefc,0xffffffff,0x000003ff
        cursig = SIGSEGV
# pldd core.malloced_roaring.89085
core 'core.malloced_roaring.89085' of 89085:    ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-i
/lib/amd64/libdl.so.1
/lib/amd64/libc.so.1
# pmap core.malloced_roaring.89085
core 'core.malloced_roaring.89085' of 89085:    ./malloced_roaring_benchmarks -r CRoaring/benchmarks/realdata/census-i
0000000000400000        204K r-x--  /opt/jon/wrk/CBitmapCompetition/malloced_roaring_benchmarks
0000000000442000         24K rw---  /opt/jon/wrk/CBitmapCompetition/malloced_roaring_benchmarks
0000000000448000      28512K rw---    [ heap ]
FFFFFD7FECB6F000          4K r-x--  /lib/amd64/libdl.so.1
FFFFFD7FEF130000         64K rwx--    [ anon ]
FFFFFD7FEF150000         24K rwx--    [ anon ]
FFFFFD7FEF160000          4K rwx--    [ anon ]
FFFFFD7FEF170000          4K rwx--    [ anon ]
FFFFFD7FEF180000       1548K r-x--  /lib/amd64/libc.so.1
FFFFFD7FEF313000         48K rw---  /lib/amd64/libc.so.1
FFFFFD7FEF31F000         16K rw---  /lib/amd64/libc.so.1
FFFFFD7FEF330000          4K rwx--    [ anon ]
FFFFFD7FEF340000          4K r----*   [ anon ]
FFFFFD7FEF350000          4K rwx--    [ anon ]
FFFFFD7FEF360000          4K rw---    [ anon ]
FFFFFD7FEF370000          4K rw---    [ anon ]
FFFFFD7FEF380000          4K rwx--    [ anon ]
FFFFFD7FEF390000          4K r----*   [ anon ]
FFFFFD7FEF397000        332K r-x--  /lib/amd64/ld.so.1
FFFFFD7FEF3FA000         12K rwx--  /lib/amd64/ld.so.1
FFFFFD7FEF3FD000          8K rwx--  /lib/amd64/ld.so.1
FFFFFD7FFFDFD000         12K rw---    [ stack ]
         total        30844K

lemire commented 6 years ago

Thank you for raising the issue.

lemire commented 6 years ago

What happens if you type make && make test?

I expect that the script will run through... reporting a few harmless core dumps. If you are not interested in benchmarking memory usage, you can simply ignore these failed tests.

If you type make test, I would expect the script to run through despite these specific failures; you will simply not get memory-usage reports. If that's not the case, please report it, as it would then be a scripting bug.

In order to get things to compile (old gcc/g++ stack) I changed (...)-m64

Why would -m64 be necessary?

On another note, should we actually be doing benchmarks while recording allocated memory by default?

We want to be benchmarking memory usage. I feel that it is important to compare memory usage fairly as part of the benchmark. As I stated above, if you don't care about getting these numbers, just ignore this particular part of the benchmark.

(...) usage perhaps this is not entirely safe

It is not part of the C standard, for sure. But we need some way to track memory usage and I don't know a better way. As far as I can tell, what we do is the standard approach. I'd love to be corrected though!

Your stack trace seems to indicate that the problem occurs within the malloc function. Our malloc function is this (to track memory usage, we override malloc):

void* malloc(size_t sz) {
    void *(*libc_malloc)(size_t) = dlsym(RTLD_NEXT, "malloc");
    void * answerplus =  libc_malloc(sz + sizeof(size_t) + sizeof(myalloc_cookie) );
    if(answerplus == NULL) return answerplus;// nothing can be done
    malloced_memory_usage += sz;
    memcpy(answerplus ,&myalloc_cookie,sizeof(myalloc_cookie));
    memcpy((char *) answerplus + sizeof(myalloc_cookie),&sz,sizeof(sz));
    return ((char *) answerplus) + sizeof(size_t) + sizeof(myalloc_cookie);
}

So what could go wrong?

Formally speaking, we should check that libc_malloc is not NULL. But, really, if you have no malloc, life is not good for you. And that's not what seems to be happening.
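
A minimal sketch of that check, kept separate from the real src/cmemcounter.h. Here the underlying allocator is passed in as a function pointer, standing in for whatever dlsym(RTLD_NEXT, "malloc") returns, so a failed lookup shows up as a NULL return rather than a crash. The names counted_malloc, counted_free, sketch_cookie and sketch_usage are hypothetical, and the 64-bit cookie type is an assumption; only the [cookie][size][payload] layout is taken from the wrapper above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*malloc_fn)(size_t);
typedef void (*free_fn)(void *);

/* Assumed cookie type and value; the real definition lives in src/cmemcounter.h. */
static const uint64_t sketch_cookie = UINT64_C(0x1234567890ABCDEF);
static size_t sketch_usage = 0; /* payload bytes currently handed out */

#define SKETCH_HEADER (sizeof(sketch_cookie) + sizeof(size_t))

/* Allocate sz payload bytes behind a [cookie][size] header. */
static void *counted_malloc(malloc_fn underlying, size_t sz) {
    if (underlying == NULL) return NULL;            /* symbol lookup failed */
    if (sz > SIZE_MAX - SKETCH_HEADER) return NULL; /* sz + header would overflow */
    char *raw = underlying(sz + SKETCH_HEADER);
    if (raw == NULL) return NULL;                   /* nothing can be done */
    memcpy(raw, &sketch_cookie, sizeof(sketch_cookie));
    memcpy(raw + sizeof(sketch_cookie), &sz, sizeof(sz));
    sketch_usage += sz;
    return raw + SKETCH_HEADER;
}

/* Release a pointer obtained from counted_malloc and update the counter. */
static void counted_free(free_fn underlying, void *p) {
    if (p == NULL || underlying == NULL) return;
    char *raw = (char *)p - SKETCH_HEADER;
    uint64_t cookie;
    size_t sz;
    memcpy(&cookie, raw, sizeof(cookie));
    memcpy(&sz, raw + sizeof(cookie), sizeof(sz));
    assert(cookie == sketch_cookie); /* fails if p was not produced by counted_malloc */
    sketch_usage -= sz;
    underlying(raw);
}
```

Calling counted_malloc(malloc, 100) followed by counted_free(free, p) should leave sketch_usage back at zero. With an 8-byte cookie on an LP64 target the header is 16 bytes; a smaller cookie would move the returned pointer to an offset that may no longer meet the alignment callers expect from malloc.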

The fact that you need -m64 suggests you have both 32-bit and 64-bit software, so this can create issues. Could you simplify this part?

It is possible that libc_malloc is not, actually, the libc malloc we expect. Maybe typing ldd ./malloced_roaring_benchmarks could help. You should get something like this...

$ ldd ./malloced_roaring_benchmarks
    linux-vdso.so.1 =>  (0x00007ffed835e000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4c37635000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4c3726c000)
    /lib64/ld-linux-x86-64.so.2 (0x00005588d2b92000)

ghost commented 6 years ago

What happens if you type make && make test?

Exactly as you say. I can of course ignore these core dumps (or better yet, modify all.sh).

Why would -m64 be necessary?

Without -m64, the RDTSC_START macro in roaring_benchmarks.c causes gcc errors, most likely due to my older compiler (I will try a more recent one soon):

In file included from src/roaring_benchmarks.c:11:0:
src/roaring.c: In function 'bitset_set_list_withcard':
src/roaring.c:10009:5: error: can't find a register in class 'GENERAL_REGS' while reloading 'asm'
src/roaring.c:10009:5: error: 'asm' operand has impossible constraints
src/roaring_benchmarks.c: In function 'main':
src/roaring_benchmarks.c:130:5: error: PIC register clobbered by '%rbx' in 'asm'
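
One way to sidestep the register constraints, sketched here as an assumption rather than as what RDTSC_START does: on x86 targets, GCC and Clang provide the __rdtsc() intrinsic in <x86intrin.h>, which names no registers explicitly and so should build with or without -m64. The helpers read_cycles and cycles_between are hypothetical, the clock() fallback for other architectures is only there to keep the sketch compilable, and unlike a cpuid-based sequence the intrinsic does not serialize the instruction stream.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time-stamp counter through the compiler intrinsic. */
static inline uint64_t read_cycles(void) {
    return (uint64_t)__rdtsc();
}
#else
#include <time.h>
/* Fallback for non-x86 targets: processor time in clock ticks, not cycles. */
static inline uint64_t read_cycles(void) {
    return (uint64_t)clock();
}
#endif

/* Difference between two readings, guarding against a counter that went backwards. */
static inline uint64_t cycles_between(uint64_t start, uint64_t end) {
    assert(end >= start);
    return end - start;
}
```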

It is not part of the C standard, for sure. But we need some way to track memory usage and I don't know a better way.

I have had good luck with dmalloc from http://dmalloc.com/. Granted, it really slows things down, but it does a lot more than just "statistics". If you want, I could remove -DRECORD_MALLOCS and try things with the dmalloc library.

It is possible that libc_malloc is not, actually, the libc malloc we expect.

My ldd output

# ldd malloced_roaring_benchmarks
        libdl.so.1 =>    /lib/64/libdl.so.1
        libc.so.1 =>     /lib/64/libc.so.1
        libm.so.2 =>     /lib/64/libm.so.2

lemire commented 6 years ago

I have had good luck with dmalloc from http://dmalloc.com/. Granted, it really slows things down, but it does a lot more than just "statistics". If you want, I could remove -DRECORD_MALLOCS and try things with the dmalloc library.

Yes! Pull Request invited!