RediSearch / RediSearch

A query and indexing engine for Redis, providing secondary indexing, full-text search, vector similarity search and aggregations.
https://redis.io/docs/stack/search/

Segmentation fault when adding a document #1362

Open kefahi opened 4 years ago

kefahi commented 4 years ago

System information and software versions

OS Alpine linux 3.12 (official docker)
Arch x86_64
Redis versions 5.0.9 and 6.0.5 (both show the same problem)
Redisearch tags v1.6.13, v1.8.1 (both show the same problem)

Steps to reproduce

  1. Compile inside alpine docker
    git clone --recursive https://github.com/RediSearch/RediSearch.git
    cd RediSearch
    git checkout tags/v1.8.1
    make build

The build completes without issues (only a couple of minor, apparently unrelated warnings).

  2. Run

    valgrind -v redis-server --loadmodule ./src/redisearch.so
  3. Create an index and add a document

    redis-cli FT.CREATE myIdx SCHEMA id TEXT NOSTEM subpath TEXT NOSTEM SORTABLE shortname TEXT NOSTEM SORTABLE
    redis-cli FT.ADD myIdx one 1.0 LANGUAGE "english" FIELDS id "one" subpath "/posts/"  shortname "new start"

    The FT.ADD command causes Redis to crash and dump core, with the following Valgrind output:

==2485== 
==2485== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==2485==  General Protection Fault
==2485==    at 0x4D312FF: ForwardIndex_HandleToken (forward_index.c:179)
==2485==    by 0x4D3173B: forwardIndexTokenFunc (forward_index.c:230)
==2485==    by 0x4D27F7A: fulltextPreprocessor (document.c:412)
==2485==    by 0x4D28FF1: Document_AddToIndexes (document.c:576)
==2485==    by 0x4D291EC: AddDocumentCtx_Submit (document.c:320)
==2485==    by 0x4D29DEA: RS_AddDocument (document_add.c:202)
==2485==    by 0x4D2A0F3: doAddDocument.part.0 (document_add.c:264)
==2485==    by 0x197D7F: RedisModuleCommandDispatcher (in /usr/bin/redis-server)
==2485==    by 0x13FD48: call (in /usr/bin/redis-server)
==2485==    by 0x1406DC: processCommand (in /usr/bin/redis-server)
==2485==    by 0x149E97: processCommandAndResetClient (in /usr/bin/redis-server)
==2485==    by 0x14D6C3: processInputBuffer (in /usr/bin/redis-server)
==2485== 
==2485== HEAP SUMMARY:
==2485==     in use at exit: 640,552 bytes in 13,147 blocks
==2485==   total heap usage: 20,933 allocs, 7,786 frees, 8,848,955 bytes allocated
==2485== 
==2485== Searching for pointers to 13,147 not-freed blocks
==2485== Checked 13,368,184 bytes
==2485== 
==2485== LEAK SUMMARY:
==2485==    definitely lost: 1,459 bytes in 49 blocks
==2485==    indirectly lost: 256 bytes in 9 blocks
==2485==      possibly lost: 50,100 bytes in 673 blocks
==2485==    still reachable: 588,737 bytes in 12,416 blocks
==2485==                       of which reachable via heuristic:
==2485==                         length64           : 548,336 bytes in 11,928 blocks
==2485==         suppressed: 0 bytes in 0 blocks
==2485== Rerun with --leak-check=full to see details of leaked memory
==2485== 
==2485== ERROR SUMMARY: 180 errors from 2 contexts (suppressed: 0 from 0)
==2485== 
==2485== 36 errors in context 1 of 2:
==2485== Syscall param write(buf) points to unaddressable byte(s)
==2485==    at 0x4053923: ??? (in /lib/ld-musl-x86_64.so.1)
==2485==    by 0x4050CF3: ??? (in /lib/ld-musl-x86_64.so.1)
==2485==    by 0x4094D8B: ???
==2485==  Address 0x4a7e2ec is 0 bytes after a block of size 28 alloc'd
==2485==    at 0x489F72A: malloc (vg_replace_malloc.c:309)
==2485==    by 0x144FB1: zmalloc (in /usr/bin/redis-server)
==2485==    by 0x14C74D: setDeferredAggregateLen (in /usr/bin/redis-server)
==2485==    by 0x140A29: commandCommand (in /usr/bin/redis-server)
==2485==    by 0x13FD48: call (in /usr/bin/redis-server)
==2485==    by 0x1406DC: processCommand (in /usr/bin/redis-server)
==2485==    by 0x149E97: processCommandAndResetClient (in /usr/bin/redis-server)
==2485==    by 0x14D6C3: processInputBuffer (in /usr/bin/redis-server)
==2485==    by 0x1AC14D: ??? (in /usr/bin/redis-server)
==2485==    by 0x1AC58C: ??? (in /usr/bin/redis-server)
==2485==    by 0x13AAD3: aeProcessEvents (in /usr/bin/redis-server)
==2485==    by 0x13AD44: aeMain (in /usr/bin/redis-server)
==2485== 
==2485== 
==2485== 144 errors in context 2 of 2:
==2485== Invalid write of size 1
==2485==    at 0x14B12A: _addReplyProtoToList (in /usr/bin/redis-server)
==2485==    by 0x14B267: addReply (in /usr/bin/redis-server)
==2485==    by 0x140778: addReplyCommand (in /usr/bin/redis-server)
==2485==    by 0x140A29: commandCommand (in /usr/bin/redis-server)
==2485==    by 0x13FD48: call (in /usr/bin/redis-server)
==2485==    by 0x1406DC: processCommand (in /usr/bin/redis-server)
==2485==    by 0x149E97: processCommandAndResetClient (in /usr/bin/redis-server)
==2485==    by 0x14D6C3: processInputBuffer (in /usr/bin/redis-server)
==2485==    by 0x1AC14D: ??? (in /usr/bin/redis-server)
==2485==    by 0x1AC58C: ??? (in /usr/bin/redis-server)
==2485==    by 0x13AAD3: aeProcessEvents (in /usr/bin/redis-server)
==2485==    by 0x13AD44: aeMain (in /usr/bin/redis-server)
==2485==  Address 0x4a7e2ec is 0 bytes after a block of size 28 alloc'd
==2485==    at 0x489F72A: malloc (vg_replace_malloc.c:309)
==2485==    by 0x144FB1: zmalloc (in /usr/bin/redis-server)
==2485==    by 0x14C74D: setDeferredAggregateLen (in /usr/bin/redis-server)
==2485==    by 0x140A29: commandCommand (in /usr/bin/redis-server)
==2485==    by 0x13FD48: call (in /usr/bin/redis-server)
==2485==    by 0x1406DC: processCommand (in /usr/bin/redis-server)
==2485==    by 0x149E97: processCommandAndResetClient (in /usr/bin/redis-server)
==2485==    by 0x14D6C3: processInputBuffer (in /usr/bin/redis-server)
==2485==    by 0x1AC14D: ??? (in /usr/bin/redis-server)
==2485==    by 0x1AC58C: ??? (in /usr/bin/redis-server)
==2485==    by 0x13AAD3: aeProcessEvents (in /usr/bin/redis-server)
==2485==    by 0x13AD44: aeMain (in /usr/bin/redis-server)
==2485== 
==2485== ERROR SUMMARY: 180 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault

kefahi commented 4 years ago

I figured out the root cause, but not the solution yet.

It turns out that the default compilation options on Alpine force Redis to use the libc malloc.

$ redis-server -v
Redis server v=5.0.9 sha=869dcbdc:0 malloc=libc bits=64 build=5e0aa57c0bc626b1

Whereas if I compile Redis by hand without setting that option, it uses jemalloc:

$ redis-server -v
Redis server v=6.0.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=770a2d00214189a6

I compile RediSearch without any malloc configuration, so it defaults to jemalloc. When I run RediSearch against the jemalloc-compiled Redis, everything runs without any issues.

So the problem is specific to the case where the malloc libraries do not match.

How can I direct RediSearch to compile with the libc malloc instead? I haven't figured that out yet.

Until that is solved (or I at least get a hint on how to change the malloc used in RediSearch), the module will not work with a Redis build that uses a different malloc library.
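To compare builds quickly, the allocator can be read from the `malloc=` field of the version line. This is only a small sketch: `redis_malloc_of` is a hypothetical helper name, and it assumes the version line has the same layout as the two samples quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Redis or RediSearch): print the value of
# the malloc= field from a `redis-server -v` version line passed as $1.
redis_malloc_of() {
    printf '%s\n' "$1" | sed -n 's/.*malloc=\([^ ]*\).*/\1/p'
}

# Sample lines copied from the two builds quoted in this comment.
redis_malloc_of 'Redis server v=5.0.9 sha=869dcbdc:0 malloc=libc bits=64 build=5e0aa57c0bc626b1'
redis_malloc_of 'Redis server v=6.0.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=770a2d00214189a6'
```

On a live system the same helper could be called as `redis_malloc_of "$(redis-server -v)"`.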

ashtul commented 4 years ago

@kefahi please define REDIS_MODULE_TARGET to enable the code in /src/rmalloc.h.

kefahi commented 4 years ago

@ashtul

Sorry, I concluded too soon.

It turns out that this option is already enabled by default in CMakeLists.txt:

OPTION(USE_REDIS_ALLOCATOR "Use redis allocator" ON)

which, in the same file, controls:

IF(USE_REDIS_ALLOCATOR)
    ADD_DEFINITIONS(-DREDIS_MODULE_TARGET)
ENDIF ()

I verified this by inspecting the actual compiler invocations:

make build VERBOSE=1

...
cd /home/kefah/RediSearch/build/src/rmutil && /usr/bin/cc -DREDISMODULE_EXPERIMENTAL_API -DREDIS_MODULE_TARGET -DRS_GIT_SHA=\"45750997726c28a87022b046cfb207ddd28ad7b2\" -DRS_GIT_VERSION=\"v1.6.6-79-g45750997\" -D_GNU_SOURCE -I/home/kefah/Downloads/RediSearch/src  -Wall -Wno-unused-function -Wno-unused-variable -Wno-sign-compare -fPIC -Werror=implicit-function-declaration -pthread -fno-strict-aliasing -Werror=incompatible-pointer-types -std=gnu99 -O2 -g    -UNDEBUG -o CMakeFiles/test_periodic.dir/test_periodic.c.o   -c /home/kefah/Downloads/RediSearch/src/rmutil/test_periodic.c
...

So the -DREDIS_MODULE_TARGET flag is being used by default.
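Rather than reading the verbose log by eye, the check can be scripted. A minimal sketch, assuming the log is piped in on stdin; `count_module_target` is a hypothetical helper name, and the two sample lines are shortened stand-ins for real compiler invocations.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that pass
# -DREDIS_MODULE_TARGET to the compiler. grep -c prints 0 and exits
# non-zero when nothing matches, so `|| true` keeps the exit status clean.
count_module_target() {
    grep -c -e '-DREDIS_MODULE_TARGET' || true
}

# Shortened sample input: only the first line carries the define.
printf '%s\n' \
    'cc -DREDISMODULE_EXPERIMENTAL_API -DREDIS_MODULE_TARGET -c a.c' \
    'cc -O2 -c b.c' | count_module_target
```

Against a real build this would be run as `make build VERBOSE=1 2>&1 | count_module_target`; a non-zero count means the define reached the compiler.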

ashtul commented 4 years ago

@kefahi Can you try compiling Redis with USE_JEMALLOC=no?

ashtul commented 3 years ago

@kefahi any news?