nmslib / hnswlib

Header-only C++/python library for fast approximate nearest neighbors
https://github.com/nmslib/hnswlib
Apache License 2.0

double free bug in init_index #467

Open D4rkD0g opened 1 year ago

D4rkD0g commented 1 year ago

Hi, hnswlib crashes when initializing an index if the M parameter is too large:

import hnswlib

h = hnswlib.Index(space='l2', dim=1)
h.init_index(max_elements=1, ef_construction=200, M=2305843009213693951)
h.add_items([1], -1)
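
A note on the value used for M above: 2305843009213693951 is 2**61 - 1. The sketch below is only an illustration of why a value of that magnitude is dangerous if a byte size derived from M is computed in 64-bit unsigned arithmetic. The function `wrapped_bytes` and its parameters `entry_size`, `header` and `payload` are hypothetical and are not taken from the hnswlib source; the thread itself does not confirm the root cause.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned (size_t-style) wraparound

M = 2305843009213693951
assert M == 2**61 - 1  # the reproducer's M is exactly 2**61 - 1


def wrapped_bytes(m, entry_size=4, header=4, payload=12):
    """Hypothetical per-element byte size: 2*m link slots plus fixed
    overhead, reduced modulo 2**64 the way unsigned C++ arithmetic would.
    All sizes here are assumptions chosen for illustration."""
    return (2 * m * entry_size + header + payload) & MASK64


small = wrapped_bytes(16)  # an ordinary M
huge = wrapped_bytes(M)    # the M from the reproducer

# Under these assumptions the "huge" request wraps around and ends up
# smaller than the request for M=16.
print(small, huge)
```

If an allocation were sized from a wrapped value like this, later writes sized from the real M would run past the buffer, which would be consistent with the heap corruption reported in the backtrace.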

The backtrace:

Starting program: /usr/bin/python3 df.py
[*] Failed to find objfile or not a valid file format: [Errno 2] No such file or directory: 'system-supplied DSO at 0x7ffff7fc1000'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[*] Failed to find objfile or not a valid file format: [Errno 2] No such file or directory: '.gnu_debugdata for /usr/local/lib/python3.10/dist-packages/numpy/core/../../numpy.libs/libgfortran-040039e1.so.5.0.0'
[New Thread 0x7ffff3bff640 (LWP 1541)]
[New Thread 0x7ffff33fe640 (LWP 1542)]
[New Thread 0x7ffff0bfd640 (LWP 1543)]
[New Thread 0x7fffee3fc640 (LWP 1544)]
[New Thread 0x7fffe9bfb640 (LWP 1545)]
[New Thread 0x7fffe73fa640 (LWP 1546)]
[New Thread 0x7fffe4bf9640 (LWP 1547)]
double free or corruption (out)

Thread 1 "python3" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7ffff7c4f1c0) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
[ Legend: Modified register | Code | Heap | Stack | String ]
───────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$rax   : 0x0
$rbx   : 0x007ffff7c4f1c0  →  0x007ffff7c4f1c0  →  [loop detected]
$rcx   : 0x007ffff7ce6a7c  →  <pthread_kill+300> mov r13d, eax
$rdx   : 0x6
$rsp   : 0x007fffffffd4e0  →  0x007fffffffd680  →  0x007ffff757f680  →  0x00000000000012fd
$rbp   : 0x602
$rsi   : 0x602
$rdi   : 0x602
$rip   : 0x007ffff7ce6a7c  →  <pthread_kill+300> mov r13d, eax
$r8    : 0x007fffffffd5b0  →  0x00000000000020 (" "?)
$r9    : 0x0
$r10   : 0x8
$r11   : 0x246
$r12   : 0x6
$r13   : 0x16
$r14   : 0x1
$r15   : 0x1
$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x33 $ss: 0x2b $ds: 0x00 $es: 0x00 $fs: 0x00 $gs: 0x00
───────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x007fffffffd4e0│+0x0000: 0x007fffffffd680  →  0x007ffff757f680  →  0x00000000000012fd   ← $rsp
0x007fffffffd4e8│+0x0008: 0x007ffff6782f80  →  0x0000000000000001
0x007fffffffd4f0│+0x0010: 0x00555555acab00  →  0x0000000000002e ("."?)
0x007fffffffd4f8│+0x0018: 0x007ffff6782f80  →  0x0000000000000001
0x007fffffffd500│+0x0020: 0x007ffff73ebfd0  →  0x0000000000000000
0x007fffffffd508│+0x0028: 0x00555555acc240  →  0x00000000000026 ("&"?)
0x007fffffffd510│+0x0030: 0x00555555b403c0  →  0x0000000000000000
0x007fffffffd518│+0x0038: 0x0055555568ad45  →  <PyIter_Next+21> mov r12, rax
─────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
   0x7ffff7ce6a73 <pthread_kill+291> mov    edi, eax
   0x7ffff7ce6a75 <pthread_kill+293> mov    eax, 0xea
   0x7ffff7ce6a7a <pthread_kill+298> syscall
 → 0x7ffff7ce6a7c <pthread_kill+300> mov    r13d, eax
   0x7ffff7ce6a7f <pthread_kill+303> neg    r13d
   0x7ffff7ce6a82 <pthread_kill+306> cmp    eax, 0xfffff000
   0x7ffff7ce6a87 <pthread_kill+311> mov    eax, 0x0
   0x7ffff7ce6a8c <pthread_kill+316> cmovbe r13d, eax
   0x7ffff7ce6a90 <pthread_kill+320> jmp    0x7ffff7ce6a02 <__GI___pthread_kill+178>
[!] Command 'context' failed to execute properly, reason: 'threads'
gef➤  bt
#0  __pthread_kill_implementation (no_tid=0x0, signo=0x6, threadid=0x7ffff7c4f1c0) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=0x6, threadid=0x7ffff7c4f1c0) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=0x7ffff7c4f1c0, signo=signo@entry=0x6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c92476 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c787f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7cd96f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7e2bb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6  0x00007ffff7cf0d7c in malloc_printerr (str=str@entry=0x7ffff7e2e7b0 "double free or corruption (out)") at ./malloc/malloc.c:5664
#7  0x00007ffff7cf2ef0 in _int_free (av=0x7ffff7e69c80 <main_arena>, p=0x555555c10410, have_lock=<optimized out>) at ./malloc/malloc.c:4588
#8  0x00007ffff7cf54d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
#9  0x00007ffff7218ced in hnswlib::HierarchicalNSW<float>::~HierarchicalNSW (this=0x555555bce8b0, __in_chrg=<optimized out>) at ./hnswlib/hnswalg.h:141
#10 0x00007ffff7219088 in hnswlib::HierarchicalNSW<float>::~HierarchicalNSW (this=0x555555bce8b0, __in_chrg=<optimized out>) at ./hnswlib/hnswalg.h:140
#11 Index<float, float>::~Index (this=0x555555c102f0, __in_chrg=<optimized out>) at ./python_bindings/bindings.cpp:188
#12 std::default_delete<Index<float, float> >::operator() (__ptr=0x555555c102f0, this=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:85
#13 std::unique_ptr<Index<float, float>, std::default_delete<Index<float, float> > >::~unique_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:361
#14 pybind11::class_<Index<float, float>>::dealloc(pybind11::detail::value_and_holder&) (v_h=...) at /tmp/pip-build-env-e97etj8q/overlay/local/lib/python3.10/dist-packages/pybind11/include/pybind11/pybind11.h:1872
#15 0x00007ffff7232ded in pybind11::detail::clear_instance (self=0x7ffff727da30) at /tmp/pip-build-env-e97etj8q/overlay/local/lib/python3.10/dist-packages/pybind11/include/pybind11/detail/class.h:424
#16 0x00007ffff7233df5 in pybind11::detail::pybind11_object_dealloc (self=0x7ffff727da30) at /tmp/pip-build-env-e97etj8q/overlay/local/lib/python3.10/dist-packages/pybind11/include/pybind11/detail/class.h:457
#17 0x000055555568c8e1 in ?? ()
#18 0x000055555568c6dc in ?? ()
#19 0x00005555557bd1f4 in ?? ()
#20 0x000055555567e40f in ?? ()
#21 0x00005555557bcd06 in ?? ()
#22 0x00005555557b94f6 in Py_FinalizeEx ()
#23 0x00005555557a9193 in Py_RunMain ()
#24 0x000055555577f32d in Py_BytesMain ()
#25 0x00007ffff7c79d90 in __libc_start_call_main (main=main@entry=0x55555577f2f0, argc=argc@entry=0x2, argv=argv@entry=0x7fffffffdeb8) at ../sysdeps/nptl/libc_start_call_main.h:58
#26 0x00007ffff7c79e40 in __libc_start_main_impl (main=0x55555577f2f0, argc=0x2, argv=0x7fffffffdeb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdea8) at ../csu/libc-start.c:392
#27 0x000055555577f225 in _start ()

yurymalkov commented 1 year ago

Hi @D4rkD0g, thanks for reporting. I guess we can cap M at, say, 100k to avoid that.
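
Until such a cap exists in the library, callers can apply one themselves before calling init_index. The following is a minimal sketch, assuming the 100k figure floated in this comment; `MAX_M` and `validate_m` are hypothetical names and not part of the hnswlib API.

```python
MAX_M = 100_000  # cap floated in this thread; not an official hnswlib limit


def validate_m(M, max_m=MAX_M):
    """Return M unchanged if it is an int in [1, max_m], else raise ValueError.

    Hypothetical caller-side guard; hnswlib itself is not involved here.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(M, bool) or not isinstance(M, int):
        raise ValueError(f"M must be an int, got {type(M).__name__}")
    if not 1 <= M <= max_m:
        raise ValueError(f"M must be between 1 and {max_m}, got {M}")
    return M


# Intended use with the reproducer from this issue (requires hnswlib):
#   h = hnswlib.Index(space='l2', dim=1)
#   h.init_index(max_elements=1, ef_construction=200, M=validate_m(M))
```

With this guard the reproducer's M is rejected with a ValueError in Python before any native code runs.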

carnil commented 1 year ago

It appears that this issue has been assigned CVE-2023-37365.

superlazyname commented 1 month ago

Hello, has this fix made it into a release yet? Some vulnerability scanners, e.g. https://security.snyk.io/vuln/SNYK-PYTHON-HNSWLIB-5750284, are still flagging the package as vulnerable.