SIGSEGV if Embed4All is called too many times

Zambrella commented 12 months ago

System Info

Macbook Pro M1 16GB RAM
Python 3.11.6
gpt4all==2.0.1

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Reproduction

Description

I've split my document into ~300 chunks of no more than 500 words each. I then loop over the chunks and embed each one with the default Embed4All() embedder. About 30 chunks in, I get a Python segmentation fault (which I understand is likely due to running out of memory). I can see my RAM usage increase throughout this process. It seems to me that the C/C++ code running under the hood is holding onto the memory even after the embedding is complete? I have run the exact same code using OpenAI without error, so I don't think the resulting embeddings are what's taking up the memory. I also know it's not a specific chunk, as I've run the embeddings from different starting points.

I've tried using device="gpu" with no difference and I've also tried using a different embedder model.
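To put numbers on the RAM growth described above, one option is to log the process's peak resident set size after each chunk. The following is a minimal sketch using only the standard library's resource module (Unix-like systems only); peak_rss_mib is a hypothetical helper name, not part of gpt4all.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor
```

Printing peak_rss_mib() after each embedding call would show whether the peak keeps climbing from one chunk to the next.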

Snippet

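from gpt4all import Embed4All

def get_embedding(text: str) -> list[float]:
    # A new Embed4All instance (and model load) is created on every call.
    embedder = Embed4All(device="cpu", verbose=True)
    output = embedder.embed(text)
    return output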

Error

18392 segmentation fault

Apple error report

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000011
Exception Codes:       0x0000000000000001, 0x0000000000000011

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [59316]

VM Region Info: 0x11 is not in any region.  Bytes before following region: 4368498671
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      104620000-104624000    [   16K] r-x/r-x SM=COW  .../MacOS/Python

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libbert-default.dylib                  0x127f0cf20 ggml_new_tensor_impl + 296
1   libbert-default.dylib                  0x127f0d244 ggml_new_tensor_1d + 36
2   libbert-default.dylib                  0x127ef1118 bert_eval(bert_ctx*, int, int const*, int, float*) + 288
3   libbert-default.dylib                  0x127ef27d8 bert_load_from_file(char const*) + 4008
4   libbert-default.dylib                  0x127ef2db8 Bert::loadModel(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 36
5   libllmodel.dylib                       0x127707904 llmodel_loadModel + 148
6   libffi.dylib                           0x19bd8f050 ffi_call_SYSV + 80
7   libffi.dylib                           0x19bd97adc ffi_call_int + 1208
8   _ctypes.cpython-311-darwin.so          0x1064c8874 _ctypes_callproc + 788
9   _ctypes.cpython-311-darwin.so          0x1064c34c4 PyCFuncPtr_call + 220
10  Python                                 0x104f93e8c _PyObject_MakeTpCall + 128
11  Python                                 0x105071e6c _PyEval_EvalFrameDefault + 42120
12  Python                                 0x105076554 _PyEval_Vector + 116
13  Python                                 0x104f941a8 _PyObject_FastCallDictTstate + 208
14  Python                                 0x104ffe0e8 slot_tp_init + 188
15  Python                                 0x104ff6598 type_call + 136
16  Python                                 0x104f94ca4 _PyObject_Call + 124
17  Python                                 0x105073cac _PyEval_EvalFrameDefault + 49864
18  Python                                 0x105076554 _PyEval_Vector + 116
19  Python                                 0x104f941a8 _PyObject_FastCallDictTstate + 208
20  Python                                 0x104ffe0e8 slot_tp_init + 188
21  Python                                 0x104ff6598 type_call + 136
22  Python                                 0x104f93e8c _PyObject_MakeTpCall + 128
23  Python                                 0x105071e6c _PyEval_EvalFrameDefault + 42120
24  Python                                 0x105066da0 PyEval_EvalCode + 168
25  Python                                 0x1050bd7a0 run_eval_code_obj + 84
26  Python                                 0x1050bd704 run_mod + 112
27  Python                                 0x1050bd544 pyrun_file + 148
28  Python                                 0x1050bcf94 _PyRun_SimpleFileObject + 268
29  Python                                 0x1050bc92c _PyRun_AnyFileObject + 216
30  Python                                 0x1050d950c pymain_run_file_obj + 220
31  Python                                 0x1050d8e4c pymain_run_file + 72
32  Python                                 0x1050d872c Py_RunMain + 704
33  Python                                 0x1050d9868 Py_BytesMain + 40
34  dyld                                   0x18af31058 start + 2224

Expected behavior

The embedder can run many times without crashing in a single program.
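In the stack trace above, the crash happens inside llmodel_loadModel / bert_load_from_file, and the snippet constructs a new Embed4All on every call, so the model is loaded once per chunk. A possible workaround, sketched below, is to construct the embedder once and reuse it for all chunks. This is an untested assumption about the cause, not a confirmed fix; embed_chunks and its make_embedder parameter are hypothetical names introduced only for illustration.

```python
from typing import Callable, Iterable, Optional


def embed_chunks(
    chunks: Iterable[str],
    make_embedder: Optional[Callable[[], object]] = None,
) -> list:
    """Embed every chunk with a single, reused embedder instance."""
    if make_embedder is None:
        # Imported lazily so the helper can be exercised without gpt4all.
        from gpt4all import Embed4All
        make_embedder = Embed4All

    # Construct the embedder once, so the model is loaded only once.
    embedder = make_embedder()
    return [embedder.embed(chunk) for chunk in chunks]
```

With gpt4all installed, embed_chunks(chunks) would load the default model a single time and return one embedding per chunk.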

Karan-IceApple commented 8 months ago

Hi, I'm also having an issue with the Embedding API: it consumes a lot of memory for each /embedding request. Did you find a solution? Kindly let me know.

Tachyon5 commented 7 months ago

I'm having the same issue. Adding a comment to follow this thread.
