nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

SIGSEGV if Embed4All is called too many times #1583

Open Zambrella opened 12 months ago

Zambrella commented 12 months ago

Reproduction

Description

I've split my document into ~300 chunks of no more than 500 words each. I then loop over the chunks and embed each one with the default Embed4All() embedder. About 30 chunks in, Python crashes with a segmentation fault (which I understand is likely due to running out of memory). I can see my RAM usage increase throughout the process. It seems the C/C++ code running under the hood is holding onto memory even after an embedding is complete. I have run the exact same pipeline with OpenAI embeddings without error, so I don't think the resulting embeddings are what's taking up the memory. I also know it's not a specific chunk, because I've started the embedding run from different starting points.

I've tried using device="gpu" with no difference and I've also tried using a different embedder model.

Snippet

from gpt4all import Embed4All

def get_embedding(text: str) -> list[float]:
    # Note: a new Embed4All instance is constructed on every call.
    embedder = Embed4All(device="cpu", verbose=True)
    output = embedder.embed(text)
    return output

Error

18392 segmentation fault

Apple error report

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000011
Exception Codes:       0x0000000000000001, 0x0000000000000011

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [59316]

VM Region Info: 0x11 is not in any region.  Bytes before following region: 4368498671
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      104620000-104624000    [   16K] r-x/r-x SM=COW  .../MacOS/Python

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libbert-default.dylib                  0x127f0cf20 ggml_new_tensor_impl + 296
1   libbert-default.dylib                  0x127f0d244 ggml_new_tensor_1d + 36
2   libbert-default.dylib                  0x127ef1118 bert_eval(bert_ctx*, int, int const*, int, float*) + 288
3   libbert-default.dylib                  0x127ef27d8 bert_load_from_file(char const*) + 4008
4   libbert-default.dylib                  0x127ef2db8 Bert::loadModel(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 36
5   libllmodel.dylib                       0x127707904 llmodel_loadModel + 148
6   libffi.dylib                           0x19bd8f050 ffi_call_SYSV + 80
7   libffi.dylib                           0x19bd97adc ffi_call_int + 1208
8   _ctypes.cpython-311-darwin.so          0x1064c8874 _ctypes_callproc + 788
9   _ctypes.cpython-311-darwin.so          0x1064c34c4 PyCFuncPtr_call + 220
10  Python                                 0x104f93e8c _PyObject_MakeTpCall + 128
11  Python                                 0x105071e6c _PyEval_EvalFrameDefault + 42120
12  Python                                 0x105076554 _PyEval_Vector + 116
13  Python                                 0x104f941a8 _PyObject_FastCallDictTstate + 208
14  Python                                 0x104ffe0e8 slot_tp_init + 188
15  Python                                 0x104ff6598 type_call + 136
16  Python                                 0x104f94ca4 _PyObject_Call + 124
17  Python                                 0x105073cac _PyEval_EvalFrameDefault + 49864
18  Python                                 0x105076554 _PyEval_Vector + 116
19  Python                                 0x104f941a8 _PyObject_FastCallDictTstate + 208
20  Python                                 0x104ffe0e8 slot_tp_init + 188
21  Python                                 0x104ff6598 type_call + 136
22  Python                                 0x104f93e8c _PyObject_MakeTpCall + 128
23  Python                                 0x105071e6c _PyEval_EvalFrameDefault + 42120
24  Python                                 0x105066da0 PyEval_EvalCode + 168
25  Python                                 0x1050bd7a0 run_eval_code_obj + 84
26  Python                                 0x1050bd704 run_mod + 112
27  Python                                 0x1050bd544 pyrun_file + 148
28  Python                                 0x1050bcf94 _PyRun_SimpleFileObject + 268
29  Python                                 0x1050bc92c _PyRun_AnyFileObject + 216
30  Python                                 0x1050d950c pymain_run_file_obj + 220
31  Python                                 0x1050d8e4c pymain_run_file + 72
32  Python                                 0x1050d872c Py_RunMain + 704
33  Python                                 0x1050d9868 Py_BytesMain + 40
34  dyld                                   0x18af31058 start + 2224

Expected behavior

The embedder can run many times without crashing in a single program.
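
The crash report above shows the fault inside llmodel_loadModel, reached from an object constructor (slot_tp_init), and the snippet builds a new Embed4All on every call. Below is a minimal sketch of one thing to try, assuming (not confirmed in this thread) that the repeated model loads are what exhaust memory: construct the embedder once and reuse it for every chunk. `CachedEmbedder`, `chunk_words`, and the `factory` parameter are hypothetical names used for illustration. The only gpt4all calls are `Embed4All()` and `.embed()`, as in the snippet above.

```python
from typing import Callable, Optional, Protocol


class Embedder(Protocol):
    """Anything with an embed(text) method, e.g. gpt4all's Embed4All."""

    def embed(self, text: str) -> list[float]: ...


def _default_factory() -> Embedder:
    # Imported lazily so this module loads even without gpt4all installed.
    from gpt4all import Embed4All
    return Embed4All()


def chunk_words(text: str, max_words: int = 500) -> list[str]:
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class CachedEmbedder:
    """Hypothetical helper: builds the embedder on first use, then reuses it."""

    def __init__(self, factory: Callable[[], Embedder] = _default_factory) -> None:
        self._factory = factory
        self._embedder: Optional[Embedder] = None

    def get_embedding(self, text: str) -> list[float]:
        if self._embedder is None:
            # The model is loaded here once, not once per chunk.
            self._embedder = self._factory()
        return self._embedder.embed(text)
```

With gpt4all installed, `CachedEmbedder().get_embedding(chunk)` could be called inside a loop over `chunk_words(document)`. Whether this avoids the segfault has not been verified here; it only removes the per-call model load.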

Karan-IceApple commented 8 months ago

Hi, I'm also having an issue with the Embedding API: it consumes a lot of memory for each /embedding request. Did you find a solution? Kindly let me know.

Tachyon5 commented 7 months ago

I'm having the same issue. Adding a comment to follow this thread.