Open slavag opened 1 year ago
Hi, I read that you are stuck on the same error: ggml_new_tensor_impl: not enough space in the context's memory pool (needed 16781920, available 10485760) Fatal Python error: Segmentation fault
Please note that I am also using --model_path_llama=llama-2-7b-chat.ggmlv3.q4_1.bin. Nobody has replied to me either. Anyway, I have been doing several trial-and-error tests:
After the first question I save and clear that chat, then submit a new question; it does not crash with a segmentation fault and finally returns a second reply. I saved and cleared the second chat and submitted a third question, then saved and cleared the third and submitted a fourth. RAM usage was 6.94 GB of 16 GB.
I see your hardware flags: AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
Are you using ARM too? My hardware is an Orange Pi 5, an 8-core ARM CPU with 16 GB RAM: AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
NEON and ARM_FMA are instructions related to ARM CPU processors, and FP16_VA is an instruction related to the NPU. My Orange Pi 5 has a Rockchip RK3588S, which also includes an NPU with 6 TOPS of AI compute, but I'm still not sure h2oGPT is using it.
My post is: h2oGPT installed on Orange Pi 5 (16 GB RAM), operating system Armbian: Fatal Python error: Segmentation fault https://github.com/h2oai/h2ogpt/issues/742
@bluciano212 I'm using a Mac M1 Max (Apple silicon), so I'm using a Metal-compiled llama.cpp.
The actual error seems to be:
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 13717376, available 10485760)
Seems to be an issue in llama.cpp: https://github.com/ggerganov/llama.cpp/issues/52
The last suggestion there is that it is a bug in llama.cpp due to special characters in the prompt/text.
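The numbers in the error message can be read directly: the tensor allocation requests more bytes than the fixed-size scratch pool holds. A minimal illustrative sketch of that check (this is not llama.cpp's actual code, just the arithmetic the error reports):

```python
# Illustrative reproduction of the ggml memory-pool failure (not the real
# llama.cpp source): allocation fails when the requested tensor size
# exceeds what remains in the fixed-size context pool.
POOL_SIZE = 10_485_760  # "available" bytes from the error message (10 MiB)
NEEDED = 13_717_376     # bytes requested by ggml_new_tensor_impl

def can_allocate(needed: int, used: int, pool: int = POOL_SIZE) -> bool:
    """Return True if the pool still has room for `needed` more bytes."""
    return used + needed <= pool

print(can_allocate(NEEDED, used=0))  # False: 13717376 > 10485760
```

A larger prompt or context produces larger intermediate tensors, which is why reducing the context size (as discussed below in the thread) can avoid the crash.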
@pseudotensor Thanks, will monitor that llama.cpp issue.
It seems that special characters are not the issue; the size of the context is.
@slavag Ok, are you able to reduce that some?
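One way to "reduce that some" is to cap the prompt size before it reaches the model. A rough stdlib-only sketch (a real setup would count tokens with the model's tokenizer rather than characters; the character limit here is an assumption for illustration):

```python
def truncate_prompt(prompt: str, max_chars: int = 2048) -> str:
    """Crudely cap prompt length by characters.

    A tokenizer-based cut would count tokens instead; this character-based
    limit is only a rough approximation for illustration.
    """
    if len(prompt) <= max_chars:
        return prompt
    # Keep the tail of the prompt, since the most recent context
    # usually matters most for chat-style prompts.
    return prompt[-max_chars:]
```

The save-and-clear workaround described earlier in the thread has the same effect: each cleared chat resets the accumulated context to a small size.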
So if I use the GPT4All model ggml-gpt4all-j-v1.3-groovy.bin, there will not be any "ggml_new_tensor_impl: not enough space in the context's memory pool (needed 16781920, available 10485760) Fatal Python error: Segmentation fault" error, is that right? And what will happen to the existing db (db_dir_UserData) I have created until now with the llama.cpp model? Will I lose it, or will my existing db_dir_UserData be integrated with the GPT4All model?
I don't recommend GPT4All models; they are quite bad. But it's possible the new quantization from llama.cpp will help, i.e. GGUFv2.
None of the database stuff is affected when you change the LLM. You'll lose nothing and won't have to do anything if you try a different LLM.
new quantization from llama.cpp will help, i.e. GGUFv2
Thanks. I have searched on https://huggingface.co/TheBloke, but found nothing. Do you have any suggestion where I can download and try the new GGUFv2 quantization from llama.cpp?
In principle you can do:
pip uninstall -y llama_cpp_python_cuda llama_cpp_python
# linux:
pip install https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.1.83+cu118-cp310-cp310-linux_x86_64.whl
# windows:
pip install https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases/download/textgen-webui/llama_cpp_python_cuda-0.1.83+cu118-cp310-cp310-win_amd64.whl
or some similar wheel from: https://github.com/jllllll/llama-cpp-python-cuBLAS-wheels/releases
This is instead of the current 0.1.73 version.
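After installing, it's worth confirming the new wheel actually replaced 0.1.73 (e.g. by checking `llama_cpp.__version__`). A small stdlib helper for comparing the reported version string; the 0.1.83 threshold is taken from the wheel filenames above, and the `+cu118` local-version suffix matches those wheels:

```python
def parse_version(v: str) -> tuple:
    """Parse a version string like '0.1.83+cu118' into a comparable tuple.

    The local-version suffix ('+cu118') is stripped before comparing.
    """
    core = v.split("+")[0]
    return tuple(int(part) for part in core.split("."))

def supports_gguf(installed: str, minimum: str = "0.1.83") -> bool:
    """True if the installed llama-cpp-python is at least the version the
    wheels above provide (0.1.83, per their filenames)."""
    return parse_version(installed) >= parse_version(minimum)

print(supports_gguf("0.1.73"))        # False: still the old wheel
print(supports_gguf("0.1.83+cu118"))  # True
```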
Then just use some GGUF model from TheBloke.
GGUF
Ok, thanks! But GGUF is just for GPU; it's not possible to use it on CPU.
Hi, I just ran a small prompt:
how can I list all EC2 instances in specific region using AWS CLI ?
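For reference, the prompt itself has a standard AWS CLI answer. This is only a sketch: "us-east-1" is a placeholder region, and actually executing the command requires configured AWS credentials, so the snippet just builds and prints the command:

```shell
# Sketch only: "us-east-1" is a placeholder region; running the command for
# real requires configured AWS credentials, so here we only print it.
EC2_LIST_CMD="aws ec2 describe-instances --region us-east-1 \
  --query 'Reservations[].Instances[].[InstanceId,State.Name]' --output table"
echo "$EC2_LIST_CMD"
```

The `--query` JMESPath expression trims the output down to instance IDs and states; `--output table` renders it as a readable table.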
And the entire process failed (it was working a few weeks ago with the same db). Startup log:
Please advise. Thanks