Closed Kekec852 closed 1 year ago
Hello, it's me again. I have found the problem: my CPU is missing the AVX2 and FMA features, so it will not run unless you change line 24 of cuda/ggml.Dockerfile to:
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on -DLLAMA_AVX2=off -DLLAMA_FMA=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.78
With that change it runs. Maybe this will be helpful to somebody else with old hardware. I got 1 token/s on CPU and 1.8 tokens/s on GPU.
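If you want to check whether your own CPU is affected before editing the Dockerfile, here is a minimal sketch (Linux-only, not part of the repo) that reads /proc/cpuinfo and reports which of the AVX2/FMA flags the default llama-cpp-python build assumes are missing:

```python
# Minimal sketch (Linux-only): report which of the AVX2/FMA flags,
# assumed by llama-cpp-python's default build, the CPU does not have.
# `missing_cpu_flags` is a hypothetical helper, not part of any library.
def missing_cpu_flags(required=("avx2", "fma"), cpuinfo_path="/proc/cpuinfo"):
    """Return the subset of `required` flags the CPU does not report."""
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                flags = set(line.split(":", 1)[1].split())
                return [r for r in required if r not in flags]
    return list(required)  # no flags line found; assume nothing is supported

if __name__ == "__main__":
    missing = missing_cpu_flags()
    if missing:
        print("Disable these in CMAKE_ARGS:", ", ".join(missing))
    else:
        print("CPU supports AVX2 and FMA; the default build should work.")
```

Each flag it prints maps to a `-DLLAMA_<FLAG>=off` CMake option, as in the Dockerfile line above.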
Hello, I'm trying to get models working with the GPU, without success (I have managed to get 7b and 13b working on CPU). The error I'm getting is very strange. Running:
sudo ./run.sh --model 13b --with-cuda
puts this in the kernel log:
traps: python3[91882] trap invalid opcode ip:7ff310d6d72d sp:7fff5bdc7390 error:0 in libllama.so[7ff310d50000+76000]
and in the container I get no usable output.
CPU I'm using:
GPU:
Everything is running in a VM on freshly installed Ubuntu 22.04.
I'm thinking that some component, like llama_cpp.server, is compiled in a way that is incompatible with some part of my setup, as suggested by the invalid opcode. I will update this issue while doing more debugging.