Closed: JFronny closed this issue 1 year ago.
This seems to be caused by recent changes in llama.cpp: I tried a build at https://github.com/ggerganov/llama.cpp/commit/feea179e9f9921e96e8fb1b8855d4a8f83682455 and that worked fine.
The previous version 1.1.2 was compatible with llama.cpp b1204. I just released 1.1.3, which is compatible with b1256 and hopefully solves your issue. I will also release version 2.0 of the binding very soon, which removes these compatibility problems.
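For reference, a minimal sketch of the dependency bump, assuming a Maven build (the coordinate is the one reported below, `de.kherud:llama`):

```xml
<!-- Bump the binding from 1.1.2 to 1.1.3, which tracks llama.cpp b1256 -->
<dependency>
    <groupId>de.kherud</groupId>
    <artifactId>llama</artifactId>
    <version>1.1.3</version>
</dependency>
```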
As you said, it seems to work now. Thanks!
The example from the README seems to segfault whenever it tries to tokenize input (my prompt is `How are you?`). I have not modified the code except for setting `NGpuLayers` to 30, due to it previously running out of VRAM, and changing the `modelPath` to my local path (file downloaded from TheBloke). I am using `de.kherud:llama:1.1.2`. The hs_err*.log is attached: hs_err_pid43263.log. llama.cpp was built with:
Trying the same prompt and model with the `server` binary built by llama.cpp works fine btw, so I don't think that is the issue.
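For context, a rough sketch of the setup described above. The class and method names here (`LlamaModel`, `ModelParameters`, `setNGpuLayers`, `generate`) are assumptions for illustration and may not match the 1.1.2 README exactly; only the changes mentioned above (30 GPU layers, local model path) are taken from the report.

```java
import de.kherud.llama.LlamaModel;
import de.kherud.llama.ModelParameters;

public class Example {
    public static void main(String... args) {
        // Assumed API names, for illustration only.
        // Offload 30 layers to the GPU; the default previously ran out of VRAM.
        ModelParameters params = new ModelParameters().setNGpuLayers(30);

        // Local GGUF file downloaded from TheBloke (path is a placeholder).
        String modelPath = "/path/to/model.gguf";

        try (LlamaModel model = new LlamaModel(modelPath, params)) {
            // The crash reportedly happens as soon as this prompt is tokenized.
            for (String token : model.generate("How are you?")) {
                System.out.print(token);
            }
        }
    }
}
```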