marcom / Llama.jl

Julia interface to llama.cpp, a C/C++ library for running language models
MIT License

`run_chat` cannot be interrupted with CTRL+C on MacOS #7

Open svilupp opened 9 months ago

svilupp commented 9 months ago

Expectation: When I run run_chat, I'd like to terminate the interactive session with CTRL+C (as per the llama.cpp manual).

Problem: When I press CTRL+C, the interrupt control sequence gets consumed by the REPL and is not forwarded to the process. I.e., I cannot stop the session and have to restart the REPL.

MWE

using Llama

model = "/Users/simljx/Documents/llama.cpp/models/rocket-3b-2.76bpw.gguf"
Llama.run_chat(; model, prompt="Say hi!", nthreads=1)
# press CTRL+C to terminate

Versions

julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 8 on 6 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

marcom commented 9 months ago

Disabling Julia's SIGINT handler (SIGINT is the signal the shell sends when you press Ctrl-C) seems to do the trick for me:

disable_sigint() do
    run_chat(model="./models/rocket-3b-2.76bpw.gguf", prompt="Say hi!")
end

First Ctrl-C interrupts llama.cpp, second Ctrl-C then returns to the julia REPL.

This should probably be added to both run_chat and run_llama.
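A minimal sketch of what such a wrapper could look like, assuming the package's existing `run_chat` entry point is called unchanged inside it (the wrapper name and keyword handling here are illustrative, not the actual Llama.jl implementation):

```julia
# Hypothetical wrapper: with Julia's SIGINT handler disabled, the first
# Ctrl-C is delivered to the foreground llama.cpp process instead of
# raising an InterruptException in the REPL.
function run_chat_interruptible(; model, prompt = "", nthreads = 1, kwargs...)
    disable_sigint() do
        # `run_chat` stands in for the package's existing entry point.
        run_chat(; model, prompt, nthreads, kwargs...)
    end
end
```

The same `disable_sigint() do ... end` wrapping would apply to `run_llama`.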

marcom commented 9 months ago

#11 fixed this on Linux, but macOS seems to still have this problem.

@svilupp Can you try with and without GPU on macOS on the current main branch? And perhaps also with 1 CPU thread vs multi-threaded?