sobelio / llm-chain

`llm-chain` is a powerful Rust crate for building chains in large language models, allowing you to summarise text and complete complex tasks.
https://llm-chain.xyz

Updated llama.cpp, don't segfault on missing file, and fixed OutputStream hanging forever #182

Closed. andychenbruce closed this 1 year ago.

andychenbruce commented 1 year ago

Updated llama.cpp (twice); the only changes needed were adding some fields to ContextParams.

llama_init_from_file returns a null pointer if it fails, including when the model file is not found, but the Rust code silently ignored this and only segfaulted later, once it started to run the model. It now checks for the null pointer and returns a Result instead.
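A minimal sketch of the idea, assuming a simplified FFI surface; the `ContextParams` fields and `LoadError` type here are illustrative stand-ins, not the crate's actual bindings:

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Opaque handle to the llama.cpp context.
#[repr(C)]
pub struct LlamaContext {
    _private: [u8; 0],
}

// Simplified stand-in for the real ContextParams struct.
#[repr(C)]
pub struct ContextParams {
    pub n_ctx: i32,
    pub seed: i32,
}

extern "C" {
    // Returns a null pointer on failure, e.g. when the model file is missing.
    fn llama_init_from_file(path: *const c_char, params: ContextParams) -> *mut LlamaContext;
}

#[derive(Debug)]
pub enum LoadError {
    InvalidPath,
    InitFailed(String),
}

/// Safe wrapper: surface a null pointer as an error instead of
/// letting later code dereference it and segfault.
pub fn load_model(path: &str, params: ContextParams) -> Result<*mut LlamaContext, LoadError> {
    let c_path = CString::new(path).map_err(|_| LoadError::InvalidPath)?;
    // SAFETY: c_path outlives the call; params is passed by value.
    let ctx = unsafe { llama_init_from_file(c_path.as_ptr(), params) };
    if ctx.is_null() {
        Err(LoadError::InitFailed(format!(
            "llama_init_from_file returned null for {path}"
        )))
    } else {
        Ok(ctx)
    }
}
```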

When I used llama.cpp it would sometimes freeze the whole thread when the LLM produced a lot of text. The cause: the mpsc channel was created with a capacity of 100 items, but when the LLM is heavily accelerated (e.g. by CUBLAS) it produces tokens faster than they are consumed, fills the buffer, and then blocks forever on the `.send()` call inside the library (not in user code). I replaced `mpsc::channel(100)` with `mpsc::unbounded_channel()` so it can buffer without limit; if that is a performance concern, an alternative is to raise the bound from 100 to 1000 or so. See the sketch below.
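A minimal sketch of the before/after, assuming the tokio mpsc API; `Token` is a placeholder for whatever the stream actually yields:

```rust
use tokio::sync::mpsc;

type Token = String;

// Before: a bounded channel. Once 100 tokens are buffered,
// `send(...).await` suspends the producer until the receiver drains
// the buffer; if nothing is draining it, the stream stalls forever.
#[allow(dead_code)]
fn bounded_stream() -> (mpsc::Sender<Token>, mpsc::Receiver<Token>) {
    mpsc::channel(100)
}

// After: an unbounded channel. `send` never blocks, at the cost of
// an unbounded buffer if the consumer falls behind.
fn unbounded_stream() -> (mpsc::UnboundedSender<Token>, mpsc::UnboundedReceiver<Token>) {
    mpsc::unbounded_channel()
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = unbounded_stream();
    // A fast producer (e.g. a CUBLAS-accelerated model) can now push
    // tokens without ever waiting on the consumer.
    tokio::spawn(async move {
        for i in 0..10_000 {
            // `send` on an unbounded channel is synchronous and only
            // fails if the receiver has been dropped.
            if tx.send(format!("token {i}")).is_err() {
                break;
            }
        }
    });
    while let Some(tok) = rx.recv().await {
        let _ = tok; // process the token
    }
}
```

The trade-off is memory: an unbounded channel can grow without limit if the consumer never catches up, which is why a larger bounded capacity is the more conservative middle ground.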

So far it works fine running 33B Vicuna, at least on my end.

Juzov commented 1 year ago

LGTM, I think we can start like this