sobelio / llm-chain

`llm-chain` is a powerful Rust crate for building chains in large language models, allowing you to summarise text and complete complex tasks.
https://llm-chain.xyz
MIT License

Opt for num_gpu_layers #176

Closed andychenbruce closed 1 year ago

andychenbruce commented 1 year ago

I didn't see any way to set the number of layers offloaded to the GPU for llama. I don't know Rust, so this might be the wrong way to do things.

After enabling the cuda feature for llm-chain-llama-sys and then setting the NumGpuLayers option to any number above 0, CUDA acceleration works perfectly for me, and llama models run roughly 5x faster at 20 layers.
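For context, the feature-enabling part is just a Cargo feature flag forwarded down to the sys crate. A minimal sketch of what my manifest looks like, assuming the feature is named `cuda` and is exposed through `llm-chain-llama` (check the crates' actual `Cargo.toml` for the exact feature name and versions):

```toml
# Cargo.toml of the consuming project (hypothetical layout)
[dependencies]
llm-chain = "*"
# Forward the `cuda` feature so llm-chain-llama-sys builds the
# llama.cpp bindings with CUDA support compiled in.
llm-chain-llama = { version = "*", features = ["cuda"] }
```

With that in place, the NumGpuLayers value (20 in my case) is what actually controls how many layers get offloaded.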

It doesn't seem to break anything either: setting the option with CUDA disabled simply has no effect.