sobelio / llm-chain

`llm-chain` is a powerful Rust crate for building chains in large language models, allowing you to summarise text and complete complex tasks.
https://llm-chain.xyz
MIT License

llama: update llama.cpp to latest version #244

Closed · danbev closed this 8 months ago

danbev commented 9 months ago

This commit updates llama.cpp to a more recent version.

The motivation is that the current version of llama.cpp is somewhat outdated, and there have been changes to both the llama.cpp API and the model format. In particular, it is currently not possible to use the new GGUF format, and since many of the available models are now published in GGUF, this can make the crate challenging to use at the moment.

The following changes have been made:


This is a work in progress, but I wanted to open a draft pull request sooner rather than later to get some visibility and feedback.

So far I've been able to run the simple, few_shot, and stream examples successfully. ~~The map_reduce_llama example is not working as of this writing, which I'll look into further.~~

williamhogman commented 9 months ago

<3

Juzov commented 9 months ago

There's a clause that ignores MaxTokens if MaxTokens == 0 (or rather the reverse), so setting MaxTokens equal to MaxContextSize in the examples is redundant. If you want, and if it's possible, you could change the option to default to the size of the context window and remove the clause.
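
For illustration, here is a minimal Rust sketch of the kind of fallback clause described above; the function and parameter names are assumptions for this example, not the crate's actual API:

```rust
fn effective_max_tokens(max_tokens: usize, max_context_size: usize) -> usize {
    // A MaxTokens of 0 is treated as "unset": fall back to the context window.
    if max_tokens == 0 {
        max_context_size
    } else {
        max_tokens
    }
}

fn main() {
    // Explicitly setting MaxTokens equal to MaxContextSize is redundant,
    // since the fallback produces the same value anyway.
    assert_eq!(effective_max_tokens(0, 2048), 2048);
    assert_eq!(effective_max_tokens(2048, 2048), 2048);
}
```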

danbev commented 9 months ago

> So setting MaxTokens equal to MaxContextSize in the examples is redundant.

I added a MaxBatchSize option in https://github.com/sobelio/llm-chain/pull/244/commits/452ac2c6d282a95c0c1a038432ec52490c98fd1a with a default value of 512, which matches the default in llama.cpp, and I have now removed the options from the examples (apart from simple_llama).
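
As a rough sketch of how such a default might be wired up (the struct and field names below are hypothetical, for illustration only, and not necessarily how the crate exposes its options):

```rust
// Default batch size matching llama.cpp's n_batch default of 512.
const DEFAULT_MAX_BATCH_SIZE: usize = 512;

// Hypothetical options struct; the real crate exposes configuration
// through its own options types.
struct LlamaOptions {
    max_batch_size: usize,
}

impl Default for LlamaOptions {
    fn default() -> Self {
        LlamaOptions {
            max_batch_size: DEFAULT_MAX_BATCH_SIZE,
        }
    }
}

fn main() {
    let opts = LlamaOptions::default();
    // With a sensible default in place, the examples no longer need to set
    // this option explicitly.
    println!("max_batch_size = {}", opts.max_batch_size);
}
```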

> If you want, and if it's possible, you could change the option to default to the size of the context window and remove the clause.

I'm planning on taking a closer look at the model options today, and I'll also take another look at the context options and your suggestion. Thanks!