teilomillet / gollm

Unified Go interface for Language Model (LLM) providers. Simplifies LLM integration with flexible prompt management and common task functions.
https://docs.gollm.co
Apache License 2.0

feat: Set basic inference parameters (top_p, min_p, repeat_penalty, repeat_last_n, seed) #8

Closed · sammcj closed this 1 month ago

sammcj commented 2 months ago

It would be great if you could set basic inference parameters with gollm.

Temperature by itself is a rather crude way of affecting generation; instead, you can use better sampling algorithms to influence the quality of the output, such as top_p, min_p, repeat_penalty, repeat_last_n, and seed.

Ollama's documentation has a useful table of the available parameters:

| Parameter | Description | Value Type | Example Usage |
| --- | --- | --- | --- |
| mirostat | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | int | mirostat 0 |
| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
| repeat_last_n | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate stop parameters in a modelfile. | string | stop "AI assistant:" |
| tfs_z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1) | float | tfs_z 1 |
| num_predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | int | num_predict 42 |
| top_k | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. (Default: 40) | int | top_k 40 |
| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
| min_p | Alternative to top_p, and aims to ensure a balance of quality and variety. The parameter p represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with p=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |
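
For reference, since gollm talks to Ollama's HTTP API rather than a Modelfile, these same parameters can be passed per request via the `options` field of `/api/generate`. A rough Go sketch (model name and values are illustrative):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Sketch: passing the sampling parameters from the table above to a
// local Ollama server at request time, via the "options" object of
// the /api/generate endpoint, instead of baking them into a Modelfile.
func main() {
	body, _ := json.Marshal(map[string]any{
		"model":  "llama3", // illustrative model name
		"prompt": "Why is the sky blue?",
		"stream": false,
		"options": map[string]any{
			"top_p":          0.9,
			"min_p":          0.05,
			"repeat_penalty": 1.1,
			"repeat_last_n":  64,
			"seed":           42,
		},
	})

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Response)
}
```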
teilomillet commented 1 month ago

Thank you Sam for the suggestion. I've made another branch (ollama_config) to implement these.

Can you check and confirm that this works for your usage before I merge the branch?

I've created examples/testing_examples.go, but I'm not sure it emulates how you're using it, nor whether it works.

You should be able to set any of these params by using the SetXXX options inside NewLLM; you can find the complete list in config.go.
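
For illustration, a minimal sketch of that pattern (the exact option names for the new sampling params are assumptions following the SetXXX convention; check config.go for the real list):

```go
package main

import (
	"context"
	"fmt"

	"github.com/teilomillet/gollm"
)

func main() {
	// Configure the new inference parameters as functional options
	// passed to NewLLM. Option names marked "assumed" follow the
	// SetXXX naming pattern described above and may differ in config.go.
	llm, err := gollm.NewLLM(
		gollm.SetProvider("ollama"),
		gollm.SetModel("llama3"), // illustrative model name
		gollm.SetTemperature(0.7),
		gollm.SetTopP(0.9), // assumed option name
		gollm.SetSeed(42),  // assumed option name
	)
	if err != nil {
		panic(err)
	}

	resp, err := llm.Generate(context.Background(), gollm.NewPrompt("Why is the sky blue?"))
	if err != nil {
		panic(err)
	}
	fmt.Println(resp)
}
```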

Thanks,

teilomillet commented 1 month ago

I've merged them; it should be working.

sammcj commented 1 month ago

Oh gosh sorry I completely missed this!

Thanks so much for adding them, that's great :)