simonw / llm-gpt4all

Plugin for LLM adding support for the GPT4All collection of models

feat: Add cli options for generation #17

Closed · RangerMauve closed this 8 months ago

RangerMauve commented 11 months ago

Refs:

Per our discussion on the fediverse.

I took the parameter descriptions from the GPT4All Python docs and copied the structure from the llm-llama-cpp plugin.

Does this work? Any other changes that I should make?
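
For context: LLM plugins declare per-model options as a nested pydantic model, so the structure borrowed from llm-llama-cpp presumably looks something like this minimal sketch (the class name, model ID, and defaults here are illustrative; the field names and descriptions are taken from the option listing later in this thread):

```python
from typing import Optional

import llm
from pydantic import Field


class GPT4AllModel(llm.Model):
    # Illustrative; the real plugin registers many models dynamically
    model_id = "orca-mini-3b-gguf2-q4_0"

    class Options(llm.Options):
        # Descriptions copied from the GPT4All Python docs, as noted above
        max_tokens: Optional[int] = Field(
            description="The maximum number of tokens to generate.",
            default=None,
        )
        temp: Optional[float] = Field(
            description=(
                "The model temperature. Larger values increase creativity "
                "but decrease factuality."
            ),
            default=None,
        )
        top_k: Optional[int] = Field(
            description=(
                "Randomly sample from the top_k most likely tokens at each "
                "generation step. Set this to 1 for greedy decoding."
            ),
            default=None,
        )

    def execute(self, prompt, stream, response, conversation):
        # Validated option values arrive on prompt.options
        ...
```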

simonw commented 11 months ago

Tests are failing with `NameError: name 'Field' is not defined` - I think there's a missing import.

RangerMauve commented 11 months ago

Thanks, added the imports from llama-cpp.
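
(For reference, the import in question is presumably the same one llm-llama-cpp uses:)

```python
# pydantic's Field supplies the description text that
# "llm models --options" displays, as shown later in this thread
from pydantic import Field
```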

RangerMauve commented 10 months ago

Tested it out with `llm install ./`, which worked. The options seem to be getting passed in.

RangerMauve commented 10 months ago

Tweaking the params didn't make the Replit model perform well enough to be useful to me, sadly :P

simonw commented 8 months ago

With this PR checked out, `llm models --options` included:

gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM (installed)
  max_tokens: int
    The maximum number of tokens to generate.
  temp: float
    The model temperature. Larger values increase creativity but decrease
    factuality.
  top_k: int
    Randomly sample from the top_k most likely tokens at each generation
    step. Set this to 1 for greedy decoding.
  top_p: float
    Randomly sample at each generation step from the top most likely
    tokens whose probabilities add up to top_p.
  repeat_penalty: float
    Penalize the model for repetition. Higher values result in less
    repetition.
  repeat_last_n: int
    How far back in the model's generation history to apply the repeat penalty.
  n_batch: int
    Number of prompt tokens processed in parallel. Larger values decrease
    latency but increase resource requirements.
gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM (installed)
  max_tokens: int
  temp: float
  top_k: int
  top_p: float
  repeat_penalty: float
  repeat_last_n: int
  n_batch: int
gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM (installed)
  max_tokens: int
  temp: float
  top_k: int
  top_p: float
  repeat_penalty: float
  repeat_last_n: int
  n_batch: int
simonw commented 8 months ago

$ llm -m mistral-7b-instruct-v0 'hello' -o max_tokens 2
 Hello!
$ llm -m mistral-7b-instruct-v0 'hello' -o max_tokens 10
 Hello! How can I help you today?
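
For completeness, the same options should also work through LLM's Python API, passed as keyword arguments to prompt(); a hedged sketch using the model ID and option names from the transcript above:

```python
import llm

# Model ID and option names taken from the transcript above;
# the values here are illustrative.
model = llm.get_model("mistral-7b-instruct-v0")
response = model.prompt("hello", max_tokens=10, temp=0.2, top_k=1)
print(response.text())
```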

simonw commented 8 months ago

Documentation preview: https://github.com/simonw/llm-gpt4all/blob/624b75bbb3d9736e00c931d4de3eef2de80c5e72/README.md#model-options