Closed JoseConseco closed 9 months ago
Hi @JoseConseco ,
Thank you for this suggestion!
It looks like the `ollama run` command doesn't allow additional parameters, at least according to the help:
```
$ ollama run --help
Run a model

Usage:
  ollama run MODEL [PROMPT] [flags]

Flags:
  -h, --help       help for run
      --insecure   Use an insecure registry
      --nowordwrap Don't wrap words to the next line automatically
      --verbose    Show timings for response
```
But according to the Ollama documentation you can configure a custom model with custom parameters.
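For illustration, a Modelfile along these lines bakes a parameter into a custom model (the base model and temperature value here are just examples):

```
# Modelfile (illustrative): derive a custom model with a fixed temperature
FROM mistral
PARAMETER temperature 0.2
```

You'd then create it with `ollama create <name> -f Modelfile` and run it like any other model.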
Would that work for you?
Thank you and best regards, David
Yes, I already tried that. Baking params into a model is cool, but it is not very flexible:
You could create custom models and invoke your prompt with:

```lua
local gen = require('gen')
local prompt = gen.prompts['Your_Prompt']
gen.exec(vim.tbl_deep_extend('force', prompt, { model = 'custom_model' }))
```
Then you could have several models: `Mistral_Low_Temp`, `Mistral`, ....
It's not a great solution, but since `ollama` doesn't support `run` with parameters, I think that's the only way to achieve this (other than sending it via HTTP).
What do you think?
Best regards, David
Yes, that will work, thanks! Btw, is there any particular reason why you didn't go with curl and the REST API? It seems to be working fine: if ollama is running and you send a curl request, it will automatically start the server with the requested model and send the response.
It seems to have only advantages: you can query the server about model info, debug info, model creation, etc.
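For reference, the REST API also takes per-request parameters in the `options` field of the request body (endpoint and field names per the Ollama API docs; the concrete values here are just an illustration):

```
POST http://localhost:11434/api/generate
{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "options": { "temperature": 0.2 }
}
```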
I opened https://github.com/David-Kunz/gen.nvim/pull/7, which adds an optional model parameter per prompt.
In the end, `gen.exec()` did not work: it seems prompt parsing happens before `exec()`, so the `$text` field was always empty.
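With a per-prompt model parameter, a prompt definition could look something like this (a sketch only; the exact field name and prompt-table shape are assumptions based on the PR description):

```lua
-- assumes gen.nvim prompts are plain Lua tables, as in the snippet above
require('gen').prompts['Fix_Grammar'] = {
  prompt = 'Fix the grammar in this text:\n$text',
  model = 'mistral_low_temp', -- optional per-prompt model (assumed field name)
}
```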
Thank you for your suggestion, @JoseConseco.

> It seems to have only advantages: you can query the server about model info, debug info, model creation, etc.

The disadvantage is that I'd need to send HTTP requests, which adds curl (or something else?) as another dependency. But I can think about it!
In some cases it would be good to tweak model parameters.
And one question (since there is no Discussions section): I'm just slightly confused about how ollama works: