fmaclen / hollama

A minimal web-UI for talking to Ollama (and OpenAI) servers
https://hollama.fernando.is
MIT License

Add support for all Ollama /api/generate parameters #72

Open raychz opened 4 months ago

raychz commented 4 months ago

Extended Parameter Support for Ollama API in Hollama Interface

The Hollama interface currently supports a limited set of parameters when making requests to the Ollama generate API. The current completion request payload looks like this:

// Current payload: only the model, rolling context, and latest user message are sent
let payload: OllamaCompletionRequest = {
    model: session.model,
    context: session.context,
    prompt: session.messages[session.messages.length - 1].content
};

Request

We would like to extend the Hollama interface to support more of the parameters available in the Ollama generate API. This would provide users with greater control over their interactions with the models.

Proposed Parameters to Add

Based on the Ollama API documentation, I suggest adding support for the following parameters (a sketch of the extended request type follows this list):

  1. images: For multimodal models
  2. format: To specify the response format (e.g., JSON)
  3. options: For additional model parameters (e.g., temperature)
  4. system: To override the system message defined in the Modelfile
  5. template: To override the prompt template defined in the Modelfile
  6. stream: To control whether the response is streamed or returned as a single object
  7. raw: To allow specifying a full templated prompt without formatting
  8. keep_alive: To control how long the model stays loaded into memory
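
For reference, here is a rough sketch of what the extended request type could look like. The field names follow the Ollama /api/generate documentation; the exact TypeScript types are assumptions, not Hollama's actual definitions:

// Sketch of an extended completion request based on the Ollama
// /api/generate docs; optional fields and their types are assumptions.
interface OllamaCompletionRequest {
    model: string;
    prompt: string;
    context?: number[];
    images?: string[]; // base64-encoded images for multimodal models
    format?: 'json'; // constrain the response format
    options?: Record<string, unknown>; // e.g. { temperature: 0.8 }
    system?: string; // overrides the Modelfile system message
    template?: string; // overrides the Modelfile prompt template
    stream?: boolean; // false returns a single response object
    raw?: boolean; // bypass prompt templating entirely
    keep_alive?: string | number; // e.g. '5m'
}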

Implementation Considerations

Questions

[Image: Ollama Params]

fmaclen commented 4 months ago

I definitely would like to expand the capabilities the UI offers for Ollama. For example, I'd love to support uploading images for multimodal models, pulling models from the UI, allowing concurrency, etc.

Conceptually, the main idea behind Hollama is to have "really good defaults" so that a new user with little or no experience can have a good experience without configuring anything or learning technical jargon. Simply download Ollama + Hollama, and the UI guides you through the rest of the process in as few steps as possible.

For contrast, when I look at LMStudio I find it overwhelming: there's a steep learning curve and many decisions to make before I can get anything productive out of it.

That being said, what I'd like to understand better is if, and how, we should support the rest of the API. I can see use cases for letting you tweak the system/prompt templates, or change keep_alive, in some "Advanced settings" menu.
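
To make that concrete, here's a rough sketch of how optional advanced settings could be merged into the current payload without touching the defaults; the session.advancedSettings store is hypothetical and doesn't exist in Hollama today:

// Hypothetical sketch: include advanced settings in the request only when
// the user has actually set them, so the "really good defaults" stay intact.
// `session.advancedSettings` is an assumed store, not part of Hollama today.
const payload: OllamaCompletionRequest = {
    model: session.model,
    context: session.context,
    prompt: session.messages[session.messages.length - 1].content,
    ...(session.advancedSettings?.system ? { system: session.advancedSettings.system } : {}),
    ...(session.advancedSettings?.template ? { template: session.advancedSettings.template } : {}),
    ...(session.advancedSettings?.keep_alive ? { keep_alive: session.advancedSettings.keep_alive } : {})
};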

But how useful is it to have JSON output in the Sessions view? And why would you disable stream?

@raychz, are these settings you personally find yourself editing once? Often? Always? I'd really appreciate it if you could share some examples of your use cases.