SciPhi-AI / R2R

The Elasticsearch for RAG. Build, scale, and deploy state-of-the-art Retrieval-Augmented Generation applications
https://r2r-docs.sciphi.ai/

Supporting multiple generation_configs in the config #813

Open · underlines opened 1 month ago

underlines commented 1 month ago

Is your feature request related to a problem? Please describe.
It isn't a problem as such: the R2R-Dashboard provides a drop-down for model selection, yet at the same time it seems I can't define multiple completions.generation_config entries.

Describe the solution you'd like
I tried providing an array of generation_config objects:

```json
{
  "completions": {
    "provider": "litellm",
    "generation_config": [
      {
        "model": "ollama/qwen2:1.5b",
        "temperature": 0.1,
        "top_p": 1.0,
        "max_tokens_to_sample": 1024,
        "stream": false,
        "tools": null,
        "add_generation_kwargs": {},
        "api_base": null
      },
      {
        "model": "ollama/phi3:3.8b",
        "temperature": 0.2,
        "top_p": 0.8,
        "max_tokens_to_sample": 512,
        "stream": true,
        "tools": null,
        "add_generation_kwargs": {},
        "api_base": null
      },
      ...
    ]
  }
}
```

R2R doesn't currently support this, but it looks like it could be refactored into the current codebase without much trouble, with the model choice then exposed in the R2R-Dashboard Playground drop-down (see the sketch below).
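For illustration only, a minimal sketch of what such a refactor might look like; the function name, config path, and loading logic here are hypothetical and not taken from R2R's actual codebase:

```python
# Hypothetical sketch, not R2R's actual loader: accept either a single
# generation_config object (current behavior) or a list of them
# (this proposal), and normalize to a list either way.
import json
from typing import Any


def load_generation_configs(completions: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the generation configs as a list, whether the config file
    supplied one object or an array of objects."""
    raw = completions.get("generation_config", {})
    configs = raw if isinstance(raw, list) else [raw]
    for cfg in configs:
        if "model" not in cfg:
            raise ValueError("every generation_config entry needs a 'model' field")
    return configs


if __name__ == "__main__":
    with open("r2r.json") as f:  # hypothetical config path
        config = json.load(f)
    # The dashboard drop-down could then be populated from this list.
    for cfg in load_generation_configs(config["completions"]):
        print(cfg["model"])
```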

Describe alternatives you've considered
Switching models by restarting the server with a different configuration, or via the CLI, both of which are cumbersome.

Additional context
None

emrgnt-cmplxty commented 1 month ago

You can choose different models at runtime; these settings are just what the server defaults to in the absence of user input.

We've made this clearer in the documentation here - https://r2r-docs.sciphi.ai/cookbooks/basic-configuration#llm-provider-configuration
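As a sketch of what that looks like from the Python client (the method and parameter names below follow my reading of the linked docs; verify them against your installed R2R version):

```python
# Per-request model override, assuming the R2R Python client's `rag`
# method and its `rag_generation_config` parameter as described in the
# linked configuration docs; check the exact names for your version.
from r2r import R2RClient

client = R2RClient("http://localhost:8000")  # assumed default server URL

# The server-side generation_config only supplies defaults; a request
# can override the model (and other sampling settings) like this:
response = client.rag(
    query="What does R2R's completions config control?",
    rag_generation_config={
        "model": "ollama/phi3:3.8b",
        "temperature": 0.2,
        "max_tokens_to_sample": 512,
    },
)
print(response)
```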

You should be able to confirm that changing the selected model in the playground does in fact result in that model generating the completion.