Quansight / ragna

RAG orchestration framework ⛵️
https://ragna.chat
BSD 3-Clause "New" or "Revised" License

Allow arbitrary component parameters in the web UI #217

Open pmeier opened 10 months ago

pmeier commented 10 months ago

Feature description

One of the core features of Ragna is its extension mechanism that allows users (or third-party projects) to integrate well with it. The flexibility happens on two levels:

  1. The user can select which components to use.
  2. Each component can declare its parameters that can be set through a unified interface.

Point 1. is fully supported by the Python and REST API as well as the web UI. Point 2. however is only properly supported by the Python and REST API. The web UI currently hardcodes the available parameters:

https://github.com/Quansight/ragna/blob/2c1e5c407c4e8afd7f2976021bd5f6551ff57af0/ragna/deploy/_ui/modal_configuration.py#L214-L280

This relies on the fact that almost all builtin source storages and assistants rely on the same set of parameters. However, we already got bitten by it ourselves, see #155. This will not become any easier in the future when we actually go for #191.

My proposal here is to add functionality to the

  1. Python API,
  2. REST API, and
  3. web UI,

to be able to support any custom component without hardcoding it.

Fortunately, we already have most of the things done for 1. and 2.

  1. We check the **kwargs of the ragna.core.Chat construction with pydantic. We are able to do so by dynamically building a model for each function that is part of the Ragna protocol and merging them together for all components that are used for a chat. Right now we only do this on the type. However, we could optionally allow additional metadata the same way that FastAPI and typer do: instead of just declaring foo: int we could do foo: typing.Annotated[int, <metadata>]. The <metadata> could just be a pydantic.fields.FieldInfo, as created by pydantic.Field(), that can carry additional information like gt (greater than), title, and description.
  2. We are using the information from 1. in the /components endpoint. We return the name of the component as well as the JSON schema from the model we build in 1. Meaning, if we implement 1. as suggested above, the new metadata is automatically included and we don't need to change anything else.

For example:

import pydantic
import pprint

class AssistantAnswer(pydantic.BaseModel):
    max_new_tokens: int = pydantic.Field(
        default=256,
        gt=0,
        title="Maximum new tokens",
        description=(
            "Maximum new tokens to generate. "
            "If you experience truncated answers, increase this value. "
            "However, be aware that longer answers also incur a higher cost."
        ),
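    )

# Print the JSON schema that pydantic derives from the model (pydantic v2 API)
pprint.pprint(AssistantAnswer.model_json_schema())

{'properties': {'max_new_tokens': {'default': 256,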
                                   'description': 'Maximum new tokens to '
                                                  'generate. If you experience '
                                                  'truncated answers, increase '
                                                  'this value. However, be '
                                                  'aware that longer answers '
                                                  'also incur a higher cost.',
                                   'exclusiveMinimum': 0,
                                   'title': 'Maximum new tokens',
                                   'type': 'integer'}},
 'title': 'AssistantAnswer',
 'type': 'object'}

So the real task here is to update the web UI to

  1. be able to consume the information above, and
  2. be able to present it to the user.

Point 1. should be fairly simple to do. Since we'll always have the type field, we can branch on that and find the relevant fields. I don't think pydantic forbids users to put the gt field on string parameters, but we don't need to care. We just need to look for a few named fields for each type.
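
A minimal sketch of that branching, using only the standard library. It assumes every property carries a type key, as stated above; ParameterSpec, KEYS_BY_TYPE, and parse_parameters are hypothetical names, and the temperature and stop properties are made up to show that keys not listed for a type are ignored.

```python
from dataclasses import dataclass
from typing import Any, Optional

# The few named keys we look for per JSON schema type; everything else is ignored.
KEYS_BY_TYPE = {
    "integer": ("minimum", "maximum", "exclusiveMinimum", "exclusiveMaximum"),
    "number": ("minimum", "maximum", "exclusiveMinimum", "exclusiveMaximum"),
    "string": ("minLength", "maxLength"),
    "boolean": (),
}


@dataclass
class ParameterSpec:
    name: str
    type: str
    title: str
    description: Optional[str]
    default: Any
    constraints: dict


def parse_parameters(schema: dict) -> list:
    """Turn the 'properties' of a JSON schema into a flat list of specs."""
    specs = []
    for name, prop in schema.get("properties", {}).items():
        type_ = prop["type"]  # assumption: always present
        wanted = KEYS_BY_TYPE.get(type_, ())
        specs.append(
            ParameterSpec(
                name=name,
                type=type_,
                title=prop.get("title", name),
                description=prop.get("description"),
                default=prop.get("default"),
                constraints={key: prop[key] for key in wanted if key in prop},
            )
        )
    return specs


schema = {
    "properties": {
        "max_new_tokens": {
            "default": 256,
            "exclusiveMinimum": 0,
            "title": "Maximum new tokens",
            "type": "integer",
        },
        "temperature": {"default": 0.5, "minimum": 0.0, "maximum": 1.0, "type": "number"},
        # A bound on a string is meaningless and silently dropped
        "stop": {"default": "", "exclusiveMinimum": 0, "type": "string"},
    },
    "title": "AssistantAnswer",
    "type": "object",
}

specs = {spec.name: spec for spec in parse_parameters(schema)}
print(specs["max_new_tokens"])
```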

Point 2. could be split into two parts again:

  1. After parsing, how are we going to put the information into a UI element?
  2. How are we going to present a potentially unlimited number of parameters in the UI?

Point 1. should again not be terribly hard. Panel widgets are usually built from param.Parameters, e.g.

https://github.com/Quansight/ragna/blob/2c1e5c407c4e8afd7f2976021bd5f6551ff57af0/ragna/deploy/_ui/modal_configuration.py#L25-L29

So just from this, we can already see that we can set the default as well as the limits. The title and description can also be set. From that, we can just create the actual widget with

https://github.com/Quansight/ragna/blob/2c1e5c407c4e8afd7f2976021bd5f6551ff57af0/ragna/deploy/_ui/modal_configuration.py#L223-L225

Of course we can't use an IntSlider for everything, but since we are already branching on the type, we can set the widget type as well. For example sliders for int and float parameters, a dropdown menu for enums, and so on.
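
One way the widget choice could look, as a rough sketch rather than a finished design. choose_widget is a hypothetical helper that returns the name of a panel widget class together with keyword arguments instead of importing panel itself; the caller could then instantiate it with something like getattr(pn.widgets, widget_name)(**kwargs). The sketch assumes enums show up as an inline enum key and falls back to an input field when a numeric parameter is not bounded on both sides, since a slider needs both ends.

```python
def choose_widget(prop: dict, name: str):
    """Return (panel widget class name, keyword arguments) for one schema property."""
    kwargs = {"name": prop.get("title", name)}
    if "default" in prop:
        kwargs["value"] = prop["default"]

    # Assumption: enums appear as an inline "enum" key on the property
    if "enum" in prop:
        return "Select", {**kwargs, "options": list(prop["enum"])}

    type_ = prop["type"]
    if type_ == "boolean":
        return "Checkbox", kwargs
    if type_ == "string":
        return "TextInput", kwargs
    if type_ in ("integer", "number"):
        start, end = prop.get("minimum"), prop.get("maximum")
        if type_ == "integer":
            # Exclusive integer bounds can be turned into inclusive ones.
            # Exclusive float bounds are ignored in this sketch.
            if "exclusiveMinimum" in prop:
                start = prop["exclusiveMinimum"] + 1
            if "exclusiveMaximum" in prop:
                end = prop["exclusiveMaximum"] - 1
        if start is not None:
            kwargs["start"] = start
        if end is not None:
            kwargs["end"] = end
        prefix = "Int" if type_ == "integer" else "Float"
        # A slider needs both ends, otherwise use a plain numeric input
        kind = "Slider" if start is not None and end is not None else "Input"
        return prefix + kind, kwargs
    raise ValueError(f"unsupported parameter type: {type_!r}")


tokens_widget = choose_widget(
    {"type": "integer", "default": 256, "exclusiveMinimum": 0, "title": "Maximum new tokens"},
    "max_new_tokens",
)
# Hypothetical float parameter with both bounds set
temperature_widget = choose_widget(
    {"type": "number", "default": 0.5, "minimum": 0.0, "maximum": 1.0},
    "temperature",
)
print(tokens_widget)
print(temperature_widget)
```

Note that max_new_tokens from the example above only has a lower bound (gt=0), so under these assumptions it would end up as an IntInput rather than an IntSlider unless the component also declares an upper bound.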

So ultimately after this very long proposal, I think the most important problem that we need to solve is how to present all the widgets. And while I'm comfortable with all the other points above, I'm not comfortable with this part. I'll add my idea below, but we need a proper design here @smeragoel

My naive idea for this would be to have a tabbed view under the advanced configuration where each tab corresponds to one component type. Meaning, right now we would have two tabs for source storages and assistants. After #191 we would have another tab for the embedding model. Within each tab we would have a grid of widgets with 2 columns and as many rows as we need. IMO, we should not display more than 2-3 rows at once and add a scrollbar after that. That would give us 4-6 parameters visible by default, which is more than enough for our current components.
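
To make the layout idea concrete, here is a small standard-library sketch of the bookkeeping only, without any actual panel layout code. plan_tab is a hypothetical helper, and all parameter names except max_new_tokens are placeholders.

```python
def plan_tab(parameter_names, ncols=2, max_visible_rows=3):
    """Arrange the parameters of one component type into rows of `ncols` widgets."""
    rows = [
        parameter_names[i : i + ncols]
        for i in range(0, len(parameter_names), ncols)
    ]
    return {
        "rows": rows,
        # Only add a scrollbar if the grid is taller than what we want to show
        "needs_scrollbar": len(rows) > max_visible_rows,
    }


# One tab per component type; the source storage parameter names are placeholders.
parameters_by_component = {
    "Source storage": ["param_a", "param_b", "param_c"],
    "Assistant": ["max_new_tokens"],
}

layout = {
    tab: plan_tab(names) for tab, names in parameters_by_component.items()
}
print(layout)
```

With 2 columns and at most 3 visible rows, a component needs more than 6 parameters before the scrollbar shows up.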

Value and/or benefit

Without this, the Ragna Python and REST APIs are really flexible and easy to extend for users, but the web UI is not. That was good enough for the first release, but since the web UI is an integral part of Ragna, it should also support the extensions fully.

This came up multiple times so far, e.g. https://github.com/Quansight/ragna/discussions/203#discussioncomment-7575875

Anything else?

No response

pmeier commented 10 months ago

Just found holoviz/panel#1298. It seems that although we can provide descriptions for all of our components, panel currently can't display them for every widget :disappointed: For example, sliders, which I think would be the right widget for int and float parameters, are not supported yet.

Still, it seems the issue is in progress. So I guess we provide the functionality on our side anyway and just benefit from it as soon as it is implemented upstream.

pierrotsmnrd commented 10 months ago

My two cents:

> Of course we can't use an IntSlider for everything

We can define the kind of widget to use in the parameter's config: a slider by default, or IntInput/FloatInput, or anything else, as long as we have a corresponding table (parameter type -> panel widget) that ensures we configure the panel widget properly (boundaries, forbidden values, etc.).
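
A sketch of such a table with a per-parameter override. The widget key on the schema property is a hypothetical extension, not something Ragna or pydantic define, and resolve_widget is a made-up helper. The widget names are existing panel widgets, but the code only handles their names and does not import panel.

```python
# JSON schema type -> allowed panel widget names; the first entry is the default.
WIDGETS_BY_TYPE = {
    "integer": ("IntSlider", "IntInput"),
    "number": ("FloatSlider", "FloatInput"),
    "string": ("TextInput", "TextAreaInput"),
    "boolean": ("Checkbox",),
}


def resolve_widget(prop: dict) -> str:
    """Pick the widget for a property, honoring an optional explicit choice."""
    allowed = WIDGETS_BY_TYPE[prop["type"]]
    requested = prop.get("widget")  # hypothetical extra key set by the component
    if requested is None:
        return allowed[0]
    if requested not in allowed:
        raise ValueError(
            f"{requested!r} cannot display a {prop['type']!r} parameter; "
            f"choose one of {allowed}"
        )
    return requested


default_choice = resolve_widget({"type": "integer"})
explicit_choice = resolve_widget({"type": "integer", "widget": "IntInput"})
print(default_choice, explicit_choice)
```

Validating the request against the table keeps a component from asking for a widget that cannot represent its parameter type.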

> My naive idea for this would be to have a tabbed view under the advanced configuration where each tab corresponds to one component type.

To me that looks like a proper idea. Another solution would be to have the advanced config on multiple "screens": first the document upload plus basic config, with 3 buttons: "Cancel", "Start Conversation", and "Advanced config". "Cancel" is obvious, "Start Conversation" starts the chat with the default basic config, and "Advanced config" presents another screen of config. To handle a potentially unlimited number of parameters, we can imagine multiple screens in a row.

peachkeel commented 10 months ago

I was going to open a separate issue entitled, "Hard prompt engineering and versioning." However, what I want to say should probably be said here.

I think that an Assistant should present a suggested or default prompt to the end-user such that the end-user can edit or change the suggested prompt before creating a new Chat. This feature, of course, would necessitate text boxes in the UI in which the default or suggested prompt could be displayed for editing.

Turning hard prompts into just another parameter is going to open up a lot more use cases. Already, there are SaaS offerings built around this feature, including PromptLayer.

[Screenshot: 2023-12-16-100719_1920x1080_scrot]