Decide on env var naming convention for models
Env var naming convention for initialising models:

DOCQ_{VENDOR}_{USECASE}_{OBJECT}

in all uppercase, for instance DOCQ_OPENAI_API_KEY.

For 3rd-party model vendors, the {VENDOR} value is a single vendor string, such as OPENAI. For cloud vendors, {VENDOR} has two parts, {CLOUDVENDOR} and {MODELVENDOR}, separated by a _, such as AZURE_OPENAI.
So for Azure OpenAI it will be the following, does that make sense?
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME or DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENTNAME?
DOCQ_AZURE_OPENAI_CHAT_MODEL_NAME
DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME
DOCQ_AZURE_OPENAI_TEXT_MODEL_NAME
Probably a good idea to have the name of the model deployed passed through explicitly.
LangChain also has the following as env vars, but I don't think they're needed because neither has an explicit implication on infra resources, so I will not pass them in.
DOCQ_AZURE_OPENAI_API_TYPE
DOCQ_AZURE_OPENAI_API_VERSION
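For illustration, a minimal sketch of how application code might read env vars following this convention (the helper name is hypothetical, not part of the convention itself):

```python
import os
from typing import Optional

def docq_env(vendor: str, *parts: str) -> Optional[str]:
    """Read an env var named per DOCQ_{VENDOR}_{USECASE}_{OBJECT} (hypothetical helper)."""
    name = "_".join(["DOCQ", vendor, *parts]).upper()
    return os.environ.get(name)

api_key = docq_env("OPENAI", "API", "KEY")                             # DOCQ_OPENAI_API_KEY
azure_base = docq_env("AZURE_OPENAI", "API", "BASE")                   # DOCQ_AZURE_OPENAI_API_BASE
chat_deployment = docq_env("AZURE_OPENAI", "CHAT", "DEPLOYMENT_NAME")  # DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME
```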
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
DOCQ_AZURE_OPENAI_API_VERSION
all good above
Re model name, there's no need to supply it - the same way that with OpenAI, we select an actual model on the fly within the application code. Re deployment name, it depends on the relationship between models and deployments - if it's 1:1 then there's no need, but we'd need a separate convention internally in the application code to assume the same (or similar) model/deployment name. If it's 1 deployment with N models, then yes we need the deployment name, however it should be
DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME
Happy to chat to clarify
Re model name - a deployment is tied to one model. The deployment name (the Azure resource name) can be anything, but at the moment I'm setting the deployment name = model name. This will work fine unless we need to deploy two instances of the same model, like a dev and a prod, or need to partition for some other reason. I don't know what these use cases could be.
Re deployment name - if we go with DOCQ_AZURE_OPENAI_DEPLOYMENT_NAME, are we not going to support multiple models at the same time? I understood that we needed this.
So, have the gpt-35-turbo and text-davinci-003 models available in parallel so that app code can use both?
Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency? From what I can tell you can switch to an older OpenAI API version client side with no changes on the infra.
To be clear, this is separate from the model version; those are set as infra config.
> Also, why include DOCQ_AZURE_OPENAI_API_VERSION when there's no infra dependency?
If this is not attached to the OpenAI service or the deployment at creation time, then we can lose it.
If it's 1:1 between a deployment and an actual model, then my view is: don't set it, just to be consistent, and let application code define which model (which in this case implies the deployment name) to use.
@cwang I think you missed answering my second question.
Are we going to support two or more models at the same time from a single provider? I understood that we needed this.
Example: have the gpt-35-turbo and text-davinci-003 models available in parallel from Azure OpenAI, so app functionality can be built that uses both?
Yes, because it's 1:N between API key and deployments, I believe?
OK, then we have two options (at least for Azure):

1. A deployment name env var per model (each deployment is a single model, 1:1), something like DOCQ_AZURE_OPENAI_TEXT_DEPLOYMENT_NAME and DOCQ_AZURE_OPENAI_CHAT_DEPLOYMENT_NAME. The model name is implied, but we can use the model name to set the deployment name. This limits us to only having one deployment per model (probably fine).
2. A single env var holding deployment name/model name pairs, e.g. DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES = [{'dname':'textdavinchi-dep1', 'mname':'text-davinci-003'}, {'dname':'gpt35turbo-dep1', 'mname':'gpt-35-turbo'}]
   - dname* - deployment name
   - mname - model name

*Keeping names short to save on chars. PaaS's like Azure App Service pass in a load of env vars, and the char limit is shared.
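For option 2 the value would need parsing at startup. A minimal sketch, assuming the value is the Python-literal-style list shown above (ast.literal_eval copes with the single quotes; strict JSON would need double quotes):

```python
import ast
import os

# Turn DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES into a {model name: deployment name} lookup
raw = os.environ.get("DOCQ_AZURE_OPENAI_DEPLOYMENT_NAMES", "[]")
pairs = ast.literal_eval(raw)  # e.g. [{'dname': 'gpt35turbo-dep1', 'mname': 'gpt-35-turbo'}]
deployments = {p["mname"]: p["dname"] for p in pairs}

chat_deployment = deployments.get("gpt-35-turbo")
text_deployment = deployments.get("text-davinci-003")
```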
The answer to this is my previous question about the relationship between a deployment and the models belonging to it. Is it 1:1 or 1:N? Can we get a definitive answer? Also, I believe API keys are at the OpenAI service level, therefore it's 1:N between keys and deployments. Can we also confirm that?
I don't think we should worry about deployment names and models - as I suggested earlier, use naming conventions in application code to handle it, e.g. with the assumption that 1 deployment contains 1 model, both named identically. Think about how we provision OpenAI (the 3rd-party one); again, we don't specify actual models in env vars.
In short, env vars should carry just enough to get everything started in this case. This relaxed approach leaves the application (system settings) to dictate which model(s) to initialise.
deployment:model is 1:1.
~~Per PR #34, chose option number 1 from above.~~
Change to using the deployment name = model name convention. Therefore the deployment name will not be passed in as an env var. Only the following three are set for Azure:
DOCQ_AZURE_OPENAI_API_KEY1
DOCQ_AZURE_OPENAI_API_KEY2
DOCQ_AZURE_OPENAI_API_BASE
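A minimal sketch of wiring these up with the deployment name = model name convention, assuming the pre-1.0 openai Python SDK and an API version pinned in application code (since it's deliberately not an env var):

```python
import os
import openai

openai.api_type = "azure"
openai.api_key = os.environ["DOCQ_AZURE_OPENAI_API_KEY1"]  # KEY2 kept aside for rotation
openai.api_base = os.environ["DOCQ_AZURE_OPENAI_API_BASE"]
openai.api_version = "2023-05-15"  # assumption: pinned in code, not passed via env var

model_name = "gpt-35-turbo"  # chosen by application/system settings
response = openai.ChatCompletion.create(
    engine=model_name,  # convention: deployment name == model name
    messages=[{"role": "user", "content": "Hello"}],
)
```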
Client works across HF (free) Inference API or self-hosted Inference Endpoints.
class huggingface_hub.InferenceClient(model: typing.Optional[str] = None, token: typing.Optional[str] = None, timeout: typing.Optional[float] = None)
Parameters:
model (str, optional) — The model to run inference with. Can be a model id hosted on the Hugging Face Hub, e.g. bigcode/starcoder, or a URL to a deployed Inference Endpoint. Defaults to None, in which case a recommended model is automatically selected for the task.
token (str, optional) — Hugging Face token. Will default to the locally saved token.
timeout (float, optional) — The maximum number of seconds to wait for a response from the server. Loading a new model in Inference API can take up to several minutes. Defaults to None, meaning it will loop until the server is available.
Initialize a new Inference Client.
InferenceClient aims to provide a unified experience to perform inference. The client can be used seamlessly with either the (free) Inference API or self-hosted Inference Endpoints.
conversational(text: str, generated_responses: typing.Optional[typing.List[str]] = None, past_user_inputs: typing.Optional[typing.List[str]] = None, parameters: typing.Union[typing.Dict[str, typing.Any], NoneType] = None, model: typing.Optional[str] = None) → Dict
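For reference, a minimal usage sketch of the client and method documented above (token handling and the example turns are illustrative):

```python
from huggingface_hub import InferenceClient

# Works with the free Inference API or a self-hosted Inference Endpoint URL
client = InferenceClient()  # token defaults to the locally saved one

# conversational() keeps context by replaying earlier turns
result = client.conversational(
    "What can I use it for?",
    past_user_inputs=["What is the Inference API?"],
    generated_responses=["It is a hosted service for running models."],
    # model=None lets the client pick a recommended conversational model;
    # pass a model id or an Inference Endpoint URL to target a specific one
)
print(result)
```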
Is your feature request related to a problem? Please describe.
At the system level, enable model selection/switching in the settings.
Needs to consider: should we allow user-level model selection? More specifically, what are the reasons for it?
Describe the solution you'd like
Save it to the same table in the database as the rest of the settings.
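A hypothetical sketch of what saving the selection alongside other settings could look like (the SQLite backend, table name, and columns are illustrative assumptions, not the actual Docq schema):

```python
import sqlite3

# Assumed schema: a generic key/value settings table shared with other settings
conn = sqlite3.connect("settings.db")
conn.execute("CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)")

def save_selected_model(model: str) -> None:
    conn.execute("INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)", ("selected_model", model))
    conn.commit()

save_selected_model("gpt-35-turbo")
```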
Describe alternatives you've considered
No alternative.
Additional context
N/A