aryn-ai / conversational-opensearch

A plugin for OpenSearch implementing an API to do conversational search
Apache License 2.0

How to use RAG with Hugging Face transformer like BERT or T5 as those are also LLM #8

Open ramda1234786 opened 9 months ago

ramda1234786 commented 9 months ago

How can I build a RAG pipeline with a model other than OpenAI, Cohere, or SageMaker? Can I use a Hugging Face transformer such as BERT for predicting sentences, without needing a Hugging Face key?

I am using version 2.10.

If yes, how do I build an HTTP connector for it? Or how can I load the model?

I tried the request below.

Can I upload a seq2seq model that does not require an HTTP connector or any keys? Basically, I do not want to use API keys or buy the product until I am sure this solution works out.

POST /_plugins/_ml/models/_upload

{
  "name": "huggingface/TheBloke/vicuna-13B-1.1-GPTQ",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
HenryL27 commented 9 months ago

That's a good question, and I'm not sure I have an answer for you. My opinion is that the resources for running OpenSearch and for running LLMs should be kept separate; also, I don't think ML-Commons supports running seq2seq models natively. So I'm pretty sure you need to host the model externally and connect via the Connector interface. It might be worth checking out something like TorchServe or another open-source model-serving option.
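By way of illustration (untested, and with placeholder file and model names), standing up a TorchServe instance roughly follows the quick-start in its README: package the model with torch-model-archiver, start the server, and sanity-check the inference endpoint before wiring up the connector:

```shell
# Package the model into a TorchServe .mar archive
# (model name, weights file, and handler here are placeholders)
torch-model-archiver --model-name bert --version 1.0 \
    --serialized-file model.pt --handler text_classifier \
    --export-path model_store

# Start TorchServe, serving the archive on the default port 8080
torchserve --start --model-store model_store --models bert=bert.mar

# Sanity-check the inference endpoint before pointing OpenSearch at it
curl -X POST http://127.0.0.1:8080/predictions/bert -d "sample input"
```

If the curl call returns a prediction, the endpoint is ready for a connector to target.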

As far as building connectors goes, it looks like you basically need to tell OpenSearch how to construct the HTTP request. So, to hit the basic example from the TorchServe README, I would expect something like this to work:

POST /_plugins/_ml/connectors/_create
{
    "name": "TorchServe bert connector",
    "description": "The connector to a locally hosted bert model",
    "version": 1,
    "protocol": "http",
    "parameters": {
        "endpoint": "127.0.0.1:8080",
        "model": "bert"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "http://${parameters.endpoint}/predictions/${parameters.model}",
            "request_body": "${parameters.input}"
        }
    ]
}
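To make the templating explicit: ML-Commons substitutes each `${parameters.*}` placeholder in the action's `url` and `request_body` with the corresponding value from the connector's `parameters` (or from the predict call). A small Python sketch of that substitution, using the same names as the connector above (this is illustrative only, not OpenSearch's actual code):

```python
def expand(template: str, parameters: dict) -> str:
    """Replace each ${parameters.key} placeholder with its value."""
    result = template
    for key, value in parameters.items():
        result = result.replace("${parameters." + key + "}", str(value))
    return result

# Values from the connector definition above
parameters = {"endpoint": "127.0.0.1:8080", "model": "bert"}

url = expand("http://${parameters.endpoint}/predictions/${parameters.model}",
             parameters)
print(url)  # http://127.0.0.1:8080/predictions/bert
```

So the action above would end up sending its request to the local TorchServe predictions endpoint, with the request body filled in from the `input` parameter supplied at predict time.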

I haven't tested any of this, and I don't think there has been much testing in general of connectors other than OpenAI, Cohere, and SageMaker.
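For completeness: my understanding of the usual ML-Commons flow once a connector is created is to register a remote model against the connector id, deploy it, and then call predict. Roughly (the ids below are placeholders you'd take from the previous responses):

```
POST /_plugins/_ml/models/_register
{
    "name": "torchserve-bert",
    "function_name": "remote",
    "connector_id": "<connector_id from the _create response>"
}

POST /_plugins/_ml/models/<model_id>/_deploy

POST /_plugins/_ml/models/<model_id>/_predict
{
    "parameters": {
        "input": "some input text"
    }
}
```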

Hope this helps!

ramda1234786 commented 9 months ago

Yes, the above helped me create the connector, and I have deployed the model as well. Thanks for the guidance. But currently I am stuck with the issue below; I am seeking help on the ML forum:

https://forum.opensearch.org/t/not-able-to-run-the-predict-apis-with-external-ml-models-like-hugging-face/16237