huggingface / chat-ui

Open source codebase powering the HuggingChat app
https://huggingface.co/chat

WebSearch uses the default model instead of current model selected #823

Open · ihubanov opened this issue 8 months ago

ihubanov commented 8 months ago

I have multiple models in my .env.local, and it seems WebSearch uses the default model to perform its search content extraction instead of the currently selected model (the one I'm asking the question to). Is it possible to add a config option to use the same model for everything?

nsarrazin commented 8 months ago

You can set TASK_MODEL to any model name you like in your .env, but from what I understand you would like the websearch to use the current conversation model for all tasks, is that right?
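
For example, in .env.local (the value must match one of the "name" entries in your MODELS array; the model below is just an example):

TASK_MODEL=mistralai/Mistral-7B-Instruct-v0.2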

ihubanov commented 8 months ago

Yes, because there seem to be some issues with detecting the stop sequence: when I'm using a Llama model for chat and it uses Mistral for tasks, Llama fails to match the stop sequence and starts talking to itself... 😆 I'm not sure the cause is that different models are being used, but it would be nice if it were possible to make it use the same model for both.

nsarrazin commented 8 months ago

Well, there shouldn't be issues with stop sequences 👀. Can you share your MODELS var, making sure to remove any secrets from it?

But regarding the websearch: maybe if TASK_MODEL is not set, it should default to the current conversation model instead of the first model on the list. That should be an easy fix.
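
A minimal sketch of what that fallback could look like, assuming a small helper on the websearch code path (the function name and exact wiring are illustrative, not the current chat-ui source):

import { TASK_MODEL } from "$env/static/private";
import { models } from "$lib/server/models";
import type { BackendModel } from "$lib/server/models";

// Pick the model used for websearch subtasks (query generation, content
// extraction). Prefer TASK_MODEL when it is set and matches a configured
// model; otherwise fall back to the conversation model rather than models[0].
export function getTaskModel(conversationModel: BackendModel): BackendModel {
    if (TASK_MODEL) {
        const taskModel = models.find((m) => m.name === TASK_MODEL);
        if (taskModel) return taskModel;
    }
    return conversationModel;
}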

ihubanov commented 8 months ago
MODELS=`
[
    {
        "name": "mistralai/Mistral-7B-Instruct-v0.2",
        "displayName": "mistralai/Mistral-7B-Instruct-v0.2",
        "description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
        "websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
        "preprompt": "",
        "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
        "parameters": {
            "temperature": 0.3,
            "top_p": 0.95,
            "repetition_penalty": 1.2,
            "top_k": 50,
            "truncate": 3072,
            "max_new_tokens": 1024,
            "stop": [
                "</s>"
            ]
        },
        "promptExamples": [
            {
                "title": "Write an email from bullet list",
                "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
            },
            {
                "title": "Code a snake game",
                "prompt": "Code a basic snake game in python, give explanations for each step."
            },
            {
                "title": "Assist in a task",
                "prompt": "How do I make a delicious lemon cheesecake?"
            }
        ],
        "endpoints": [
            {
                "type": "openai",
                "baseURL": "http://localhost:5000/v1"
            }
        ]
    },
    {
        "name": "togethercomputer/Llama-2-7B-32K-Instruct",
        "description": "The 32K model from togethercomputer, fine-tuned for chat.",
        "websiteUrl": "https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct",
        "userMessageToken": "",
        "userMessageEndToken": " [/INST] ",
        "assistantMessageToken": "",
        "assistantMessageEndToken": " </s><s>[INST] ",
        "reprompt": " ",
        "chatPromptTemplate": "<s>[INST] <<SYS>>\n{{preprompt}}\n<</SYS>>\n\n{{#each messages}}{{#ifUser}}{{content}} [/INST] {{/ifUser}}{{#ifAssistant}}{{content}} </s><s>[INST] {{/ifAssistant}}{{/each}}",
        "promptExamples": [
            {
                "title": "Write an email from bullet list",
                "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
            },
            {
                "title": "Code a snake game",
                "prompt": "Code a basic snake game in python, give explanations for each step."
            },
            {
                "title": "Assist in a task",
                "prompt": "How do I make a delicious lemon cheesecake?"
            }
        ],
        "parameters": {
            "temperature": 0.1,
            "top_p": 0.95,
            "repetition_penalty": 1.2,
            "top_k": 50,
            "truncate": 32768,
            "max_new_tokens": 4096,
            "stop": [
                " [INST] ",
                " [/INST] "
            ]
        },
        "endpoints": [
            {
                "type": "openai",
                "baseURL": "http://localhost:5001/v1"
            }
        ]
    }
]
`

I'm not sure if there's a better way to allow OpenAI API calls to more than one model, so I'm loading two instances of 'text-generation-webui' with different --api-port options...
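
For reference, the two instances are started roughly like this, matching the baseURL ports above (the model directory names are placeholders for whatever you have downloaded locally; --api-port is the text-generation-webui flag mentioned above):

python server.py --model mistralai_Mistral-7B-Instruct-v0.2 --api --api-port 5000
python server.py --model togethercomputer_Llama-2-7B-32K-Instruct --api --api-port 5001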