huggingface / chat-ui

Open source codebase powering the HuggingChat app
https://huggingface.co/chat
Apache License 2.0
7.5k stars · 1.1k forks

trying to replicate the api search with the local search option #571

Closed iChristGit closed 10 months ago

iChristGit commented 11 months ago

When I try searching for information on the site (huggingface.co/chat) it works fine and gives correct information, but when I do the same thing locally with the same model I get hallucinations. I've tried all sorts of temperature settings and models. This is the result locally: (screenshot) This is with the site: (screenshot) The sources look the same on both, but the actual response is never real information. This is my current config:

MONGODB_URL=mongodb://localhost:27017
PUBLIC_APP_NAME=PrivateGPT
MODELS=[ { "name": "text-generation-webui", "id": "text-generation-webui", "parameters": { "temperature": 0.1, "top_p": 0.95, "repetition_penalty": 1.2, "top_k": 12, "truncate": 1000, "max_new_tokens": 1024, "stop": [] }, "endpoints": [{ "type" : "openai", "baseURL": "http://127.0.0.1:5000/v1/" }] } ]

TypeError [ERR_INVALID_STATE]: Invalid state: Controller is already closed
    at new NodeError (node:internal/errors:405:5)
    at ReadableStreamDefaultController.enqueue (node:internal/webstreams/readablestream:1040:13)
    at update (C:/ChatUI/src/routes/conversation/[id]/+server.ts:155:20)
    at Object.start (C:/ChatUI/src/routes/conversation/[id]/+server.ts:189:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 'ERR_INVALID_STATE'
}
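For context, this error means the stream handler keeps calling controller.enqueue() after the ReadableStream has already been closed (for example an update arriving after the response finished or the client disconnected). A minimal sketch of the failure mode, separate from the actual chat-ui code:

// Node 18+: ReadableStream is a global.
const stream = new ReadableStream({
  start(controller) {
    controller.enqueue("first chunk");
    controller.close();
    try {
      // Enqueueing after close() throws:
      // TypeError [ERR_INVALID_STATE]: Invalid state: Controller is already closed
      controller.enqueue("late chunk");
    } catch (err) {
      console.error(err);
    }
  },
});

The usual guard is to track whether the stream is still open and stop enqueueing once it has been closed.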

iChristGit commented 11 months ago

What are the settings on the site, so I can try to match the relevant answers?

iChristGit commented 11 months ago

Also, is the search supposed to be gathered, indexed, and summarized in 5-10 seconds? It seems too fast; maybe something isn't fully working.

nsarrazin commented 11 months ago

The search is quite fast, so I think 5-10 seconds is correct. I'll have to investigate deeper to figure out why your results are different. What model are you using with text-generation-webui?

iChristGit commented 11 months ago

> The search is quite fast, so I think 5-10 seconds is correct. I'll have to investigate deeper to figure out why your results are different. What model are you using with text-generation-webui?

(screenshot: Screenshot 2023-11-22 125854)

I've been trying all of those models, with all sorts of temperature settings. In the example above I can see the sources are relevant (the OpenAI CEO departure, etc.), but 90% of the time the answer is still about GPT-3 being the "new" model instead of the drama it sees in the sources.

If you have better luck than what's shown in my example, please tell me what settings you use and which model works well for you!

Thank you

nsarrazin commented 11 months ago

(screenshot) Using Zephyr with the llama.cpp server locally I get decent results; not sure what's different about text-generation-webui. I'll look into it.

iChristGit commented 11 months ago

> Using Zephyr with the llama.cpp server locally I get decent results; not sure what's different about text-generation-webui. I'll look into it.

I'll try installing llama.cpp and see if the issue is specific to text-generation-webui. Are those your settings for llama.cpp? And can you link the specific model so I can match it?

"temperature": 0.1, "top_p": 0.95, "repetition_penalty": 1.2, "top_k": 50, "truncate": 1000, "max_new_tokens": 2048

iChristGit commented 11 months ago

I have tried to run llama.cpp and cannot connect it to chat-ui: (screenshot) Can't seem to run it properly.

iChristGit commented 11 months ago

(screenshot) Tried text-generation-webui with llama.cpp and Zephyr, same results.

iChristGit commented 11 months ago

I am trying everything to run llama.cpp and it still fails (I see in the console that it's listening on port 8080, and I copied the example from the tutorial into my local env file):

(screenshot)

iChristGit commented 11 months ago

I think this has something to do with the results shown in the screenshot, but sometimes it queries the right term and still gets it all wrong. I'm bashing my head trying all sorts of stuff, but it won't work. (screenshot) @nsarrazin

iChristGit commented 11 months ago

So, I managed to run llama.cpp, and indeed the results are solid and the search works, so it's something specific to text-generation-webui.

Also, the readme for llama.cpp says to format it like MODELS=[ {. That didn't work, so I wrote MODELS=`[ { instead. That was the whole issue for me with llama.cpp, and now it works.
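In other words, a single-line MODELS value parses without backticks, but once the value spans multiple lines it has to be wrapped in backticks, as the chat-ui readme shows (placeholder values below are illustrative):

# single-line value: no backticks needed
MODELS=[{ "name": "..." }]

# multi-line value: wrap the whole thing in backticks
MODELS=`[
  {
    "name": "..."
  }
]`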

AlexBlack2202 commented 11 months ago

> I think this has something to do with the results shown in the screenshot, but sometimes it queries the right term and still gets it all wrong. I'm bashing my head trying all sorts of stuff, but it won't work. @nsarrazin

How do you enable webSearch? I start it with the defaults and webSearch does not appear.

iChristGit commented 11 months ago

> How do you enable webSearch? I start it with the defaults and webSearch does not appear.

Put USE_LOCAL_WEBSEARCH=true in your .env.local file; it's in the readme.
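That is, the only line you need to add is:

# .env.local — enables the local web search option
USE_LOCAL_WEBSEARCH=true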

AlexBlack2202 commented 11 months ago

> Put USE_LOCAL_WEBSEARCH=true in your .env.local file; it's in the readme.

Thank you, it works for me. Can you share the name of the model you use with llama.cpp?

iChristGit commented 11 months ago

> Thank you, it works for me. Can you share the name of the model you use with llama.cpp?

MONGODB_URL=mongodb://localhost:27017
USE_LOCAL_WEBSEARCH=true
PUBLIC_APP_ASSETS=chatui
HF_ACCESS_TOKEN=hf_none
PUBLIC_APP_DESCRIPTION="ChatGPT But Open Source!"
PUBLIC_APP_NAME=ChatGPT
MODELS=`[
  {
      "name": "Mythalion-13B",
      "description": "Mythalion is a great overall model",
      "chatPromptTemplate": "{{preprompt}}\nInstruction:{{#each messages}}{{#ifUser}}{{content}}\nResponse:{{/ifUser}}{{#ifAssistant}}{{content}}{{/ifAssistant}}{{/each}}",
       "preprompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.",
       "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      }, {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python and give explanations for each step."
      }, {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
      ],
      "parameters": {
        "temperature": 0.1,
        "top_p": 0.95,
        "repetition_penalty": 1.2,
        "top_k": 50,
        "truncate": 3072,
        "max_new_tokens": 4096,
        "stop": ["</s>"]
      },
      "endpoints": [{
         "url": "http://127.0.0.1:8080",
         "type": "llamacpp"
        }
      ]
  }
]`

This is my .env.local; it's probably not formatted perfectly, but it works fine. I use mythomax-l2-13b.Q8_0.gguf and found it very good at search query summarization.
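For reference, starting the llama.cpp example server like this would match the http://127.0.0.1:8080 endpoint above (the model path and context size are illustrative; adjust them to your setup):

./server -m ./models/mythomax-l2-13b.Q8_0.gguf -c 3072 --port 8080 --host 127.0.0.1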

AlexBlack2202 commented 11 months ago

> This is my .env.local; it's probably not formatted perfectly, but it works fine. I use mythomax-l2-13b.Q8_0.gguf and found it very good at search query summarization.

Thank you, it helped me a lot.

iChristGit commented 11 months ago

Any leads on this? @nsarrazin

aiasophia commented 10 months ago

I get the same with type 'openai'. It completely ignores the search results in the response.

nsarrazin commented 10 months ago

Seems like an issue with the openai endpoint type. What server do you use with that endpoint, @aiasophia? Is it also text-generation-webui?

It would be good to see whether another server of the same openai type works fine, to verify if it's an issue with text-generation-webui or with all openai endpoints.

aiasophia commented 10 months ago

@nsarrazin Ours is deployed to Azure OpenAI, so it's not run on any local server.

nsarrazin commented 10 months ago

Sounds like an issue with the OpenAI endpoint then. I'll have a look!

nsarrazin commented 10 months ago

Indeed, we were not passing the search results to the openai endpoint type when the default chat_completions mode was used. I just merged #608 to fix the issue, so it should now be working.

@aiasophia @iChristGit @AlexBlack2202 Can you tell me if this now works for you when using the openai endpoint on latest main? It worked for me using gpt-3.5-turbo, but it'd be good to confirm with other setups :smile:
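Conceptually, the fix boils down to injecting the web search context into the messages sent to the chat completions API. A simplified sketch of the idea, not the actual #608 patch (all names here are illustrative):

// Sketch: surface web search results to the model by prepending them
// to the conversation sent to an OpenAI-compatible chat completions API.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildMessages(history: Message[], webSearchContext?: string): Message[] {
  // Without search context, forward the conversation unchanged.
  if (!webSearchContext) return history;
  // With search context, add it as an extra system message so the model
  // answers from the retrieved sources instead of its own priors.
  return [
    { role: "system", content: `Use these web search results when answering:\n${webSearchContext}` },
    ...history,
  ];
}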

aiasophia commented 10 months ago

@nsarrazin Thank you, it works for us :)

nsarrazin commented 10 months ago

Closing this issue for now, if it doesn't work let me know and I'll reopen it!

iChristGit commented 10 months ago

Yes, it does work! Now I can use text-generation-webui correctly. Thank you! (screenshot)

iChristGit commented 10 months ago

> Closing this issue for now, if it doesn't work let me know and I'll reopen it!

A quick question: I am trying to run this model: https://huggingface.co/TheBloke/dolphin-2_2-yi-34b-GPTQ As I understand it, the instruction template is set inside text-generation-webui, but the model keeps spitting out text, as if it can't stop until it hits max tokens. I used the ChatML template in the webui; what should I set inside .env.local instead of "stop": []?

nsarrazin commented 10 months ago

Maybe "stop":["<|im_end|>"] ?

iChristGit commented 10 months ago

Maybe "stop":["<|im_end|>"] ?

I have tried it, but I'm still getting very long responses. I want to keep max_new_tokens at around 2048 but have it stop when needed.

iChristGit commented 10 months ago

Maybe "stop":["<|im_end|>"] ?

This is an example:

(Verse 1)
Oh, Mcdonalds, where dreams come true,
A place to unwind, a place to renew.
From burgers to fries, to the sundaes divine,
Every taste buds's tune we'll hum in time.

(Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Verse 2)
Through winter or summer, through night or day,
Mcdonalds's magic, we can't delay.
Nachos grande, Big Mac's fame,
Quenching cravings, it's a game name.

(Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Bridge)
Oh, we dance, we sing, we shout and roar,
For Mcdonalds, we forevermore.
Through joy and sorrow, through wind and rain,
Our love for Mcdonalds never wanes.

(Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Outro)
So here's to Mcdonalds, our song of praise,
For love and unity, we raise.
A beacon of hope, a dream come true,
Forever and always, we'll sing of you.

(Ending Verse)
Oh Mcdonalds, our hearts beat in sync,
Your love we feel, your magic we sink.
For every taste bud, let's raise a cheer,
To Mcdonalds, our love we'll hold dear.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Ending Verse)
And as we part, our hearts will sing,
For Mcdonalds, our song we'll bring.
For every taste bud, let's raise a cheer,
To Mcdonalds, our love we'll hold dear.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Ending Verse)
For Mcdonalds, we'll compose a tune,
Our hearts beat fast, our love runs deep.
For every taste bud, let's raise a cheer,
To Mcdonalds, our love we'll hold dear.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize.

(Ending Verse)
And so we sing, our song of praise,
For Mcdonalds, our love we raise.
For every taste bud, let's raise a cheer,
To Mcdonalds, our love we'll hold dear.

(Outro Chorus)
Mcdonalds, Mcdonalds, you're our delight,
A golden arches's shining light.
The taste of happiness, a treasure in disguise,
In every bite, your love we prize

This is the .env.local I currently use. I have tried a higher repetition penalty, but it just switches up the words and keeps going:

MONGODB_URL=mongodb://localhost:27017
USE_LOCAL_WEBSEARCH=true
PUBLIC_APP_ASSETS=chatui
HF_ACCESS_TOKEN=hf_none
PUBLIC_APP_DESCRIPTION="ChatGPT But Open Source!"
PUBLIC_APP_NAME=ChatGPT
MODELS=`[
  {
      "name": "Mythalion-13B",
      "description": "Mythalion is a great overall model",
      "chatPromptTemplate": "{{preprompt}}\nInstruction:{{#each messages}}{{#ifUser}}{{content}}\nResponse:{{/ifUser}}{{#ifAssistant}}{{content}}{{/ifAssistant}}{{/each}}",
       "preprompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.",
       "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      }, {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python and give explanations for each step."
      }, {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
      ],
      "parameters": {
        "temperature": 0.7,
        "top_p": 0.95,
        "repetition_penalty": 1.2,
        "top_k": 50,
        "truncate": 1000,
        "max_new_tokens": 1024,
        "stop": ["<|im_end|>"]
    },
    "endpoints": [{
      "type" : "openai",
      "baseURL": "http://127.0.0.1:5000/v1"
    }]
  }
]`