huggingface / chat-ui

Open source codebase powering the HuggingChat app
https://huggingface.co/chat
Apache License 2.0

Local Deployment Auth Issue #330

Closed: toby-lm closed this issue 4 months ago

toby-lm commented 1 year ago

Hi, I've run into a lot of issues trying to get this running locally with docker compose, using a Text Generation Inference API + model and a MongoDB container.

Errors

Initially I hit issue #252, where the modal cannot be dismissed. After hardcoding ethicsModalAcceptedAt to dismiss it, I got a 500 error on the /settings page that the app redirects to after accepting the modal. I could still navigate to the chat homepage, but starting a new chat gave a 403 Forbidden error.

After some debugging in the source code, I added the following line to the Dockerfile (just before the npm run build command), and conversations started working:

RUN find /app/src/routes/conversation/ -type f | xargs sed -i 's/...authCondition(locals),/\/\/ ...authCondition(locals),/g'

Disabling PUBLIC_APP_DISCLAIMER then let me access the UI without any visible errors (this was not possible before adding the sed command).
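For anyone reading along, here is a small illustration of what that sed invocation does. The sample line is hypothetical but mirrors the `...authCondition(locals),` spreads in the files under src/routes/conversation/; note that each `.` in the sed pattern is a BRE wildcard matching any character, which happens to also match the literal `...` spread:

```shell
# Hypothetical sample line mirroring the MongoDB filters in the route files
line='    ...authCondition(locals),'

# The same substitution the Dockerfile RUN line applies in place
printf '%s\n' "$line" | sed 's/...authCondition(locals),/\/\/ ...authCondition(locals),/g'
# prints:     // ...authCondition(locals),
```

In other words, the patch simply comments out every auth filter, which likely explains why the 403s disappear but also why conversations are no longer tied to a session or user.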

I'm not familiar enough with Svelte and TypeScript to work out a proper fix, but this is what I've found so far. I've confirmed that MongoDB access works, and the issue reproduces in Safari, Firefox, and Chrome. I'm accessing the chat UI from macOS; it runs on a Linux machine on my local network.

During conversations, the following error appears in the logs (it seems similar to #274), and conversations are neither saved nor given a name (though that might be because I've commented out the authCondition calls):

04:48:31 0|index  | InferenceOutputError: Invalid inference output: Expected Array<{generated_text: string}>. Use the 'request' method with the same parameters to do a custom call with no type checking.
04:48:31 0|index  |     at textGeneration (file:///app/node_modules/@huggingface/inference/dist/index.mjs:460:11)
04:48:31 0|index  |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
04:48:31 0|index  |     at async generateFromDefaultEndpoint (file:///app/build/server/chunks/generateFromDefaultEndpoint-22b35e73.js:13:28)
04:48:31 0|index  |     at async POST (file:///app/build/server/chunks/_server.ts-d70abb06.js:27:26)
04:48:31 0|index  |     at async render_endpoint (file:///app/build/server/index.js:1418:22)
04:48:31 0|index  |     at async resolve (file:///app/build/server/index.js:3817:22)
04:48:31 0|index  |     at async Object.handle (file:///app/build/server/chunks/hooks.server-d972d25a.js:63:20)
04:48:31 0|index  |     at async respond (file:///app/build/server/index.js:3710:22)
04:48:31 0|index  |     at async Array.ssr (file:///app/build/handler.js:1207:3)
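One hedged observation (not confirmed against this chat-ui version): the stack trace points at generateFromDefaultEndpoint, which calls textGeneration from @huggingface/inference and expects a plain JSON array of {generated_text} objects. If that non-streaming call is pointed at TGI's /generate_stream route, it receives server-sent events instead, which would fail exactly this type check. It may be worth trying the endpoint URL without the path and letting the client pick the route:

```
"endpoints": [
    {
        "url": "http://falcon:80"
    }
]
```

If that doesn't match the version in use, TGI's non-streaming /generate route is the other candidate to try.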

Setup

I hope that's an understandable overview of my issues. Here is a minimal reproduction of my setup:

docker-compose.yml

version: '3.8'  # note: '3.x' is not a valid Compose file version
services:
  mongodb:
    image: mongo:latest
    container_name: mongodb
    volumes:
      - ./mongo:/data/db
    ports:
      - "27017:27017"
  chatui:
    build: 
      context: ./chat-ui
      dockerfile: Dockerfile
    container_name: chatui
    ports:
      - "3000:3000"
    depends_on:
      - mongodb
  falcon:
    image: ghcr.io/huggingface/text-generation-inference:0.8
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            device_ids: ['0', '1']
            capabilities: [gpu]
    shm_size: '1g'
    container_name: falcon
    volumes:
      - ./data:/data
    ports:
      - "8080:80"
    command: "--model-id tiiuae/falcon-40b --num-shard 2"
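One side note (an assumption on my part, not something from the report above): depends_on only orders container startup, so chatui can come up before MongoDB is actually accepting connections. A sketch of a healthcheck, assuming a recent mongo image that ships mongosh and a Compose version supporting the long depends_on syntax:

```yaml
# hypothetical additions to merge into the services above
  mongodb:
    healthcheck:
      test: ["CMD", "mongosh", "--quiet", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      retries: 5
  chatui:
    depends_on:
      mongodb:
        condition: service_healthy   # requires long-syntax depends_on support
```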

.env.local

MONGODB_URL=mongodb://mongodb:27017/
MONGODB_DB_NAME=chat
MONGODB_DIRECT_CONNECTION=false

COOKIE_NAME=hf-chat
HF_ACCESS_TOKEN=hf_XXXXXXXXXXXXXXXX

# used to activate search with web functionality. disabled if not defined
SERPAPI_KEY=#your serpapi key here

# Parameters to enable "Sign in with HF"
OPENID_CLIENT_ID=
OPENID_CLIENT_SECRET=
OPENID_SCOPES= # Add "email" for some providers like Google that do not provide preferred_username
OPENID_PROVIDER_URL= #https://huggingface.co # for Google, use https://accounts.google.com

# 'name', 'userMessageToken', 'assistantMessageToken' are required
MODELS=`[
  {
    "name": "tiiuae/falcon-40b",
    "userMessageToken": "<|prompter|>",
    "assistantMessageToken": "<|assistant|>",
    "messageEndToken": "</s>",
    "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
    "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      }, {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python, give explanations for each step."
      }, {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
    ],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024
    },
    "endpoints": [
        {
            "url": "http://falcon:80/generate_stream"
        }
    ]
  }
]`
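For context on what these tokens do, here is a hedged sketch (not chat-ui's actual templating code, and with the preprompt abbreviated) of how a single-turn prompt is typically assembled from them before being sent to the endpoint:

```shell
# Hypothetical illustration of prompt assembly from the MODELS entry above
USER_TOKEN='<|prompter|>'
ASSISTANT_TOKEN='<|assistant|>'
END_TOKEN='</s>'
PREPROMPT='Below are a series of dialogues...'   # abbreviated

# preprompt, then the user turn, then the assistant token the model completes after
prompt="${PREPROMPT}${USER_TOKEN}How do I make a lemon cheesecake?${END_TOKEN}${ASSISTANT_TOKEN}"
printf '%s\n' "$prompt"
```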
OLD_MODELS=`[]`# any removed models, `{ name: string, displayName?: string, id?: string }`

PUBLIC_ORIGIN=http://192.168.1.10:3000 #https://huggingface.co

PUBLIC_SHARE_PREFIX= #https://hf.co/chat
PUBLIC_GOOGLE_ANALYTICS_ID= #G-XXXXXXXX / Leave empty to disable
PUBLIC_DEPRECATED_GOOGLE_ANALYTICS_ID= #UA-XXXXXXXX-X / Leave empty to disable
PUBLIC_ANNOUNCEMENT_BANNERS=
PARQUET_EXPORT_DATASET=
PARQUET_EXPORT_HF_TOKEN=
PARQUET_EXPORT_SECRET=

PUBLIC_APP_NAME=ChatUI # name used as title throughout the app
PUBLIC_APP_ASSETS=chatui # used to find logos & favicons in static/$PUBLIC_APP_ASSETS
PUBLIC_APP_COLOR=blue # can be any of tailwind colors: https://tailwindcss.com/docs/customizing-colors#default-color-palette
PUBLIC_APP_DATA_SHARING= #set to 1 to enable options & text regarding data sharing
PUBLIC_APP_DISCLAIMER= #

The Dockerfile for chat-ui is unchanged apart from the additional line to run the sed command explained above.

If you have a working example of a fully local pipeline with text-generation-inference and MongoDB, preferably in Docker, that would also be super useful. Let me know what other information you need to debug or reproduce this.

bodaay commented 1 year ago

I did an all-in-one version; a Dockerfile and a RunPod setup are both available:

https://github.com/bodaay/HuggingChatAllInOne

MartinGleize commented 1 year ago

> did all in one, dockerfile and runpod both available:
>
> https://github.com/bodaay/HuggingChatAllInOne

Thanks! I can confirm applying the patches you added in your repo fixed my issues in a remote deployment setting.

nsarrazin commented 11 months ago

If you have a fix we would happily review a PR for it!

mbuet2ner commented 11 months ago

Maybe this helps someone: I recently wrote a blog post about self-hosting ChatUI + TGI via docker-compose, with SSL through an nginx reverse proxy. The files are available here 🙂.