lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Add Gradio web server to Docker Compose #1541

Open MilesQLi opened 1 year ago

darribas commented 1 year ago

I'd love to see this too!

I'm playing with the docker-compose.yml to expand it with:

  fastchat-gradio-server:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      FASTCHAT_CONTROLLER_URL: http://fastchat-controller:21001
    image: fastchat:latest
    depends_on:
      fastchat-controller:
        condition: service_started
      fastchat-model-worker:
        condition: service_started
      fastchat-api-server:
        condition: service_started
    ports:
      - "8001:8001"
    entrypoint: ["python3", "-m", "fastchat.serve.gradio_web_server", "--host", "0.0.0.0", "--port", "8001"]

But I hit connection errors and the service exits, see logs below:

Server log (repeated stderr prefix and source-context lines trimmed):

```python
fastchat-fastchat-gradio-server-1 | 2023-06-02 15:20:49 | INFO | gradio_web_server | args: Namespace(add_bard=False, add_chatgpt=False, add_claude=False, concurrency_count=10, controller_url='http://localhost:21001', host='0.0.0.0', model_list_mode='once', moderate=False, port=8001, share=False)

Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/urllib3/connection.py:200 in _new_conn
  ❱ 200 sock = connection.create_connection(
/usr/local/lib/python3.8/dist-packages/urllib3/util/connection.py:85 in create_connection
  ❱ 85 raise err
/usr/local/lib/python3.8/dist-packages/urllib3/util/connection.py:73 in create_connection
  ❱ 73 sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:790 in urlopen
  ❱ 790 response = self._make_request(
/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:496 in _make_request
  ❱ 496 conn.request(
/usr/local/lib/python3.8/dist-packages/urllib3/connection.py:388 in request
  ❱ 388 self.endheaders()
/usr/lib/python3.8/http/client.py:1251 in endheaders
  ❱ 1251 self._send_output(message_body, encode_chunked=encode_chunked
/usr/lib/python3.8/http/client.py:1011 in _send_output
  ❱ 1011 self.send(msg)
/usr/lib/python3.8/http/client.py:951 in send
  ❱ 951 self.connect()
/usr/local/lib/python3.8/dist-packages/urllib3/connection.py:236 in connect
  ❱ 236 self.sock = self._new_conn()
/usr/local/lib/python3.8/dist-packages/urllib3/connection.py:215 in _new_conn
  ❱ 215 raise NewConnectionError(
NewConnectionError: : Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last)
/usr/local/lib/python3.8/dist-packages/requests/adapters.py:486 in send
  ❱ 486 resp = conn.urlopen(
/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py:844 in urlopen
  ❱ 844 retries = retries.increment(
/usr/local/lib/python3.8/dist-packages/urllib3/util/retry.py:515 in increment
  ❱ 515 raise MaxRetryError(_pool, url, reason) from reason # typ
MaxRetryError: HTTPConnectionPool(host='localhost', port=21001): Max retries exceeded with url: /refresh_all_workers (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last)
/usr/lib/python3.8/runpy.py:194 in _run_module_as_main
  ❱ 194 return _run_code(code, main_globals, None,
/usr/lib/python3.8/runpy.py:87 in _run_code
  ❱ 87 exec(code, run_globals)
/usr/local/lib/python3.8/dist-packages/fastchat/serve/gradio_web_server.py:686 in
  ❱ 686 models = get_model_list(args.controller_url)
/usr/local/lib/python3.8/dist-packages/fastchat/serve/gradio_web_server.py:108 in get_model_list
  ❱ 108 ret = requests.post(controller_url + "/refresh_all_workers")
/usr/local/lib/python3.8/dist-packages/requests/api.py:115 in post
  ❱ 115 return request("post", url, data=data, json=json, **kwargs)
/usr/local/lib/python3.8/dist-packages/requests/api.py:59 in request
  ❱ 59 return session.request(method=method, url=url, **kwargs)
/usr/local/lib/python3.8/dist-packages/requests/sessions.py:589 in request
  ❱ 589 resp = self.send(prep, **send_kwargs)
/usr/local/lib/python3.8/dist-packages/requests/sessions.py:703 in send
  ❱ 703 r = adapter.send(request, **kwargs)
```

Does anyone have a suggestion to get it working? It'd be so cool!

darribas commented 1 year ago

UPDATE: solved it by overriding the default controller URL (http://localhost:21001) with an explicit --controller-url in the entrypoint:

      entrypoint: ["python3", "-m", "fastchat.serve.gradio_web_server", "--host", "0.0.0.0", "--port", "8001", "--controller-url", "http://fastchat-controller:21001"]
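
The args line in the log still shows controller_url='http://localhost:21001' even though FASTCHAT_CONTROLLER_URL was set, which suggests the URL comes from a command-line flag with a localhost default. Inside a container, localhost is the container itself, hence the refused connection. A minimal sketch of that behaviour, using a stand-in argparse parser (build_parser is hypothetical, not FastChat's actual code):

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Stand-in parser (hypothetical): mirrors only flags and the default seen in the log.
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", default="0.0.0.0")
    parser.add_argument("--port", type=int, default=8001)
    parser.add_argument("--controller-url", default="http://localhost:21001")
    return parser


# Setting the environment variable alone changes nothing for this parser.
os.environ["FASTCHAT_CONTROLLER_URL"] = "http://fastchat-controller:21001"

without_flag = build_parser().parse_args(["--host", "0.0.0.0", "--port", "8001"])
with_flag = build_parser().parse_args(
    ["--port", "8001", "--controller-url", "http://fastchat-controller:21001"]
)

print(without_flag.controller_url)  # http://localhost:21001
print(with_flag.controller_url)     # http://fastchat-controller:21001
```

Passing the flag explicitly, as in the entrypoint above, is what makes the server reach the controller by its Compose service name.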

ghost commented 1 year ago

 entrypoint: ["python3.9" 

This worked for me.

samadwar commented 1 year ago

I tried the docker-compose.yaml below:

version: "3.9"

services:
  fastchat-controller:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "21001:21001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.controller", "--host", "0.0.0.0", "--port", "21001"]
  fastchat-model-worker:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - huggingface:/root/.cache/huggingface
    image: fastchat:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    entrypoint: ["python3.9", "-m", "fastchat.serve.model_worker", "--model-names", "${FASTCHAT_WORKER_MODEL_NAMES:-vicuna-7b-v1.3}", "--model-path", "${FASTCHAT_WORKER_MODEL_PATH:-lmsys/vicuna-7b-v1.3}", "--worker-address", "http://fastchat-model-worker:21002", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "21002"]
  fastchat-gradio-server:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "8000:8000"
    entrypoint: ["python3.9", "-m", "fastchat.serve.gradio_web_server", "--controller-url", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8000"]
volumes:
  huggingface:

Once the worker and controller are up, I have to restart the Gradio server to see the model. I do that with docker compose restart fastchat-gradio-server

However, I don't see a text box or edit box after that. The queuing symbol keeps spinning.

Does anyone know why?

I am running commit cb04e95.
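
The manual restart is needed because the web server asks the controller for models before the worker has registered. One way to avoid it might be to poll the controller until it reports a model and only then launch the web server. A rough sketch, assuming the /list_models endpoint and the "models" key that appear in the traceback earlier in this thread; wait_for_models and list_models are hypothetical helper names:

```python
import json
import time
import urllib.request
from typing import Callable, List


def list_models(controller_url: str) -> List[str]:
    # POST to the controller's /list_models endpoint (the endpoint seen in the traceback).
    req = urllib.request.Request(controller_url + "/list_models", data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["models"]


def wait_for_models(
    fetch: Callable[[], List[str]], attempts: int = 30, delay: float = 2.0
) -> List[str]:
    # Poll until at least one model is registered; connection errors count as "not ready yet".
    for _ in range(attempts):
        try:
            models = fetch()
        except OSError:
            models = []
        if models:
            return models
        time.sleep(delay)
    raise TimeoutError("no models registered with the controller")
```

Usage would look like wait_for_models(lambda: list_models("http://fastchat-controller:21001")), run before starting the Gradio process. The fetch callable is injected so the polling loop can be exercised without a running controller.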

samadwar commented 1 year ago

Got it working after a few changes:

version: "3.9"

services:
  fastchat-controller:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "21001:21001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.controller", "--host", "0.0.0.0", "--port", "21001"]
  fastchat-model-worker:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - huggingface:/root/.cache/huggingface
    environment:
      FASTCHAT_CONTROLLER_URL: http://fastchat-controller:21001
    image: fastchat:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    entrypoint: ["python3.9", "-m", "fastchat.serve.model_worker", "--model-names", "${FASTCHAT_WORKER_MODEL_NAMES:-vicuna-7b-v1.3}", "--model-path", "${FASTCHAT_WORKER_MODEL_PATH:-lmsys/vicuna-7b-v1.3}", "--worker-address", "http://fastchat-model-worker:21002", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "21002"]
  fastchat-api-server:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      FASTCHAT_CONTROLLER_URL: http://fastchat-controller:21001
    image: fastchat:latest
    ports:
      - "8000:8000"
    entrypoint: ["python3.9", "-m", "fastchat.serve.openai_api_server", "--controller-address", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8000"]
  fastchat-gradio-server:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      FASTCHAT_CONTROLLER_URL: http://fastchat-controller:21001
    image: fastchat:latest
    depends_on:
      fastchat-controller:
        condition: service_started
      fastchat-model-worker:
        condition: service_started
      fastchat-api-server:
        condition: service_started
    ports:
      - "8001:8001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.gradio_web_server", "--controller-url", "http://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8001", "--model-list-mode", "reload"]
volumes:
  huggingface:
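
The worker entrypoint takes its model from ${FASTCHAT_WORKER_MODEL_NAMES:-...} and ${FASTCHAT_WORKER_MODEL_PATH:-...}, so setting those variables before docker compose up should switch models without editing the file. The sketch below only emulates the ${VAR:-default} form to show what the entrypoint resolves to; interpolate is a hypothetical helper and my-org/my-model is a placeholder, not a real model:

```python
import re
from typing import Mapping

# Matches ${NAME:-default}; other Compose interpolation forms are not handled here.
_PATTERN = re.compile(r"\$\{(\w+):-([^}]*)\}")


def interpolate(value: str, env: Mapping[str, str]) -> str:
    # Use the default when the variable is unset or empty, as the ":-" form does.
    return _PATTERN.sub(lambda m: env.get(m.group(1)) or m.group(2), value)


model_path = "${FASTCHAT_WORKER_MODEL_PATH:-lmsys/vicuna-7b-v1.3}"

print(interpolate(model_path, {}))
# lmsys/vicuna-7b-v1.3
print(interpolate(model_path, {"FASTCHAT_WORKER_MODEL_PATH": "my-org/my-model"}))
# my-org/my-model
```

The same applies to FASTCHAT_WORKER_MODEL_NAMES, which sets the name the worker registers under.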

And the Dockerfile:

FROM nvidia/cuda:11.7.1-runtime-ubuntu20.04

RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils curl
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
RUN pip3 install fschat pydantic==1.10.1
RUN pip3 install --force-reinstall typing-extensions==4.5.0

thekevshow commented 1 year ago

Hey, this is great. I want to try running the Llama 2 models. Do you know how we would tweak this to allow for authentication and load a different model? I haven't dug deep into this repo; perhaps other files have to be changed as well?