ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

rayllm's frontend doesn't work properly with the anyscale/ray-llm:0.4.0 image #89

Closed (k0286 closed this issue 9 months ago)

k0286 commented 10 months ago

Reproduce procedure

  1. Deploy rayllm locally:

    cache_dir=${XDG_CACHE_HOME:-$HOME/.cache}
    docker run -it --gpus all --shm-size 1g -p 8000:8000 -e HF_HOME=~/data -v $cache_dir:~/data anyscale/ray-llm:0.4.0 bash

Inside the Docker container:

    serve run ~/serve_configs/amazon--LightGPT.yaml --host 0.0.0.0 --non-blocking
    export AVIARY_URL=http://localhost:8000
    serve run rayllm.frontend.app:app --blocking --host 0.0.0.0

  2. The frontend fails to start with the following error message:
```shell
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 439, in result
(ServeController pid=683)     return self.__get_result()
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
(ServeController pid=683)     raise self._exception
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 442, in initialize_and_get_metadata
(ServeController pid=683)     raise RuntimeError(traceback.format_exc()) from None
(ServeController pid=683) RuntimeError: Traceback (most recent call last):
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 430, in initialize_and_get_metadata
(ServeController pid=683)     await self._initialize_replica()
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 190, in initialize_replica
(ServeController pid=683)     await sync_to_async(_callable.__init__)(*init_args, **init_kwargs)
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/app.py", line 470, in __init__
(ServeController pid=683)     blocks = builder()
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/app.py", line 323, in gradio_app_builder
(ServeController pid=683)     JavaScriptLoader()
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/javascript_loader.py", line 38, in __init__
(ServeController pid=683)     self.load_js()
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/javascript_loader.py", line 42, in load_js
(ServeController pid=683)     js_scripts = ScriptLoader.get_scripts(self.path, self.script_type)
(ServeController pid=683)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/javascript_loader.py", line 25, in get_scripts
(ServeController pid=683)     dir_list = [os.path.join(path, f) for f in os.listdir(path)]
(ServeController pid=683) FileNotFoundError: [Errno 2] No such file or directory: '/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/javascript'
```
  3. Manually add javascript/aviary.js from the repo to /home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/. The frontend then fails with the following error message:

    (ServeController pid=2203) ERROR 2023-11-02 19:24:56,145 controller 2203 deployment_state.py:617 - Exception in replica 'default#AviaryFrontend#sVQxkR', the replica will be stopped.
    (ServeController pid=2203) Traceback (most recent call last):
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/deployment_state.py", line 615, in check_ready
    (ServeController pid=2203)     _, self._version = ray.get(self._ready_obj_ref)
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    (ServeController pid=2203)     return fn(*args, **kwargs)
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    (ServeController pid=2203)     return func(*args, **kwargs)
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/worker.py", line 2547, in get
    (ServeController pid=2203)     raise value.as_instanceof_cause()
    (ServeController pid=2203) ray.exceptions.RayTaskError(RuntimeError): ray::ServeReplica:default:AviaryFrontend.initialize_and_get_metadata() (pid=2617, ip=172.23.0.3, actor_id=80f2435baff9090308abf9ee08000000, repr=<ray.serve._private.replica.ServeReplica:default:AviaryFrontend object at 0x7f54fdad1130>)
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 446, in result
    (ServeController pid=2203)     return self.__get_result()
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    (ServeController pid=2203)     raise self._exception
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 442, in initialize_and_get_metadata
    (ServeController pid=2203)     raise RuntimeError(traceback.format_exc()) from None
    (ServeController pid=2203) RuntimeError: Traceback (most recent call last):
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 430, in initialize_and_get_metadata
    (ServeController pid=2203)     await self._initialize_replica()
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 190, in initialize_replica
    (ServeController pid=2203)     await sync_to_async(_callable.__init__)(*init_args, **init_kwargs)
    (ServeController pid=2203)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/app.py", line 487, in __init__
    (ServeController pid=2203)     blocks._queue.set_url(f"http://localhost:{port}{route_prefix}/")
    (ServeController pid=2203) AttributeError: 'Queue' object has no attribute 'set_url'

  4. Edit /home/ray/anaconda3/lib/python3.9/site-packages/rayllm/frontend/app.py, comment out lines 487-488, and restart the frontend:

    487         #blocks._queue.set_url(f"http://localhost:{port}{route_prefix}/")
    488         #blocks._queue.set_url = noop

  5. The frontend now appears to start successfully in the terminal, but the page is broken when opened in a browser.
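
Both failures above can be checked for before starting the frontend. The following is only a minimal diagnostic sketch, not part of rayllm: the helper names `frontend_package_dir`, `missing_asset_dirs`, and `has_method` are hypothetical and introduced here for illustration. It assumes the only required asset directory is `javascript` (the one named in the `FileNotFoundError`), and it does not claim to identify why `set_url` is missing from the `Queue` object.

```python
import importlib.util
from pathlib import Path


def frontend_package_dir(module_name="rayllm.frontend"):
    """Return the installed package directory, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when the parent package is missing or fails to import.
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def missing_asset_dirs(package_dir, required=("javascript",)):
    """List the names in `required` that are not directories under package_dir.

    The default of ("javascript",) is an assumption taken from the traceback.
    """
    root = Path(package_dir)
    return [name for name in required if not (root / name).is_dir()]


def has_method(obj, name):
    """True if `obj` exposes a callable attribute called `name`."""
    return callable(getattr(obj, name, None))


if __name__ == "__main__":
    pkg = frontend_package_dir()
    if pkg is None:
        print("rayllm.frontend is not importable in this environment")
    else:
        print("package dir:", pkg)
        print("missing asset dirs:", missing_asset_dirs(pkg))
```

Run inside the container, this would report whether the `javascript` directory shipped with the installed package. A guard such as `if has_method(blocks._queue, "set_url"): ...` around the call on line 487 of app.py would avoid the `AttributeError` without commenting the lines out, although it would not by itself explain or fix the broken page from step 5.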