ray-project / ray-llm

RayLLM - LLMs on Ray
https://aviary.anyscale.com
Apache License 2.0

issue with run locally #61

Open omlomloml opened 11 months ago

omlomloml commented 11 months ago

I tried to run inside the latest image, but after the model warmup it just died with no error. The command I ran was:

aviary run --model ~/models/continuous_batching/mosaicml--mpt-7b-chat.yaml

The only change inside the YAML was to remove ray_actor_options: num_gpus: 1, since I don't have 'accelerator_type_a10'; I have an A6000. Here is the end of the logs:

ve taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Downloaded /home/ray/data/hub/models--mosaicml--mpt-7b-chat/snapshots/64e5c9c9fb53a8e89690c2dee75a5add37f7113e/pytorch_model-00001-of-00002.bin in 0:02:35.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download: [1/2] -- ETA: 0:02:35
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download file: pytorch_model-00002-of-00002.bin
(ServeController pid=30116) WARNING 2023-10-02 06:40:38,770 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(ServeController pid=30116) WARNING 2023-10-02 06:41:08,775 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Downloaded /home/ray/data/hub/models--mosaicml--mpt-7b-chat/snapshots/64e5c9c9fb53a8e89690c2dee75a5add37f7113e/pytorch_model-00002-of-00002.bin in 0:00:58.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download: [2/2] -- ETA: 0
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) No safetensors weights found for model mosaicml/mpt-7b-chat at revision None. Converting PyTorch weights to safetensors.
(ServeController pid=30116) WARNING 2023-10-02 06:41:38,862 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Convert: [1/2] -- Took: 0:00:20.415345
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Convert: [2/2] -- Took: 0:00:06.243851
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:05,045] tgi.py: 214  Warming up model on workers...
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) [INFO 2023-10-02 06:42:05,054] tgi_worker.py: 650  Model is warming up. Num requests: 3 Prefill tokens: 6000 Max batch total tokens: None
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) [INFO 2023-10-02 06:42:07,307] tgi_worker.py: 663  Model finished warming up (max_batch_total_tokens=None) and is ready to serve requests.
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:07,520] tgi.py: 170  Rolling over to new worker group [Actor(AviaryTGIInferenceWorker, 725292a8070301f947130c2c01000000)]
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:07,661] model_app.py: 83  Reconfigured and ready to serve.
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) DeprecationWarning: `ray.state.actors` is a private attribute and access will be removed in a future Ray version.
/home/ray/anaconda3/lib/python3.9/tempfile.py:821: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmptyp67o3t'>
  _warnings.warn(warn_message, ResourceWarning)
/home/ray/anaconda3/lib/python3.9/subprocess.py:1052: ResourceWarning: subprocess 28960 is still running
  _warn("subprocess %s is still running" % self.pid,
ResourceWarning: Enable tracemalloc to get the object allocation traceback

(base) ray@4cd79d6dad32:~$

nobody4t commented 10 months ago

Do you have a process still running in the background? Your log ends with:

ResourceWarning: subprocess 28960 is still running
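
One way to check whether that PID is still alive from inside the container is sketched below. This is only an illustration, not part of RayLLM or Aviary: the helper name `pid_is_running` is made up for this example, and it assumes a POSIX system such as the Linux container shown in the logs.

```python
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only).

    Hypothetical helper for illustration; not part of RayLLM or Ray.
    """
    if pid <= 0:
        # Non-positive values address process groups in kill(2), so reject them.
        raise ValueError("pid must be a positive integer")
    try:
        # Signal 0 delivers nothing; the call only checks existence and permissions.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but is owned by another user
    return True
```

Calling `pid_is_running(28960)` in the same container would show whether the subprocess named in the warning is still around. Note that a zombie process that has exited but has not been reaped still counts as existing under this check.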