xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

BUG: Could not start custom registered model: sdxl-turbo #1620

Open majestichou opened 1 month ago

majestichou commented 1 month ago

Describe the bug

Due to network restrictions, Xinference cannot pull models online in my environment. I downloaded the sdxl-turbo model weights to the local machine and then used Xinference (running in a Docker container) to register them as a custom image model named sdxl-turbo-self. When I launched sdxl-turbo-self, the startup failed with the following error: `Server error: 5000 - [address=0.0.0.0:58812, pid=95] list index out of range`

To Reproduce

To help us reproduce this bug, please provide the information below:

  1. I use the Docker image `xprobe/xinference:v0.12.0`.
  2. The full stack trace of the error is as follows:
    
    2024-06-12 10:56:56,758 xinference.core.supervisor 95 INFO     Xinference supervisor 0.0.0.0:30305 started
    2024-06-12 10:56:58,503 xinference.core.worker 95 INFO     Starting metrics export server at 0.0.0.0:None
    2024-06-12 10:56:58,506 xinference.core.worker 95 INFO     Checking metrics export server...
    2024-06-12 10:57:01,299 xinference.core.worker 95 INFO     Metrics server is started at: http://0.0.0.0:38729
    2024-06-12 10:57:01,302 xinference.core.worker 95 INFO     Xinference worker 0.0.0.0:30305 started
    2024-06-12 10:57:01,303 xinference.core.worker 95 INFO     Purge cache directory: /root/.xinference/cache
    2024-06-12 10:57:03,238 xinference.api.restful_api 1 INFO     Starting Xinference at endpoint: http://0.0.0.0:9997
    2024-06-12 10:59:36,355 xinference.model.image.core 95 WARNING  Cannot find builtin image model spec: sdxl-turbo-self
    2024-06-12 10:59:36,468 xinference.model.image.core 1 WARNING  Cannot find builtin image model spec: sdxl-turbo-self
    2024-06-12 10:59:42,323 xinference.model.utils 95 INFO     Model caching from URI: /root/models/sdxl-turbo
    2024-06-12 10:59:42,324 xinference.core.worker 95 ERROR    Failed to load model sdxl-turbo-self-1-0
    Traceback (most recent call last):
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/worker.py", line 654, in launch_builtin_model
    await self.update_cache_status(model_name, model_description)
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/worker.py", line 552, in update_cache_status
    model_path = version_info[0]["model_file_location"]
    IndexError: list index out of range
    2024-06-12 10:59:42,360 xinference.api.restful_api 1 ERROR    [address=0.0.0.0:30305, pid=95] list index out of range
    Traceback (most recent call last):
    File "/opt/conda/lib/python3.10/site-packages/xinference/api/restful_api.py", line 750, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
    File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
    File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
    File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 659, in send
    result = await self._run_coro(message.message_id, coro)
    File "/opt/conda/lib/python3.10/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
    return await coro
    File "/opt/conda/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
    File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/supervisor.py", line 837, in launch_builtin_model
    await _launch_model()
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/supervisor.py", line 801, in _launch_model
    await _launch_one_model(rep_model_uid)
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/supervisor.py", line 782, in _launch_one_model
    await worker_ref.launch_builtin_model(
    File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
    File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/worker.py", line 654, in launch_builtin_model
    await self.update_cache_status(model_name, model_description)
    File "/opt/conda/lib/python3.10/site-packages/xinference/core/worker.py", line 552, in update_cache_status
    model_path = version_info[0]["model_file_location"]
    IndexError: [address=0.0.0.0:30305, pid=95] list index out of range
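
Both frames of the traceback fail on the same line, `version_info[0]["model_file_location"]` in `update_cache_status`, which suggests the version-info lookup returns an empty list for a custom registered image model. Below is a minimal sketch of that failing pattern together with a defensive variant; the helper name and signature are hypothetical simplifications, not Xinference's actual API:

```python
from typing import Optional


def model_file_location(version_info: list) -> Optional[str]:
    """Hypothetical defensive rewrite of the failing line: return None
    instead of raising IndexError when no version info exists, as appears
    to happen for custom registered image models."""
    if not version_info:  # custom models: no builtin version entries
        return None
    return version_info[0]["model_file_location"]


# Built-in model: version info is present, indexing works.
assert model_file_location(
    [{"model_file_location": "/root/.xinference/cache/sdxl-turbo"}]
) == "/root/.xinference/cache/sdxl-turbo"

# Custom registered model: empty list. Evaluating version_info[0] here is
# exactly the IndexError from the traceback; the guard returns None instead.
assert model_file_location([]) is None
```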


**The error message `2024-06-12 10:59:42,324 xinference.core.worker 95 ERROR Failed to load model sdxl-turbo-self-1-0` is displayed.
I clearly registered the model under the name sdxl-turbo-self, so why does Xinference start sdxl-turbo-self-1-0? This is very strange.**
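
The `-1-0` suffix looks like a replica identifier rather than a different model: the supervisor traceback launches a `rep_model_uid`, and a per-replica uid of the form `{model_uid}-{replica}-{rep_id}` would produce exactly this name. A sketch of that assumed scheme (the helper below is an inference from the traceback, not necessarily Xinference's exact code):

```python
def build_replica_model_uid(model_uid: str, replica: int, rep_id: int) -> str:
    # Assumed naming scheme: append the total replica count and the
    # zero-based replica index to the registered model uid.
    return f"{model_uid}-{replica}-{rep_id}"


# With the default of one replica, the first (index 0) replica of
# "sdxl-turbo-self" would be launched under this uid:
assert build_replica_model_uid("sdxl-turbo-self", 1, 0) == "sdxl-turbo-self-1-0"
```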

3. Steps to reproduce:

- Download the sdxl-turbo model from Hugging Face to a local directory named `/home/llm/image-model/`.
- Run the following command to start the container: `docker run -d -v /home/llm/image-model:/root/models -p 9998:9997 --gpus all xprobe/xinference:v0.12.0 xinference-local -H 0.0.0.0`
- Open http://localhost:9998/ui in a browser.
- Click Register Model, select the IMAGE MODEL tab, enter the model name "sdxl-turbo-self" and the path "/root/models/sdxl-turbo", and click Register Model.
- Launch the model sdxl-turbo-self.
- The model fails to load with the error: `Server error: 5000 - [address=0.0.0.0:58812, pid=95] list index out of range`
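
Independent of the cache-status bug, it may be worth sanity-checking that the mounted directory is a diffusers-style weight layout before registering, since diffusers pipelines expect a `model_index.json` at the root of the model directory. A small local check (the path is the host-side directory from the steps above; the helper name is mine):

```python
from pathlib import Path


def looks_like_diffusers_dir(path: str) -> bool:
    """True if the directory contains the model_index.json that diffusers
    pipelines expect at the root of a model weight directory."""
    return (Path(path) / "model_index.json").is_file()


# e.g. looks_like_diffusers_dir("/home/llm/image-model/sdxl-turbo")
```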

### Expected behavior
The custom registered model sdxl-turbo-self starts successfully.

qinxuye commented 1 month ago

@amumu96 Could you help to see the issue?

majestichou commented 3 weeks ago

@amumu96 The error still exists with the sd3-medium model. I downloaded the sd3-medium weights (model id "stabilityai/stable-diffusion-3-medium"; I did not download "stabilityai/stable-diffusion-3-medium-diffusers") to the local machine, then used Xinference (v0.12.3, Docker container) to register the model as sd3-medium-self. When I started the custom model sd3-medium-self, the startup failed with the following error: `IndexError: [address=0.0.0.0:13144, pid=97] list index out of range`

Valdanitooooo commented 2 weeks ago

Same here

bilzeng commented 1 week ago

I also ran into this problem:

    2024-07-10 16:42:21,580 xinference.model.utils 100680 INFO     Model caching from URI: /data/model/AI-ModelScope/stable-diffusion-3-medium
    2024-07-10 16:42:21,581 xinference.model.utils 100680 INFO     cache /data/model/AI-ModelScope/stable-diffusion-3-medium exists
    2024-07-10 16:42:21,581 xinference.core.worker 100680 ERROR    Failed to load model stable-diffusion-3-medium-1-0
    Traceback (most recent call last):
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/worker.py", line 663, in launch_builtin_model
    await self.update_cache_status(model_name, model_description)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/worker.py", line 561, in update_cache_status
    model_path = version_info[0]["model_file_location"]
    IndexError: list index out of range
    2024-07-10 16:42:21,623 xinference.api.restful_api 100579 ERROR    [address=0.0.0.0:18173, pid=100680] list index out of range
    Traceback (most recent call last):
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/api/restful_api.py", line 822, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xoscar/backends/context.py", line 227, in send
    return self._process_result_message(result)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xoscar/backends/pool.py", line 659, in send
    result = await self._run_coro(message.message_id, coro)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
    return await coro
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
    File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
    File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
    File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
    File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/supervisor.py", line 871, in launch_builtin_model
    await _launch_model()
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/supervisor.py", line 835, in _launch_model
    await _launch_one_model(rep_model_uid)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/supervisor.py", line 816, in _launch_one_model
    await worker_ref.launch_builtin_model(
    File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
    File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/worker.py", line 663, in launch_builtin_model
    await self.update_cache_status(model_name, model_description)
    File "/data/conda/envs/xinference_env/lib/python3.8/site-packages/xinference/core/worker.py", line 561, in update_cache_status
    model_path = version_info[0]["model_file_location"]
    IndexError: [address=0.0.0.0:18173, pid=100680] list index out of range