Closed Matrixxxxxxxx closed 2 months ago
Have the developers considered the issue above, or could it be caused by a mistake in my configuration? I followed the official README step by step for every installation, so it seems unlikely to be an environment-configuration problem.
Transformers moves too fast. Once the next generation of a model ships, the older model is usually no longer maintained, so the transformers version it requires stays frozen at an old release. This is hard to solve.
I think this can be addressed with Xinference distributed deployment: different workers run environments with different transformers versions, and when loading a model you can specify worker_ip to route it to a particular worker.
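A rough sketch of that workaround, assuming a distributed Xinference setup where each worker runs in its own conda env pinned to a different transformers version. The supervisor endpoint, worker IPs, and model-to-worker mapping below are placeholders, and `launch_model`'s `worker_ip` parameter is assumed to route the launch to that worker as described in the comment:

```python
# Map each model to the worker whose env carries a compatible transformers
# version (illustrative mapping based on the versions reported in this issue).
WORKER_FOR_MODEL = {
    "llama-3.1-instruct": "192.168.1.11",  # env with transformers==4.43.3
    "chatglm2": "192.168.1.12",            # env with transformers==4.41.2
}

def launch_on_worker(client, model_name):
    """Launch model_name pinned to its mapped worker via worker_ip.

    In a real deployment, `client` would be an xinference.client.Client,
    e.g. Client("http://192.168.1.10:9997") pointed at the supervisor;
    any object exposing launch_model(**kwargs) works here.
    """
    return client.launch_model(
        model_name=model_name,
        model_type="LLM",
        worker_ip=WORKER_FOR_MODEL[model_name],
    )
```

With this, switching between Llama 3.1 and ChatGLM-2 becomes a matter of which worker handles the launch, instead of reinstalling transformers in a single shared env.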
Closing for now; I don't think there is a solution on our side.
System Info
```
(xinference) ub@ub-OMEN-by-HP-Laptop-17-ck2xxx:~$ pip show torch
Name: torch
Version: 2.3.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/ub/miniconda3/envs/xinference/lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, auto_gptq, autoawq, autoawq_kernels, bitsandbytes, optimum, peft, sentence-transformers, timm, torchaudio, torchvision, xinference

(xinference) ub@ub-OMEN-by-HP-Laptop-17-ck2xxx:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

(xinference) ub@ub-OMEN-by-HP-Laptop-17-ck2xxx:~$ python --version
Python 3.10.9

(xinference) ub@ub-OMEN-by-HP-Laptop-17-ck2xxx:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:        22.04
Codename:       jammy
```
Running Xinference with Docker?
Version info
```
(xinference) ub@ub-OMEN-by-HP-Laptop-17-ck2xxx:~$ pip show xinference
Name: xinference
Version: 0.13.3
```
The command used to start Xinference
```
xinference-local --host 0.0.0.0 --port 9997
```
Reproduction
When deploying Llama 3.1-Instruct, inference requires the latest transformers release, version 4.43.3.
However, other large models, such as ChatGLM-2, need a different transformers version (testing shows that transformers==4.41.2 works for those models). If the versions do not match, Xinference raises a series of errors during inference.
So every time I want to switch to a different large model, I have to abort the project, reinstall the matching transformers version, and then restart it.
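A small pre-flight check can at least make the mismatch fail fast before a launch, instead of surfacing as opaque errors mid-inference. The sketch below is illustrative, not an official compatibility table: the two pins come from the versions reported in this issue (Llama 3.1-Instruct needing transformers>=4.43.3, ChatGLM-2 working on 4.41.2), and the model names are placeholders.

```python
# Sketch: verify the installed transformers version against a per-model
# requirement before launching, using packaging's version specifiers.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Pins taken from this report; extend as you test more models.
REQUIRED = {
    "llama-3.1-instruct": SpecifierSet(">=4.43.3"),
    "chatglm2": SpecifierSet("==4.41.2"),
}

def is_compatible(model_name: str, installed_version: str) -> bool:
    """Return True if the installed transformers version satisfies the
    model's known requirement, or if no requirement is recorded."""
    spec = REQUIRED.get(model_name)
    return spec is None or Version(installed_version) in spec
```

In practice `installed_version` would come from `importlib.metadata.version("transformers")`; a failing check tells you which env (or which worker, in a distributed setup) the model actually needs.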
Expected behavior
How can this issue be solved?