### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
### Describe the bug
model_source: hf_model
WAR…
-
### Class
Large language model
### Feature Request
I am running directly from the Docker image, but I found that both the ChatGLM and Llama 2 models run on only one GPU. My machine has 4× A5000, and the other three cards sit idle. Does this project support multi-GPU parallelism, so that I can use all four cards or run larger models?
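For reference, one general way to shard a checkpoint across several cards (outside of any project-specific setting) is Hugging Face transformers with accelerate: `device_map="auto"` spreads the weights over all visible GPUs, and an explicit `max_memory` map caps per-card usage. A minimal sketch, assuming transformers/accelerate are installed in the image; `make_max_memory` is a hypothetical helper, not a project API:

```python
# Hedged sketch: shard one model across several GPUs via transformers +
# accelerate. `make_max_memory` is a hypothetical convenience helper.

def make_max_memory(num_gpus: int, per_gpu_gib: int) -> dict:
    """Map each GPU index to a memory-cap string accepted by accelerate."""
    return {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}

# For 4x A5000 (24 GiB each), cap below the full card to leave room
# for activations and the KV cache:
max_memory = make_max_memory(4, 20)

# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-hf",   # assumed model id
#     device_map="auto",            # shard across all visible GPUs
#     max_memory=max_memory,
# )
```

With the cap in place, accelerate fills each card up to its limit before moving to the next, so a model too large for one 24 GB card can still load across four.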
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 11 (bullseye) (x86…
```
-
I configured the whole environment exactly as the tutorial describes, but it errors out as soon as I run it.
The model I converted offline afterwards fails with the same error.
(lmdeploy) root@intern-studio:~# lmdeploy chat turbomind /share/temp/model_repos/internlm-chat-7b/ --model-name internlm-chat-7b
model_source: hf_model
…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
### Describe the bug
Segmentation fault whe…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related iss…
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
### Describe the bug
lmdeploy serve api_server --s…
-
Does Llama 3 support inference and fine-tuning on multi-GPU machines? Could you please add some sample code for a single machine with multiple cards?
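As a hedged sketch (not project-endorsed code): transformers accepts either `device_map="auto"` or an explicit module-to-device dict, so a naive even split of decoder layers across cards can be built by hand. Module names below follow the usual Llama-style layout (`model.layers.N`, `model.embed_tokens`, `lm_head`); the helper is mine, not a library function:

```python
# Hedged sketch: build a manual device_map spreading decoder layers
# evenly over the available GPUs (Llama-style module names assumed).

def layer_device_map(num_layers: int, num_gpus: int) -> dict:
    """Assign each decoder layer to a GPU index, filling cards in order."""
    per_gpu = -(-num_layers // num_gpus)  # ceiling division
    dmap = {f"model.layers.{i}": i // per_gpu for i in range(num_layers)}
    # Embeddings on the first card; final norm and head on the last:
    dmap["model.embed_tokens"] = 0
    dmap["model.norm"] = num_gpus - 1
    dmap["lm_head"] = num_gpus - 1
    return dmap

# Llama-3-8B has 32 decoder layers; spread them over 4 cards:
dmap = layer_device_map(32, 4)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map=dmap)
```

In practice `device_map="auto"` is usually enough for inference; a manual map like this is mainly useful when one card must stay lightly loaded.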
-
With the official 7B checkpoint, a single 24 GB RTX card cannot run it and reports an OOM error. Specifying a card index has no effect; it still only occupies GPU 0. How should I run inference so that it works?
```python
import torch
from transformers import AutoModel, AutoTokenizer
torch.set_grad_enabled(False)
ckpt_path='/home/my/…
```
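One common reason the card selection "has no effect" is that `CUDA_VISIBLE_DEVICES` is set after torch has already initialized CUDA, at which point it is silently ignored. A minimal sketch (the helper is hypothetical, not part of the model's API): pin the card before importing torch, and load in half precision to fit 24 GB:

```python
import os

def select_gpus(indices):
    """Pin this process to the given physical GPU indices.

    Must run before any framework (torch, tensorflow) initializes CUDA,
    otherwise the setting is silently ignored."""
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)
    return os.environ["CUDA_VISIBLE_DEVICES"]

select_gpus([1])  # the framework now sees physical GPU 1 as cuda:0

# import torch
# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     ckpt_path, trust_remote_code=True,
#     torch_dtype=torch.float16,  # half precision halves the weight memory
# ).cuda()
```

The same effect can be had from the shell with `CUDA_VISIBLE_DEVICES=1 python script.py`, which avoids the import-order trap entirely.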
-
Reproducing per the official tutorial, grad_norm becomes nan during fine-tuning.
The parameter configuration is as follows:
```python
# Model
pretrained_model_name_or_path = 'internlm/internlm2-chat-7b'
use_varlen_attn = False
# Data
data_path = 'data'
prompt_template = PROMPT_TEMPLAT…
```
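When grad_norm turns nan, the usual suspects are a too-high learning rate, fp16 overflow (bf16 is safer on cards that support it), or a corrupt training sample; a quick way to localize it is to test gradients for non-finite values before taking the norm. A minimal pure-Python sketch of that check (hypothetical helper, not an xtuner API):

```python
import math

def finite_grad_norm(grads):
    """Return the global L2 norm of flattened gradient values,
    or None if any value is nan/inf (which would poison the norm)."""
    total = 0.0
    for g in grads:
        if not math.isfinite(g):
            return None
        total += g * g
    return math.sqrt(total)

print(finite_grad_norm([3.0, 4.0]))           # 5.0
print(finite_grad_norm([1.0, float("nan")]))  # None -> skip or rescale this step
```

In a real training loop the same idea is applied per parameter tensor (e.g. `torch.isfinite(p.grad).all()`), which reveals whether the nan originates in one layer or appears everywhere at once.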