-
How do I run the ChatGPT plugin locally? Does it work, or is it still being developed?
-
After running the code below, is there an API (maybe something like `llm.terminate`) to kill the LLM and release the GPU memory?
```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The pres…
-
### Prerequisites
- [X] I have searched the [issues](https://github.com/open-compass/opencompass/issues/) and [discussions](https://github.com/open-compass/opencompass/discussions) but did not get the help I expected.
- [X] The bug is present in the [latest version](https://github.com/open-com…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
I am facing difficulties specifying GPU usage for different models in an LLM inference pipeline using vLLM. Specifically, I have 4 RTX 4090 GPUs available, and I aim to run an LLM with a size of 42GB …
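A quick back-of-the-envelope check (illustrative arithmetic only, not a vLLM API) of how many 24 GB RTX 4090s are needed just to hold 42 GB of weights, which maps to vLLM's `tensor_parallel_size`:

```python
import math

def min_tensor_parallel(model_gb: float, gpu_gb: float,
                        usable_frac: float = 0.9) -> int:
    """Smallest GPU count whose combined usable memory holds the weights.

    usable_frac leaves headroom for KV cache and activations (an assumed
    rule of thumb, not a vLLM constant).
    """
    return math.ceil(model_gb / (gpu_gb * usable_frac))

# 42 GB of weights on 24 GB cards: each card usably holds ~21.6 GB,
# so at least two GPUs are needed.
print(min_tensor_parallel(42, 24))  # -> 2
```

In vLLM this would translate to constructing the engine with `tensor_parallel_size=2`, restricting each engine process to its own GPUs via `CUDA_VISIBLE_DEVICES`, and leaving the remaining GPUs free for other models.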
-
![image](https://github.com/InternLM/xtuner/assets/145842232/83f12831-573f-4a42-8f19-905e8a5d57e6)
How do I solve this problem? The error is shown above, and the config is attached below.
# Copyri…
-
nproc_per_node=4
CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=$nproc_per_node \
swift sft \
--model_id_or_path "AI-ModelScope/llava-v1.6-mistral-7b" \
--template_type "llava-mistral-inst…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
pass
### Reproduction
```
CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" accelerate launch \
--config_fil…
-
### Your current environment
```text
GPU 0: NVIDIA H100 80GB HBM3
GPU 1: NVIDIA H100 80GB HBM3
GPU 2: NVIDIA H100 80GB HBM3
GPU 3: NVIDIA H100 80GB HBM3
GPU 4: NVIDIA H100 80GB HBM3
GPU 5: NV…
-
### Your current environment
Environment:
torch 2.3.0
vllm 0.5.0.post1
transformers 4.41.2
Main error:
A smaller MoE model, '/data/models/qwen/qwen1.5-2.7Bmoe', runs without issues;
larger ones raise the error shown at the bottom.
Code:
from vllm.engine.arg_ut…