-
### Your current environment
```text
The output of `python env.py`
```
### How did you install Aphrodite?
When using pre-conversion, I ran into the following error:
ValueError: 17 is not a valid GGM…
-
Hello,
I shared a folder with another user, but when I tried to stop that share using "regen the share link", the share link still remains for that user. How…
ilood updated 8 years ago
-
## Please check out [Announcing Llama 3.1 Support in vLLM](https://blog.vllm.ai/2024/07/23/llama31.html) ##
* Chunked prefill is turned on for all Llama 3.1 models. However, it is currently incompat…
-
Hello, following your blog I compiled the ffmpeg.so library, but even though the ffmpeg command line runs successfully, it still crashes with an error afterwards. I looked through the logs but couldn't see where the problem is; could you please advise?
**The log is as follows:**
```text
ffmpeg version 3.3 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 4.9 (GCC) 20140827 (prerelease)
c…
```
-
### Describe the Bug
### Problem description
Building from the develop branch, launching auto-compression on multiple GPUs for several inference models (EfficientNetB0, GhostNet_x1_0, MobileNetV1, etc.) fails while producing the quantized model with: IndexError: list index out of range
```
Traceback (most recent call last)…
-
I use AWQ to quantize Llama 2 70B-chat with:
```
CUDA_VISIBLE_DEVICES="1,2,3,4,5,6,7" python quantize_llama.py
```
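As a side note, `CUDA_VISIBLE_DEVICES` only controls which physical devices the process can see, and it must be set before any CUDA context is created. A minimal sketch of how the masking behaves (pure Python, no GPU required):

```python
import os

# Must be set before torch/CUDA initializes; here we mirror the launch command above.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2,3,4,5,6,7"

# Inside the process, frameworks renumber the visible devices from 0,
# so physical GPU 1 becomes cuda:0, GPU 2 becomes cuda:1, and so on.
visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(len(visible))  # 7 devices are exposed to the process
```

Setting the variable on the command line (as above) or at the very top of the script are equivalent, as long as it happens before the first CUDA call.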
The code of quantize_llama.py:
```
from awq import AutoAWQForCausalLM
from tr…
-
### System Info
Hi,
I tried using PPO with a Gemma model, but I get this error.
I think the issue is here [is_encoder_decoder](https://github.com/huggingface/trl/blob/e90e8d91d2265e484f229c45a5eb8982f…
-
Running quantize and save_quantized succeeds, but loading the model to generate raises: AssertionError: Marlin kernels are not installed. Please install AWQ compatible Marlin kernels from AutoAWQ_kernels. The lo…
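If the assertion message is taken at face value, installing the optional kernels package usually resolves it. A hedged sketch; the PyPI package name `autoawq-kernels` is my assumption based on the AutoAWQ_kernels repository the error refers to, so verify it against that repo's install instructions:

```shell
# Assumed package name; check the AutoAWQ_kernels repository for the exact wheel
# matching your CUDA and PyTorch versions.
pip install autoawq-kernels
```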
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
I am trying to fine-tune Llama 2 7B with QLoRA on 2 GPUs. From what I've read, SFTTrainer should support multiple GPUs just fine, but when I run this I see one GPU with high utilization and one with al…
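One common cause is launching with plain `python`: without a distributed launcher, no `LOCAL_RANK`/`WORLD_SIZE` environment is set, so the trainer falls back to a single process and the second GPU sits mostly idle. A minimal sketch of the check, assuming the usual torchrun/accelerate environment variables:

```python
import os

# torchrun / accelerate set these per process; a bare `python train.py` run does not.
local_rank = int(os.environ.get("LOCAL_RANK", -1))
world_size = int(os.environ.get("WORLD_SIZE", 1))
print(local_rank, world_size)  # -1 1 when launched without a distributed launcher
```

Launching with something like `accelerate launch --num_processes 2 train.py` (or `torchrun --nproc_per_node 2 train.py`) starts one process per GPU so data parallelism actually engages; both are sketches, so adapt to your setup.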