-
Hello! First of all, great job with this inference engine! Thanks a lot for your work!
Here's my issue: I have run vLLM with both a Mistral Instruct model and its AWQ-quantized version. I've quant…
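For reference, the kind of call I mean, sketched with vLLM's offline API (the AWQ checkpoint name is only an example; any AWQ-quantized Mistral Instruct model fits here):

```python
# Minimal sketch: loading an AWQ-quantized Mistral Instruct checkpoint with
# vLLM's offline API. The model id is only an example.
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
outputs = llm.generate(["[INST] Hello! [/INST]"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```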
-
I'm trying to follow [this](https://github.com/mit-han-lab/llm-awq#install) to install AWQ,
but I failed at step 3.
## My Env
```
OS: Windows 11
GPU: NVIDIA GeForce RTX4060
Driver Version: 536.4…
```
-
When I quantized the Qwen2.5-1.5B-Instruct model according to **"Quantizing the GGUF with AWQ Scale"** in the [docs](https://qwen.readthedocs.io/en/latest/quantization/llama.cpp.html), it showed that th…
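For concreteness, the scale step as I understand it from that doc, sketched with the AutoAWQ API (argument names follow AutoAWQ and may differ across versions):

```python
# Sketch of the "AWQ scale" step from the Qwen docs, using AutoAWQ.
# export_compatible=True applies the AWQ scales to the weights without
# packing them into INT4, so the saved model can still be converted to GGUF.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2.5-1.5B-Instruct"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

model.quantize(tokenizer, quant_config=quant_config, export_compatible=True)
model.save_quantized("Qwen2.5-1.5B-Instruct-awq-scaled")
tokenizer.save_pretrained("Qwen2.5-1.5B-Instruct-awq-scaled")
```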
-
I have faced an error with the vLLM framework when I tried to run inference on an Unsloth fine-tuned Llama-3-8B model...
### Error:
```
(venv) ubuntu@ip-192-168-68-10:~/ans/vllm-server$ python -O -u -m vl…
```
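Assuming the truncated command above is the launch of vLLM's OpenAI-compatible API server, a minimal client call against it would look like the sketch below (base URL and served model name are placeholders):

```python
# Minimal client sketch against a vLLM OpenAI-compatible server.
# The base_url and model name are placeholders for whatever the server
# in the command above was actually started with.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="unsloth-llama3-8b-finetune",  # placeholder served model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```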
-
I'm trying to quantize llava-1.5 according to the `readme.md` with the following scripts, and it reports: `AttributeError: 'LlavaConfig' object has no attribute 'mm_vision_tower'`.
It seems like the llava…
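For illustration, inspecting an HF-format checkpoint shows the mismatch: the Hugging Face `LlavaConfig` nests the vision tower under `vision_config` instead of exposing the original repo's `mm_vision_tower` field that the script is reading (the model id below is only an example):

```python
# Sketch: inspect which config format a llava-1.5 checkpoint uses.
# The model id is only an example of an HF-format checkpoint.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("llava-hf/llava-1.5-7b-hf")
print(type(cfg).__name__)               # LlavaConfig
print(hasattr(cfg, "mm_vision_tower"))  # False: HF configs nest this info
print(cfg.vision_config.model_type)     # "clip_vision_model"
```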
-
Hi
I'm trying to do inference on an AWQ-quantized model and I'm constantly getting this error when trying to generate text.
I'm using Qwen2.5-72B-Instruct-AWQ.
Some code to give context:
```
sel…
```
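Since the snippet above is cut off, here is a minimal stand-alone sketch that generates from the same checkpoint with plain transformers (it needs the autoawq package installed; the device settings are just assumptions):

```python
# Minimal stand-alone sketch: greedy generation from the AWQ checkpoint
# with plain transformers (requires the autoawq package to be installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-72B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```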
-
### System Info
GPU: 4090
TensorRT: 10.3
tensorrt-llm: 0.13.0.dev2024081300
### Who can help?
@Tracin Could you please have a look? Thank you very much.
### Information
- [ ] The official example sc…
-
### 🚀 The feature, motivation and pitch
Is the deepseek-v2 AWQ version supported now? When I run it, I get the following error:
```
[rank0]: File "/usr/local/lib/python3.9/dist-packages/vllm/mo…
```
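For reference, a minimal sketch of the kind of load call that hits this code path, using vLLM's offline API (the checkpoint path and tensor_parallel_size are placeholders):

```python
# Sketch of loading a DeepSeek-V2 AWQ checkpoint with vLLM's offline API.
# The model path and tensor_parallel_size are placeholders.
from vllm import LLM

llm = LLM(
    model="/path/to/DeepSeek-V2-Chat-AWQ",
    quantization="awq",
    trust_remote_code=True,
    tensor_parallel_size=8,
)
```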
-
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], [line 2](vscode-notebook-cell:?exe…
```
-
### System Info
- Ubuntu 20.04
- NVIDIA A100
### Who can help?
@Tracin @kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] A…