-
When I try to evaluate the quantized AWQ models using the video evaluation script, I get a `FileNotFoundError`:
```
FileNotFoundError: No such file or directory: "/hfhub/hub/models--Efficient-La…
-
Just FYI: I think the `autoawq` library only supports up to a certain version of the `torch` library, according to this message, which I received after (1) installing `autoawq` and (2) …
-
When running `example.py`, I hit:
```
RuntimeError: CUDA error: no kernel image is available for execution on the device (at /home/ubuntu/nunchaku/src/kernels/awq/gemv_awq.cu:311)
```
I'm on a Lambda A100 GPU insta…
-
### Describe the issue
I am trying to enable AWQ support with the IPEX repo on CPU.
The IPEX 2.5.0 [release](https://github.com/intel/intel-extension-for-pytorch/releases) states that it has the supp…
-
### Checklist
- [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.…
-
Hello
The module `awq_ext` can't be loaded:
![image](https://github.com/user-attachments/assets/449d25dd-064a-4220-8529-2d237d23e0d3)
I'm using *CUDA 11.8*; it works with `transformers`, `uns…
-
If I am not mistaken, the AWQ implementation in ammo uses a default alpha_step = 0.1 to search the parameter. However, the model quantized by ammo has a larger performance reduction than [AWQ](https://g…
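For context, the alpha search in AWQ is a simple grid search over a scaling exponent in [0, 1], so a coarse `alpha_step` can step over the best scale. A minimal sketch of the idea, assuming a generic per-candidate loss function (`loss_fn` and the helper name here are illustrative, not ammo's actual API):

```python
def search_alpha(loss_fn, alpha_step=0.1):
    """Grid-search the AWQ scaling exponent alpha over [0, 1].

    A smaller alpha_step (e.g. 0.05 or 0.01) gives a finer grid and
    may recover part of the accuracy gap described above, at the cost
    of proportionally more search-time forward passes.
    """
    n = int(round(1.0 / alpha_step))
    candidates = [i * alpha_step for i in range(n + 1)]
    # Keep the alpha whose quantization loss is smallest.
    return min(candidates, key=loss_fn)

# Toy quantization-error proxy minimized near alpha = 0.42; a 0.1 grid
# can only land on 0.4, illustrating the resolution limit of the search.
best = search_alpha(lambda a: (a - 0.42) ** 2, alpha_step=0.1)
```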
-
### System Info
Ubuntu 20.04
NVIDIA A100
nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3 and 24.07
TensorRT-LLM v0.14.0 and v0.11.0
### Who can help?
@Tracin
### Information
- [x] The offici…
-
Hi,
I tried both the qwen2-vl-7b bf16 and AWQ variants, and honestly I'm not seeing any speed improvement.
The AWQ model is ~6GB, yet after loading in vLLM it ends up taking the same amount of vRAM eventually (~22G…
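One likely explanation (an assumption about the setup, not confirmed from the report): vLLM preallocates a fixed fraction of total GPU memory for weights plus KV-cache blocks, controlled by `gpu_memory_utilization` (default 0.9), so the resident vRAM reflects that budget rather than the quantized model's size. A rough back-of-the-envelope:

```python
# vLLM reserves roughly gpu_memory_utilization * total VRAM up front;
# whatever the weights don't use is filled with KV-cache blocks.
total_vram_gb = 24.0            # assumed card size, for illustration only
gpu_memory_utilization = 0.9    # vLLM's default
reserved_gb = total_vram_gb * gpu_memory_utilization
# -> ~21.6 GB resident whether the weights are ~6 GB (AWQ) or ~15 GB (bf16)
```

Lowering `gpu_memory_utilization` when constructing the engine (e.g. `LLM(model=..., gpu_memory_utilization=0.5)`) shrinks the footprint, at the cost of KV-cache capacity and therefore batch size/throughput.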
-
### System Info
x86_64, Debian 11, L4 GPU
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supporte…