-
Hi,
I'm trying to run inference on an AWQ-quantized model, and I constantly get this error when trying to generate text.
I'm using Qwen2.5-72B-Instruct-AWQ.
Some code to give context:
sel…
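Since the snippet above got cut off, here is a minimal sketch of the kind of call I mean, assuming the model is loaded through transformers; everything except the model ID is a placeholder, not my exact code:
```
# Sketch only: the original snippet is truncated, so this reconstructs a
# typical AWQ generation call with transformers. The prompt and generation
# arguments are placeholders, not the actual code from above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-72B-Instruct-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # the AWQ checkpoint carries its own quantization config
    device_map="auto",    # shard the 72B weights across available GPUs
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```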
-
### System Info
- Ubuntu 20.04
- NVIDIA A100
### Who can help?
@Tracin @kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] A…
-
### 🚀 The feature, motivation and pitch
Is the deepseek-v2 AWQ version supported now? When I run it, I get the following error:
```
[rank0]: File "/usr/local/lib/python3.9/dist-packages/vllm/mo…
```
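For reference, this is roughly how I am loading the model; a minimal sketch with vLLM's offline API, where the model path is a placeholder for the AWQ checkpoint:
```
# Minimal sketch of the load with vLLM's offline API. The model path is a
# placeholder for the AWQ checkpoint; DeepSeek-V2 requires trust_remote_code.
# Whether this AWQ path is supported at all is exactly the question.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Chat",  # placeholder: point at the AWQ checkpoint
    quantization="awq",
    trust_remote_code=True,
    tensor_parallel_size=8,
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```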
-
### System Info
CUDA 12.2, CentOS 7
### Running Xinference with Docker?
- [X] docker
- [ ] pip install
- [ ] installation from source …
-
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 2
…
```
-
## System Info
Intel(R) Xeon(R) Platinum 8468
NVIDIA H800-80G
TensorRT-LLM version 0.12.0
## Who can help?
@Tracin @byshiue
## Reproduction
I followed the official procedure for Llama 2 7B quantiza…
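For completeness, a sketch of the shape of the run. This assumes the high-level LLM API shipped with recent tensorrt_llm releases; the QuantConfig/QuantAlgo names are my assumption of the 0.12 API and may differ, and the official procedure can equally be driven through examples/quantization/quantize.py plus trtllm-build:
```
# Sketch, assuming tensorrt_llm's high-level LLM API; QuantConfig/QuantAlgo
# names are an assumption for the 0.12 release and may differ. The official
# CLI flow (examples/quantization/quantize.py + trtllm-build) is equivalent.
from tensorrt_llm import LLM, SamplingParams
from tensorrt_llm.llmapi import QuantConfig, QuantAlgo

quant_config = QuantConfig(quant_algo=QuantAlgo.W4A16_AWQ)  # 4-bit AWQ weights
llm = LLM(model="meta-llama/Llama-2-7b-hf", quant_config=quant_config)

for output in llm.generate(["Hello"], SamplingParams(max_tokens=32)):
    print(output.outputs[0].text)
```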
-
I deployed the BF16 and INT8 versions of Qwen2.5-Coder-32B-Instruct using vLLM (version 0.6.1) and evaluated them with OpenCompass. Surprisingly, BF16 underperformed INT8 on several metric…
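To rule out sampling noise before blaming the precision, one quick sanity check is to run both builds on identical prompts with greedy decoding. A sketch, where the model paths are placeholders for the two checkpoints (the INT8 build is assumed to be a GPTQ-Int8-style checkpoint that vLLM auto-detects):
```
# Sketch: run the BF16 and INT8 builds on identical prompts with greedy
# decoding so that any metric gap cannot be sampling noise. Paths are
# placeholders; vLLM auto-detects the quantization from the checkpoint.
# In practice, load each build in a separate process to free GPU memory.
from vllm import LLM, SamplingParams

prompts = ["Write a Python function that reverses a linked list."]
greedy = SamplingParams(temperature=0.0, max_tokens=256)

for path in ("Qwen/Qwen2.5-Coder-32B-Instruct",             # BF16
             "Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8"):  # INT8
    llm = LLM(model=path)
    for out in llm.generate(prompts, greedy):
        print(path, "->", out.outputs[0].text[:120])
    del llm
```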
-
Is there a way to make it work with AWQ models?
Output:
```
Fetching 14 files: 100%|███████████████████████████| 14/14 [00:00
```
-
### Your current environment
vllm==0.6.1
### How would you like to use vllm
Steps to reproduce
This happens with Qwen2.5-32B-Instruct-AWQ.
The problem can be reproduced with the following steps:
…
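Since the steps above are cut off, the minimal load below is the starting point; a sketch with vllm==0.6.1's offline API, where max_model_len is my placeholder and not part of the original steps:
```
# Sketch of the basic load for this checkpoint with vllm==0.6.1's offline
# API; the elided reproduction steps may add more than this, and
# max_model_len is a placeholder, not part of the original steps.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-32B-Instruct-AWQ",
    quantization="awq",
    max_model_len=4096,
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```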
-
Hello Functionary team,
I'm trying to use the Functionary-small-v3.2 AWQ version with vLLM for inference, but I'm encountering an error. The vLLM library doesn't seem to recognize the 'FunctionaryF…
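For context, the load is along these lines; a sketch where the repo id is a placeholder for the AWQ variant and trust_remote_code is my assumption:
```
# Sketch of the load that triggers the error. The repo id is a placeholder
# for the AWQ variant, and trust_remote_code is an assumption in case the
# checkpoint ships a custom config class.
from vllm import LLM

llm = LLM(
    model="meetkai/functionary-small-v3.2",  # placeholder: the AWQ variant's repo id
    quantization="awq",
    trust_remote_code=True,
)
```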