-
### System Info
transformers 4.43.3, python 3.10, linux
### Who can help?
@ArthurZucker
### Information
- [ ] The official example scripts
- [X] My own modified scripts
### Tasks
- [ ] An offi…
-
### System Info
Ubuntu v24, started via Docker
2080ti*4
cuda12.6
Driver Version: 560.31.02
### Running Xinference with Docker?
- [X] docker
- [ ] pip install …
-
With version 2.4.1, the PostTrainingQuantConfig below produces fp32 ops for the NPU. Models with int8 and fp16 ops would be preferred on the NPU.
```python
conf = PostTrainingQuantConfig(quant_level='auto',
                               device='n…
```
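For reference, here is a minimal Neural Compressor 2.x post-training-quantization sketch with a toy model; the device string and tuner settings are assumptions, and whether int8/fp16 kernels are actually emitted still depends on the backend's support for the target NPU.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

# Toy FP32 model and calibration data, only to make the flow runnable.
model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4))
calib_loader = DataLoader(TensorDataset(torch.randn(64, 16), torch.zeros(64, dtype=torch.long)),
                          batch_size=8)

conf = PostTrainingQuantConfig(
    quant_level="auto",   # let the tuner pick per-op precision
    device="npu",         # assumption: use whatever device string your backend expects
)

q_model = fit(model=model, conf=conf, calib_dataloader=calib_loader)
q_model.save("./quantized_model")
```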
-
**Describe the bug**
I am using the `quant_with_alpaca.py` script to quantize MaziyarPanahi/Llama-3-70B-Instruct-32k-v0.1 with the following command:
```python
python quant_with_alpaca.p…
```
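The script roughly corresponds to the usual AutoGPTQ flow; a hedged sketch of that flow is below. The 4-bit settings, output path, and single calibration example are illustrative assumptions rather than the values the command above passes in (the real script builds its calibration set from Alpaca).

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "MaziyarPanahi/Llama-3-70B-Instruct-32k-v0.1"
quantized_dir = "./llama-3-70b-instruct-32k-gptq"   # hypothetical output path

tokenizer = AutoTokenizer.from_pretrained(pretrained, use_fast=True)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)  # illustrative settings

model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)

# One toy calibration example; the real script uses many Alpaca prompts.
examples = [tokenizer("auto-gptq quantization calibration sample.")]

model.quantize(examples)
model.save_quantized(quantized_dir, use_safetensors=True)
tokenizer.save_pretrained(quantized_dir)
```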
-
# Todo
https://github.com/OCA/maintainer-tools/wiki/Migration-to-version-17.0
# Modules to migrate
- [x] delivery_procurement_group_carrier - By @peluko00 - #1570
- [x] purchase_stock_picki…
-
```
from transformers import Qwen2VLForConditionalGeneration, AutoTokenizer, AutoProcessor
from qwen_vl_utils import process_vision_info
from transformers import Qwen2VLProcessor
from awq.models.q…
```
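The import list above is cut off; for context, the standard text-only AutoAWQ quantization flow looks roughly like the sketch below. The checkpoint path and quant settings are assumptions, and Qwen2-VL's vision inputs need a multimodal-aware calibration path that this sketch does not show.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen2-VL-7B-Instruct"   # assumption: replace with your checkpoint
quant_path = "./qwen2-vl-7b-awq"

# Typical 4-bit AWQ settings (illustrative, not taken from the snippet above).
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```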
-
```
Some parameters are on the meta device device because they were offloaded to the cpu.
Quantizing weights: 0%| | 0/1771 [00:00
```
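That warning comes from accelerate offloading part of the model to CPU. A quick, hedged way to see which modules were offloaded before starting quantization, assuming the model was loaded with `device_map="auto"` (the model ID below is a placeholder), is:

```python
from collections import Counter
from transformers import AutoModelForCausalLM

# Load with automatic placement; any module mapped to "cpu" or "disk" was offloaded.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",   # placeholder model ID
    device_map="auto",
)

print(Counter(model.hf_device_map.values()))   # e.g. Counter({0: 30, 'cpu': 5})
offloaded = [name for name, dev in model.hf_device_map.items() if dev in ("cpu", "disk")]
print("offloaded modules:", offloaded)
```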
-
When I use AIMET AutoQuant to quantize my model, I hit the following issue:
- Prepare Model
```
Traceback (most recent call last):
  File "/workspace/aimet/build/staging/universal/lib/python/aimet_torch/…
```
-
System: Ubuntu 20.01. Paddle and PaddleSlim are both dev builds, with CUDA 11.6 and cuDNN 8.4; according to the official documentation this combination is compatible, but running auto compression still reports a version mismatch. What is going on?
2023-06-01 15:00:54,113-INFO: devices: gpu
2023-06-01 15:01:03,250-INFO: Selected strategies: ['qat_dis']
…
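As a first sanity check (a sketch, not taken from the issue), printing the versions PaddlePaddle and PaddleSlim report, plus the CUDA/cuDNN versions the installed Paddle wheel was built against, shows whether the dev wheels actually match the local CUDA 11.6 / cuDNN 8.4 setup:

```python
import paddle
import paddleslim

print("paddle:", paddle.__version__)
print("paddleslim:", paddleslim.__version__)
# CUDA / cuDNN versions the installed Paddle wheel was compiled against.
print("built with CUDA:", paddle.version.cuda())
print("built with cuDNN:", paddle.version.cudnn())
# End-to-end check that Paddle can actually see and use the GPUs.
paddle.utils.run_check()
```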
-
Hi guys!
Love your work, but I recently bought another GPU for the Llama 3 release and hit a wall.
**Description**
Like a lot of small-wallet devs, I have dual RTX 3090s.
I was expecting to u…
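Not from the issue, but before digging further it is worth confirming that both 3090s are visible to PyTorch; a small check such as the following (plain torch, no framework assumptions) does that:

```python
import torch

print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())   # should report 2 for dual RTX 3090s
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"cuda:{i} -> {props.name}, {props.total_memory / 1024**3:.1f} GiB")
```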