huggingface optimum-quanto issues

huggingface / optimum-quanto

A PyTorch quantization backend for Optimum

Apache License 2.0

833 stars, 61 forks

issues

Add repr for QuantizedTransformersModel

#357 imba-tjd closed 13 hours ago
0
Clarification on Per-Channel vs. Per-Tensor Quantization for Weights and Activations

#356 kirkdort44 opened 1 day ago
0
QuantizedModelForCausalLM lost repr, cannot see model structure

#355 imba-tjd closed 13 hours ago
1
reload quantized model after saving

#354 lucienfostier opened 5 days ago
1
Is it GPU Compatability Issue?

#353 kamrul-NSL opened 1 week ago
0
task not support: image-text-to-text.

#352 fangyangci opened 1 week ago
0
Its not an issue! Quantization Granularity Question

#351 ClaraLovesFunk opened 1 week ago
0
enable qbitstensor test on xpu

#350 dacorvo closed 1 week ago
0
[tests] enable testing for xpu (rebased)

#349 dacorvo closed 1 week ago
0
Quantized Flux pipeline to cuda?

#348 squewel opened 2 weeks ago
1
Weights Still in FP32 after Quantization

#347 ClaraLovesFunk opened 2 weeks ago
4
How to support activation 4bit quantization?

#346 Ther-nullptr opened 3 weeks ago
1
[tests] enable `test_weight_qbits_tensor_linear_cuda` on xpu devices

#345 faaany closed 1 week ago
1
[tests] enable testing for xpu

#344 faaany closed 1 week ago
5
Only random noise is generated with Flux + LoRA with optimum-quanto >= 0.2.5

#343 nelapetrzelkova opened 3 weeks ago
2
support

#342 werruww closed 3 weeks ago
0
Support QLayerNorm without weights

#341 dacorvo closed 3 weeks ago
0
Inference Speed Slowdown with Static Quantization

#340 ClaraLovesFunk opened 3 weeks ago
2
Error when quantizing flux (already fixed in master)

#339 samedii closed 3 weeks ago
1
fix: use reshape instead of view

#338 dacorvo closed 1 month ago
0
when I use quanto lib to quantize the flux model, the activation layer's quantization on pipe.transformer failed

#337 DaaadShot closed 1 month ago
1
Will module output not be quantized when the model is directly trained after Calibration?

#336 tusiqi1 closed 5 days ago
7
LayerNorm with None weight throws exception

#335 doctorpangloss closed 3 weeks ago
2
Switched linters, black -> ruff

#334 ishandeva closed 1 month ago
2
Add marlin int4 kernel

#333 dacorvo closed 1 month ago
0
Corrupted outputs with Marlin int4 kernels as parallelization increases

#332 dacorvo opened 1 month ago
4
optimum-quanto 0.25 requires ninja but 'pip check flux' reports 'ninja-1.11.1.1 is not supported on this platform'

#331 Davros666 closed 2 weeks ago
2
Add hip support

#330 dacorvo closed 1 month ago
0
Refactor extensions

#329 dacorvo closed 1 month ago
0
Remove overheads in library

#328 dacorvo closed 1 month ago
0
issues with non-contiguous Tensor

#327 bghira closed 1 month ago
1
Fix lumina

#326 dacorvo closed 1 month ago
0
Fix missing call in QuantizedTransformersModel

#325 dacorvo closed 1 month ago
0
Do the Linear layers quantized with W8A8 include output quantization layers?

#324 chenghuaWang closed 1 month ago
9
refactor(library): reduce overhead in marlin op

#323 dacorvo closed 1 month ago
0
mps low-bit kernels from torchao

#322 bghira closed 3 weeks ago
2
Ci move

#321 glegendre01 closed 1 month ago
0
Stricter optimized tensor tests

#320 dacorvo closed 2 months ago
0
Accuracy issue when using torch._int_mm on AMD CPUs

#319 dacorvo opened 2 months ago
5
chore: minimal python version is 3.9

#318 dacorvo closed 2 months ago
0
Refactor AWQ gemm

#317 dacorvo closed 2 months ago
0
More refactoring

#316 dacorvo closed 2 months ago
0
Add marlin int4 kernel

#315 dacorvo closed 1 month ago
1
Refactor QBitsTensor subclasses

#314 dacorvo closed 2 months ago
0
Does AWQ is officially supported now?

#313 lifelongeeek closed 1 month ago
5
qint4 failed for diffusers: QBitsTensor cannot be changed

#312 liyihao1230 opened 2 months ago
2
quantize(model, weights=qint4, activations=qint8) produce weights with dtype = torch.uint8

#311 lifelongeeek closed 2 months ago
2
feat: e4m3fnuz added

#310 dacorvo closed 2 months ago
0
fix(library): disable int_mm for CPU

#309 dacorvo closed 2 months ago
7
feat(examples): add image classification example using quantized vit …

#308 shovan777 closed 2 months ago
2