huggingface
/
optimum-quanto
A pytorch quantization backend for optimum
Apache License 2.0
639
stars
34
forks
issues
Extension lifecycle
#223
dacorvo
closed
2 days ago
0
Refactor linear dispatch to use new torch kernels
#222
dacorvo
closed
2 days ago
2
Incompatibility with `torch.compile()`
#221
sanchit-gandhi
opened
3 days ago
1
VLLM Supported?
#220
RanchiZhao
opened
5 days ago
1
Add `torch.ops.aten._weight_int8pack_mm` for W8A16 inference
#219
dacorvo
closed
2 days ago
0
Use `torch.ops.aten._weight_int4pack_mm` for W4A16 inference
#218
dacorvo
opened
6 days ago
1
Inference from a reload quantized open clip model (by .load_state_dict) resulted in IndexError
#217
kechan
opened
6 days ago
2
Accuracy took a big hit with activation=qint8 for an open clip model
#216
kechan
opened
1 week ago
1
Should we stop using quanto without the optimum?
#215
kechan
opened
1 week ago
1
CUDA Kernel
#214
satabios
closed
6 days ago
1
KeyError: 'lm_head.weight_qtype' when loading the quanto model
#213
RanchiZhao
closed
1 week ago
5
RuntimeError: derivative for dequantize is not implemented
#212
Eugene29
closed
1 week ago
2
Enable security checks
#211
mfuntowicz
closed
2 weeks ago
0
Add owlv2 detection example
#210
dacorvo
closed
2 weeks ago
0
feat(cuda): compile according to capabilities
#209
dacorvo
closed
2 weeks ago
0
Only use optimized CUDA kernels if arch is at least sm80
#208
dacorvo
closed
3 weeks ago
0
quanto_cuda.so: cannot open shared object file: No such file or directory
#207
nuclear-missile
closed
3 weeks ago
0
Verify extension behaviour in google Colab
#206
dacorvo
opened
1 month ago
3
Convert quanto to optimum-quanto
#205
dacorvo
closed
1 month ago
0
docs: explicitly reference optimum
#204
dacorvo
closed
1 month ago
0
optimized kernel for quanto::dqmm not found
#203
kechan
opened
1 month ago
3
Quantized CLIPModel inference not noticeably faster (or even slower) than non quantized
#202
kechan
closed
1 month ago
8
Fix benchmark links
#201
SunMarc
closed
1 month ago
0
Update benchmark
#200
dacorvo
closed
1 month ago
0
Update README.md
#199
dacorvo
closed
1 month ago
0
Add latest AWQ CUDA fp16 int4 kernels
#198
dacorvo
closed
1 month ago
0
Prepare for gemm kernels
#197
dacorvo
closed
1 month ago
0
Silence ruff warnings
#196
dacorvo
closed
1 month ago
0
Clarify dispatch
#195
dacorvo
closed
1 month ago
0
Force a recompilation of the extensions when upgrading pytorch
#194
dacorvo
closed
2 days ago
1
Yet another tensor refactoring
#193
dacorvo
closed
1 month ago
0
Unable quantize a single linear layer: throws error: ValueError: Cannot quantize Tensor of shape torch.Size([1, 10]) along axis 0 of size 1
#192
rajat-008
closed
2 weeks ago
5
Announce migration to optimum
#191
dacorvo
closed
1 month ago
1
[Feature Request] INT16 🤗
#190
duanshengliu
closed
2 months ago
2
[Feature Request] FP6 🤗
#189
NicolasMejiaPetit
closed
3 weeks ago
2
ValueError: The model is quantized with QuantizationMethod.QUANTO and is not serializable - check out the warnings from the logger on the traceback to understand the reason why the quantized model is not serializable.
#188
gospacedev
closed
2 months ago
3
Avoid composite gradients in quantized linear function
#187
dacorvo
closed
2 months ago
0
Switch to ruff native formatter
#186
dacorvo
opened
2 months ago
4
build: fix pyproject.toml
#185
baggiponte
closed
1 month ago
5
Why the quantized net is slower?
#184
theguardsgod
closed
2 weeks ago
3
Got stuck when train resnet50 with QAT
#183
catsled
closed
1 month ago
7
Can I use quanto on AMD GPU?
#182
catsled
closed
1 month ago
4
feat(example): add quantize stablediffusion example
#181
thliang01
closed
2 months ago
1
Potential readme issue - falls back to original dtype, not fp32
#180
calmitchell617
closed
1 month ago
3
Small refactoring in groups
#179
dacorvo
closed
2 months ago
0
ci: allow manual stale bot
#178
dacorvo
closed
2 months ago
0
ci: add permissions to stale bot
#177
dacorvo
closed
2 months ago
0
1.58 bit quantization
#176
leo-gan
closed
1 month ago
1
AttributeError: 'str' object has no attribute 'detach'
#175
Gooddz1
closed
2 months ago
5
Add stale bot
#174
dacorvo
closed
2 months ago
0