Closed: jerryzh168 closed this pull request 2 weeks ago.
As of commit 166353f9a7c57a7357e2aac4bc2950a2b6253492 with merge base cae3d823cec4eb9ad781d9e589f1487e79c9286f:
* [.github/workflows/build.yml](https://hud.pytorch.org/pr/pytorch/ao/243#9104311941) ([gh](https://github.com/pytorch/ao/actions/runs/9104311941))
Great :) Let's move `AffineQuantizedTensor` into dtypes next and create a PyTorch-style conversion function? We should also not need to use `__torch_function__` to overwrite `linear`, but it makes sense to do that as a follow-up because it'll require us to add support for `detach`, `view`, `addmm`, etc. to `AffineQuantizedTensor`.
Sounds good. The main thing is `transpose`; we need to think about how to support that with the scales/zero_point and the `block_size` arg.
Summary: Currently we have an `input_quant_func` in `AffineQuantizedTensor`, which is a bit convoluted; we want to use a separate `LinearActAffineQuantizedTensor` subclass for activation quantization (dynamic quantization) instead.
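As background, affine quantization maps a float to an integer via `q = round(x / scale) + zero_point` and dequantizes via `(q - zero_point) * scale`; dynamic quantization just computes the qparams from the activation at runtime. A minimal pure-Python sketch with an illustrative int4-style range (helper names are not the torchao API):

```python
def choose_qparams(x, qmin=-8, qmax=7):
    """Pick an affine scale/zero_point covering the observed range (and 0)."""
    lo, hi = min(min(x), 0.0), max(max(x), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1e-8  # avoid division by zero for an all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-8, qmax=7):
    # clamp to the integer range after the affine mapping
    return [min(max(round(v / scale) + zero_point, qmin), qmax) for v in x]

def dequantize(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]
```

The round-trip error is bounded by `scale`, which is why choosing qparams per block (`block_size`) rather than per tensor tightens the approximation.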
Also added a dispatch for int8-activation / int8-weight dynamic quantization that calls the `int_scaled_matmul` kernel in the end.

Test Plan:
python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_8da4w
python test/quantization/test_quant_api.py -k test_quantized_tensor_subclass_int8_dyn_quant
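The int8-dynamic path can be sketched as follows. This standalone NumPy version (with a made-up helper name) shows the math; the final integer matmul plus rescale is the step a fused `int_scaled_matmul`-style kernel performs in one pass:

```python
import numpy as np

def dynamic_int8_linear(x, w_int8, w_scale):
    """Illustrative int8-act / int8-weight dynamic linear, not the torchao kernel."""
    # dynamic quantization: per-row symmetric scale computed from the activation
    x_scale = np.maximum(np.abs(x).max(axis=1, keepdims=True), 1e-8) / 127.0
    x_int8 = np.clip(np.round(x / x_scale), -127, 127).astype(np.int8)
    # integer matmul accumulated in int32, then a single rescale at the end
    acc = x_int8.astype(np.int32) @ w_int8.astype(np.int32).T
    return acc.astype(np.float32) * x_scale * w_scale  # w_scale: per output channel
```

Accumulating in int32 before the float rescale is what makes the integer path both fast and numerically safe against int8 overflow.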