PenghuiCheng opened this issue 3 months ago
@EikanWang should evaluate the right way to enable quantization for the XPU backend. Even though CUDA has this dispatch key, and our philosophy is to align with CUDA, QuantizedXPU is used by PyTorch's legacy quantization solution, which we will not follow up on. A Tensor with the QuantizedXPU dispatch key carries the quantization information inside the Tensor itself. Other quantization solutions do not need such a Tensor representation: the scale and shift live in separate Tensors, which the operator API or the graph introduces. So let me lower the priority of this issue.
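For illustration, here is a minimal sketch of the separate-tensor representation described above. This is not PyTorch's actual API; the `quantize`/`dequantize` helpers and their parameters are hypothetical:

```python
import torch

# Hypothetical sketch: quantized values, scale, and shift (zero point) are
# plain Tensors carried alongside each other. No quantized dtype and no
# QuantizedXPU dispatch key are involved, so this representation works on
# any backend.
def quantize(x: torch.Tensor, scale: torch.Tensor, shift: torch.Tensor) -> torch.Tensor:
    # Map float values into the int8 range; scale/shift stay outside the tensor.
    return torch.clamp(torch.round(x / scale + shift), -128, 127).to(torch.int8)

def dequantize(q: torch.Tensor, scale: torch.Tensor, shift: torch.Tensor) -> torch.Tensor:
    return (q.to(torch.float32) - shift) * scale

x = torch.randn(4, 4)          # device="xpu" in an XPU-enabled build
scale = torch.tensor(0.1)
shift = torch.tensor(0.0)
q = quantize(x, scale, shift)  # ordinary int8 Tensor, no special dispatch key
y = dequantize(q, scale, shift)
```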
🚀 The feature, motivation and pitch
Quantization support is needed for the XPU backend. Quantized operations on XPU currently fail with: NotImplementedError: Could not run 'aten::_empty_affine_quantized' with arguments from the 'QuantizedXPU' backend.
Failing cases:
- test_view_ops_xpu.py::TestOldViewOpsXPU::test_flatten_xpu
- test_view_ops_xpu.py::TestOldViewOpsXPU::test_ravel_xpu
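A minimal repro sketch, assuming a PyTorch build with XPU support. The failing tests may reach aten::_empty_affine_quantized through a different call path, but constructing an affine-quantized tensor directly on the xpu device exercises the same missing kernel (shapes and dtype below are illustrative):

```python
import torch

# Allocating an affine-quantized tensor on 'xpu' dispatches to the
# QuantizedXPU backend; no kernel is registered there for
# aten::_empty_affine_quantized, which produces the NotImplementedError above.
q = torch._empty_affine_quantized(
    (4, 4), scale=0.1, zero_point=0, dtype=torch.qint8, device="xpu"
)
```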
Alternatives
No response
Additional context
No response