ROCm / AMDMIGraphX

AMD's graph optimization engine.
https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/
MIT License
181 stars, 82 forks

Remove `qlinear_reused` matcher and instead fuse MLIR `quant_dot` with base pointwise operators #3269

Closed by CharlieL7 1 month ago

CharlieL7 commented 1 month ago
pfultz2 commented 1 month ago

The problem is that we would now output fp16 instead of int8. We should try to re-enable this matcher. Of course, there is accuracy loss from quantization, but we would have the same issue if we quantized the bias. Perhaps a better choice of scales would improve the accuracy for these cases.
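To make the point about scales concrete, here is a minimal sketch of generic symmetric int8 quantization in plain Python. It is not MIGraphX or MLIR code: the helper names (`quantize_linear`, `dequantize_linear`, `max_roundtrip_error`) and the sample values are hypothetical, chosen only to show how the scale affects the round-trip error.

```python
def quantize_linear(values, scale, zero_point=0):
    """Map floats to int8: q = clamp(round(x / scale) + zero_point, -128, 127)."""
    out = []
    for x in values:
        q = round(x / scale) + zero_point
        out.append(max(-128, min(127, q)))
    return out


def dequantize_linear(qvalues, scale, zero_point=0):
    """Map int8 values back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]


def max_roundtrip_error(values, scale):
    """Largest absolute error after quantizing and dequantizing with `scale`."""
    restored = dequantize_linear(quantize_linear(values, scale), scale)
    return max(abs(x - r) for x, r in zip(values, restored))


# Hypothetical sample data, not taken from the issue.
activations = [-1.0, -0.26, 0.0, 0.11, 0.49, 1.0]

# Scale fitted to the observed range: max |x| / 127.
tight_scale = max(abs(x) for x in activations) / 127
# Scale sized for a range 8x wider than the data actually covers.
loose_scale = 8 * tight_scale

tight_error = max_roundtrip_error(activations, tight_scale)
loose_error = max_roundtrip_error(activations, loose_scale)

print("tight scale error:", tight_error)
print("loose scale error:", loose_error)
```

For these sample values, the scale fitted to the data range gives a smaller maximum round-trip error than the looser one, which is the kind of effect a better choice of scales would be exploiting.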

CharlieL7 commented 1 month ago

Closing. With MLIR's update, we do pass the verify accuracy check when testing with the program mentioned in #2949 and with a couple of different random seeds I tried.