ROCm / AMDMIGraphX

AMD's graph optimization engine.
https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/
MIT License
183 stars, 84 forks

Introduce `--int4-weights` option in `migraphx-driver`. This would require changes in MIGraphX's naive quantizer to set the range to `[0, 15]`. During quantization, it should also insert "pack" and "unpack" instructions. #3341
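For illustration, here is a minimal sketch of what "pack" and "unpack" could mean for values in `[0, 15]`: two 4-bit values stored per byte. The function names `pack_int4` and `unpack_int4` are hypothetical, and the nibble order (first value in the low nibble) is an assumption; the issue does not say which layout MIGraphX would use.

```python
def pack_int4(values):
    """Pack integers in [0, 15] into bytes, two values per byte.

    Assumed layout: the even-indexed value goes in the low nibble and
    the odd-indexed value in the high nibble.
    """
    if any(v < 0 or v > 15 for v in values):
        raise ValueError("every value must lie in [0, 15]")
    # Pad odd-length input with a zero nibble so values pair up evenly.
    padded = list(values) + [0] * (len(values) % 2)
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack_int4(packed, count):
    """Inverse of pack_int4; `count` drops the padding nibble, if any."""
    values = []
    for byte in packed:
        values.append(byte & 0x0F)  # low nibble
        values.append(byte >> 4)    # high nibble
    return values[:count]
```

Packing halves the storage for the weights, and the unpack step restores one value per element before dequantization.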

Open lakhinderwalia opened 1 month ago

pfultz2 commented 1 month ago

Since it's quantizing the weights, we don't need to use our quantizer. Instead, we would just take the range of the weights and compute a scale so that they fit in the range of [0, 15]. We also need to insert Q/DQ pairs.
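A rough sketch of that idea, not MIGraphX code: derive a scale from the min/max of the weights so they map onto [0, 15], then quantize (Q) and dequantize (DQ). The helper names are hypothetical. Using a zero point and widening the range to include 0.0 are assumptions borrowed from common asymmetric quantization practice; the comment above only mentions a scale.

```python
def int4_qparams(weights):
    """Compute (scale, zero_point) mapping the weight range onto [0, 15]."""
    # Assumption: extend the range to include 0.0 so zero is representable.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    scale = (w_max - w_min) / 15.0
    if scale == 0.0:
        scale = 1.0  # all-zero weights: any positive scale works
    zero_point = max(0, min(15, round(-w_min / scale)))
    return scale, zero_point


def quantize(weights, scale, zero_point):
    """Q step: map floats to integers, clamped to [0, 15]."""
    return [max(0, min(15, round(w / scale) + zero_point)) for w in weights]


def dequantize(quantized, scale, zero_point):
    """DQ step: map the integers back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, 0.0, 1.0, 2.0]
scale, zero_point = int4_qparams(weights)
restored = dequantize(quantize(weights, scale, zero_point), scale, zero_point)
```

Because the weights are constants, their range is known ahead of time, which is why no calibration pass through a separate quantizer is needed here.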