ROCm / AMDMIGraphX

AMD's graph optimization engine.
https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/
MIT License

Quantized distilgpt2 accuracy issue when quant params are inputs #3612

Open shivadbhavsar opened 5 days ago

shivadbhavsar commented 5 days ago

This issue originates from a PyTorch-quantized model in which the scales and zero points are passed as inputs rather than being embedded in the model as literals.
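For context, here is a minimal sketch of the standard affine quantize/dequantize pair in plain Python, with the scale and zero point supplied as runtime arguments rather than baked in as constants. The function names and the int8 range are assumptions for illustration only; this is not MIGraphX's implementation and not the actual contents of the block from this issue.

```python
# Illustrative sketch only: standard affine (linear) quantization.
# Function names and the int8 range below are assumptions, not MIGraphX code.

def quantize_linear(xs, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    # Python's round() rounds halves to the nearest even integer.
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]


def dequantize_linear(qs, scale, zero_point):
    """Map integers back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qs]


def quant_dequant(xs, scale, zero_point):
    """Quantize then dequantize, with scale and zero point as runtime inputs.

    Because scale and zero_point arrive as arguments (like graph inputs),
    nothing here can be precomputed ahead of time; when they are literals,
    an optimizer is free to fold them into constants.
    """
    return dequantize_linear(quantize_linear(xs, scale, zero_point), scale, zero_point)


if __name__ == "__main__":
    print(quant_dequant([0.0, 0.5, 1.0, 100.0], scale=0.5, zero_point=0))
```

The last input value saturates at the upper end of the assumed int8 range, so it does not round-trip exactly.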

An MXR for a small problematic block can be found on the NAS at: /migraphx/models/torch_exports/distilgpt2_block_torch.mxr

To reproduce, run: migraphx-driver verify distilgpt2_block_torch.mxr --fill1 arg0_1 --fill0 arg4_1 --fill0 arg10_1 --fill0 arg12_1 --fill0 arg14_1

This verification fails only after #3362.