Wheest opened this issue 1 week ago (status: Open)
@lsy323 is out but he can take a look when he is back.
For more context: we are looking to expand torch_ao support in the near future; we appreciate you filing the bug and surfacing the use cases and issues you observed. @lsy323 will help drive this issue, as mentioned earlier.
Thanks! I was also looking at using torch_ao for a larger model, but ran into https://github.com/pytorch/pytorch/issues/140943, where the model didn't get past torch.export.
🐛 Bug
I'm looking to generate an int8-quantised PyTorch model (both weights and activations in int8) and export it to StableHLO via torch-xla's `exported_program_to_stablehlo`. Right now I'm relatively ambivalent about how the model is quantised, as long as I end up with a valid graph with int8 weights and activations (and, presumably, i32 accumulation types).
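For reference, the int8 affine quantization scheme being targeted maps floats to int8 via a scale and zero point. A minimal pure-Python sketch of that mapping (illustrative only, not the torch_ao implementation; the function names are mine):

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine-quantize a float to int8: q = clamp(round(x / scale) + zp)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Recover an approximate float: x ~= (q - zp) * scale."""
    return (q - zero_point) * scale

scale, zp = 0.05, 0
q = quantize(1.234, scale, zp)   # -> 25
x = dequantize(q, scale, zp)     # -> 1.25 (quantization error ~0.016)
```

The exported StableHLO graph would carry these scales and zero points as quantization parameters on the int8 tensors.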
There are several ways to quantise in PyTorch, each with its own caveats and issues. The furthest I've been able to get is the reproducible script below; however, it raises the following error:
To Reproduce
Expected behavior
I would expect this to produce a StableHLO graph with int8 tensors in it.
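Concretely, in the expected lowering an int8-by-int8 matmul accumulates into i32 before requantizing. A plain-Python sketch of that arithmetic (my assumption about the intended semantics, mirroring what a dot with i8 operands and an i32 accumulator would compute):

```python
def int8_matvec_i32_acc(weights, activations):
    """Multiply an int8 weight matrix by an int8 activation vector,
    accumulating in i32.

    Each product of two int8 values fits in 16 bits, but summing many
    of them can overflow int16 -- which is why an i32 accumulation
    type is needed in the quantized graph.
    """
    assert all(-128 <= w <= 127 for row in weights for w in row)
    assert all(-128 <= a <= 127 for a in activations)
    out = []
    for row in weights:
        acc = 0  # i32 accumulator
        for w, a in zip(row, activations):
            acc += w * a
        out.append(acc)
    return out

# 256 products of 127 * 127 sum to 4_129_024: overflows int16, fits in i32
print(int8_matvec_i32_acc([[127] * 256], [127] * 256))
```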
If this can be achieved with a different quantisation method in PyTorch, that also works. The issue here seems to be around this `aten` op.

Environment