nod-ai / SHARK-ModelDev

Unified compiler/runtime for interfacing with PyTorch Dynamo.

torch.aten.quantize_per_tensor to linalg #683

Open AmosLewis opened 6 months ago

AmosLewis commented 6 months ago

https://github.com/nod-ai/SHARK-TestSuite/issues/182

After integrating torch-mlir@ec6d7aa (onnx.resize op) in https://github.com/iree-org/iree/pull/17358: @zjgarvey this looks like a quantize-related error, so it would be better if you take a look. Reproduce with:

python ./run.py --torchmlirbuild ../../torch-mlir/build --tolerance 0.001 0.001 --cachedir ./huggingface_cache --ireebuild ../../iree-build -f onnx -g models --mode onnx --report --tests onnx/models/RAFT_vaiq_int8

failed to translate executables
failed to translate executables
failed to translate executables
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: error: 'func.func' op exceeded stack allocation limit of 32768 bytes for function. Got 401408 bytes
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^
RAFT_vaiq_int8.default.onnx.torch.mlir:1644:13: note: called from
    %1601 = torch.aten.quantize_per_tensor %1596, %float5.000000e-01, %int0, %int12 : !torch.vtensor<[1024,7,7,1],f32>, !torch.float, !torch.int, !torch.int -> !torch.vtensor<[1024,7,7,1],!torch.qint8>
            ^
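For context: the diagnostic reports a per-function stack allocation cap of 32768 bytes, and the 401408 bytes here matches the 1024×7×7×1 tensor exactly (50,176 elements × 8 bytes = 401,408), which suggests the quantize lowering is materializing the whole tensor in a single stack-like allocation rather than tiling it. The failing op itself is a plain per-tensor quantize; a minimal standalone PyTorch sketch of the same computation (scale 0.5, zero point 0, dtype code 12 = torch.qint8, and the shape all taken from the IR above):

```python
import torch

# Same quantization as the failing torch.aten.quantize_per_tensor:
# scale 0.5, zero point 0, dtype code 12 (torch.qint8) applied to a
# [1024, 7, 7, 1] f32 tensor.
x = torch.randn(1024, 7, 7, 1, dtype=torch.float32)
q = torch.quantize_per_tensor(x, scale=0.5, zero_point=0, dtype=torch.qint8)

# The quantized tensor stores int8 values; int_repr() exposes them.
print(q.shape, q.q_scale(), q.q_zero_point(), q.int_repr().dtype)
```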
AmosLewis commented 5 months ago

@IanWood1 https://github.com/iree-org/iree/pull/17574

IanWood1 commented 4 months ago

https://github.com/iree-org/iree/pull/17692 fixes the iree-compile issues, and https://github.com/nod-ai/SHARK-TestSuite/pull/254 fixes an issue with the test suite (it was erroring before inference).
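With both fixes in place, the original failure can be re-checked by re-running the same test-suite command from the first comment (the relative paths assume the same local checkout layout):

```shell
python ./run.py --torchmlirbuild ../../torch-mlir/build \
  --tolerance 0.001 0.001 --cachedir ./huggingface_cache \
  --ireebuild ../../iree-build -f onnx -g models --mode onnx \
  --report --tests onnx/models/RAFT_vaiq_int8
```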