Closed — anijain2305 closed this issue 2 years ago
Here are a couple of models that fail with `RuntimeError: expected scalar type Float but found Half`:
```shell
python benchmarks/torchbench.py --training -d cuda --fast --accuracy-aot-nop --skip-accuracy-check --generate-aot-autograd-stats -k mobilenet_v2_quantized_qat --float16
python benchmarks/torchbench.py --training -d cuda --fast --accuracy-aot-nop --skip-accuracy-check --generate-aot-autograd-stats -k resnet50_quantized_qat --float16
python benchmarks/torchbench.py --training -d cuda --fast --accuracy-aot-nop --skip-accuracy-check --generate-aot-autograd-stats -k mobilenet_v2_quantized_qat --amp
python benchmarks/torchbench.py --training -d cuda --fast --accuracy-aot-nop --skip-accuracy-check --generate-aot-autograd-stats -k resnet50_quantized_qat --amp
```
@IvanYashchuk All of these models fail in native PyTorch itself: they simply can't survive the float16/amp conversion, so we never reach the stage where TorchDynamo would run on them. We are therefore skipping these tests in the TorchDynamo nightly. This also makes some sense, since these are quantized models and parts of them may be hardcoded to float32.
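To make the failure mode concrete, here is a minimal, hypothetical sketch (not the torchbench repro): a module whose parameters remain float32 is fed a float16 input, which raises the same class of dtype-mismatch `RuntimeError` that the quantized QAT models hit after a blanket half conversion. The exact error message varies by op and PyTorch version.

```python
import torch

# Hypothetical minimal example: Linear keeps float32 weights, but the
# input has been converted to float16, as a blanket .half() pass might do
# to activations while some quantized internals stay hardcoded to float32.
model = torch.nn.Linear(4, 4)           # parameters are float32
x = torch.randn(2, 4, dtype=torch.float16)

raised = False
try:
    model(x)                            # float32 weights vs float16 input
except RuntimeError as e:
    raised = True
    print(f"RuntimeError: {e}")         # dtype-mismatch error, message varies

assert raised
```

The quantized QAT models embed fake-quant/observer modules with float32 assumptions baked in, so the mismatch surfaces inside the model rather than at its boundary, which is why the conversion cannot be fixed from the benchmark harness side.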
Since the issue is in PyTorch's half/amp conversion, this might not be the best place to track these, so I suggest skipping them.
Closing in favor of pytorch/pytorch#93777
The TorchDynamo dashboard shows that AOT eager is not at 100% pass rate across all models. This is a tracker for the missing work.
**float32**

**float16**

No new errors

**AMP**