pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

[MPS] `F.interpolate` with `mode="nearest-exact"` differs from expected output #134430

Open hvaara opened 2 months ago

hvaara commented 2 months ago

🐛 Describe the bug

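This was discovered while annotating failures in https://github.com/pytorch/pytorch/pull/134184. The original test case is in `test/test_nn.py`.

MPS and CPU produce different results for `F.interpolate` with `mode="nearest-exact"`. In the repro below, the two outputs differ by 1 at a single position (index 5 of 11).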

The reference algorithm for computing the expected output can be seen in https://github.com/pytorch/pytorch/blob/7af38eb98bdceb8fc6f8635ed7dd664ef44e4b10/test/test_nn.py#L9330-L9336 (1D) and in https://github.com/pytorch/pytorch/blob/7af38eb98bdceb8fc6f8635ed7dd664ef44e4b10/test/test_nn.py#L9438-L9447 (2D).
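For readers who do not want to open the linked test, here is a minimal sketch of the 1D index rule as I understand it: output position `o` reads from input index `floor((o + 0.5) * isize / osize)`. The helper name `nearest_exact_indices` is hypothetical, and the integer-arithmetic formulation is my own substitution to avoid floating-point rounding. It is not the code from `test/test_nn.py` and not what either backend runs.

```python
def nearest_exact_indices(isize: int, osize: int) -> list[int]:
    """Source index for each output position under the half-pixel-center rule.

    Evaluates floor((o + 0.5) * isize / osize) using integers only, so the
    result does not depend on floating-point rounding. Hypothetical helper,
    written for illustration.
    """
    return [
        # (o + 0.5) * isize / osize == (2*o + 1) * isize / (2 * osize)
        min(((2 * o + 1) * isize) // (2 * osize), isize - 1)
        for o in range(osize)
    ]


# Sizes taken from the repro in this issue.
indices = nearest_exact_indices(20, 11)
print(indices)  # [0, 2, 4, 6, 8, 10, 11, 13, 15, 17, 19]
```

For `isize, osize = (20, 11)`, output position 5 maps to source coordinate `5.5 * 20 / 11 = 10`, which lands exactly on an integer boundary. That is the same position where the CPU and MPS outputs disagree, so a rounding difference in how the scaled coordinate is computed is a plausible cause. I have not verified this.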

See also https://github.com/pytorch/pytorch/issues/34808.

Minimal repro:

import torch
import torch.nn.functional as F

isize, osize = (20, 11)

in_cpu = torch.arange(isize, dtype=torch.float, device='cpu').unsqueeze(0).unsqueeze(0)
in_mps = torch.arange(isize, dtype=torch.float, device='mps').unsqueeze(0).unsqueeze(0)
out_cpu = F.interpolate(in_cpu, size=(osize, ), recompute_scale_factor=False, mode="nearest-exact")
out_mps = F.interpolate(in_mps, size=(osize, ), recompute_scale_factor=False, mode="nearest-exact")

out_cpu - out_mps.cpu()
# tensor([[[0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]]])

Versions

PyTorch version: 2.5.0a0+gitc19005d
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.6.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.30.1
Libc version: N/A

Python version: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:13:44) [Clang 16.0.6 ] (64-bit runtime)
Python platform: macOS-14.6.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU: Apple M3 Max

Versions of relevant libraries:
[pip3] flake8==6.1.0
[pip3] flake8-bugbear==23.3.23
[pip3] flake8-comprehensions==3.15.0
[pip3] flake8-executable==2.1.3
[pip3] flake8-logging-format==0.9.0
[pip3] flake8-pyi==23.3.1
[pip3] flake8-simplify==0.19.3
[pip3] mypy==1.10.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==2.0.1
[pip3] optree==0.12.1
[pip3] torch==2.5.0a0+gitc19005d
[pip3] torch-tb-profiler==0.4.3
[pip3] torchvision==0.20.0a0+0d80848
[pip3] triton==3.0.0
[conda] numpy 2.0.1 pypi_0 pypi
[conda] optree 0.12.1 pypi_0 pypi
[conda] torch 2.5.0a0+gitc19005d dev_0
[conda] torch-tb-profiler 0.4.3 pypi_0 pypi
[conda] torchfix 0.4.0 pypi_0 pypi
[conda] torchvision 0.20.0a0+0d80848 dev_0
[conda] triton 3.0.0 pypi_0 pypi

cc @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

hvaara commented 2 months ago

@pytorchbot label "module: mps" "module: correctness (silent)"