pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

MPS runtime error when indexing from list / tensor #81051

Closed thipokKub closed 2 years ago

thipokKub commented 2 years ago

🐛 Describe the bug

Code

import torch
print(torch.arange(9).reshape(3, 3).to("cpu")[[1, 2], [2, 1]])
print(torch.arange(9).reshape(3, 3).to("mps")[[1, 2], [2, 1]])

Output

tensor([5, 7])
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

Expected the result to be the same (but on a different device).
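Until advanced indexing works natively on MPS, one possible workaround is to avoid `aten::index.Tensor_out` entirely by computing flat indices and using `index_select`. This is a sketch under the assumption that `index_select` is available on the target device; the helper name `fancy_index_2d` is made up for illustration:

```python
import torch

def fancy_index_2d(t, rows, cols):
    # Emulate t[rows, cols] without aten::index.Tensor_out by flattening
    # the 2-D tensor and gathering with index_select.
    rows = torch.as_tensor(rows, device=t.device)
    cols = torch.as_tensor(cols, device=t.device)
    flat = rows * t.shape[1] + cols
    return t.reshape(-1).index_select(0, flat)

t = torch.arange(9).reshape(3, 3)
print(fancy_index_2d(t, [1, 2], [2, 1]))  # tensor([5, 7])
```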

Versions

PyTorch version: 1.13.0.dev20220707
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.4 (arm64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2.5)
CMake version: version 3.22.2
Libc version: N/A

Python version: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 17:00:33) [Clang 13.0.1 ] (64-bit runtime)
Python platform: macOS-12.4-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] pytorch-lightning==1.6.3
[pip3] pytorch-lightning-bolts==0.3.2.post1
[pip3] pytorch-metric-learning==1.3.0
[pip3] torch==1.13.0.dev20220707
[pip3] torchaudio==0.14.0.dev20220603
[pip3] torchinfo==1.6.6
[pip3] torchmetrics==0.8.2
[pip3] torchvision==0.14.0a0+f9f721d
[conda] numpy 1.22.4 pypi_0 pypi
[conda] pytorch-lightning 1.6.3 pypi_0 pypi
[conda] pytorch-lightning-bolts 0.3.2.post1 pypi_0 pypi
[conda] pytorch-metric-learning 1.3.0 pypi_0 pypi
[conda] torch 1.13.0.dev20220707 pypi_0 pypi
[conda] torchaudio 0.14.0.dev20220603 pypi_0 pypi
[conda] torchinfo 1.6.6 pypi_0 pypi
[conda] torchmetrics 0.8.2 pypi_0 pypi
[conda] torchvision 0.14.0a0+f9f721d pypi_0 pypi

cc @kulinseth @albanD

qqaatw commented 2 years ago

Are you using PYTORCH_ENABLE_MPS_FALLBACK=1? The CPU fallback seems to have a bug in handling the indices arguments.

It should work once aten::index.Tensor_out is implemented; you can track that in #77764.

thipokKub commented 2 years ago

Yep

Without

tensor([5, 7])
NotImplementedError: The operator 'aten::index.Tensor_out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

With PYTORCH_ENABLE_MPS_FALLBACK=1

tensor([5, 7])
UserWarning: The operator 'aten::index.Tensor_out' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  print(torch.arange(9).reshape(3, 3).to("mps")[[1, 2], [2, 1]])
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
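For context, a minimal shell sketch of how the fallback is enabled. The variable generally needs to be set before Python imports torch, and the device string is guarded so the snippet also runs on machines without MPS; note that, as the output above shows, the fallback still hit this RuntimeError for advanced indexing at the time.

```shell
# Set the fallback before launching Python; exporting it after torch has
# already been imported may have no effect.
PYTORCH_ENABLE_MPS_FALLBACK=1 python - <<'EOF'
import torch

# Guard the device choice so this also runs on machines without MPS.
device = "mps" if torch.backends.mps.is_available() else "cpu"
t = torch.arange(9).reshape(3, 3).to(device)
print(t[[1, 2], [2, 1]].cpu())
EOF
```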
kulinseth commented 2 years ago

> Yep
>
> Without
>
> tensor([5, 7])
> NotImplementedError: The operator 'aten::index.Tensor_out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
>
> With PYTORCH_ENABLE_MPS_FALLBACK=1
>
> tensor([5, 7])
> UserWarning: The operator 'aten::index.Tensor_out' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
>   print(torch.arange(9).reshape(3, 3).to("mps")[[1, 2], [2, 1]])
> RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

@thipokKub We have the implementation of aten::index.Tensor_out done; it should be up for review soon and should address this issue. The indexing ops have limited support with the fallback, so we will need to enable them natively anyway.

nicolaschapados commented 2 years ago

> @thipokKub We have the implementation of aten::index.Tensor_out done; it should be up for review soon and should address this issue. The indexing ops have limited support with the fallback, so we will need to enable them natively anyway.

Do you know when it should appear in the nightly, @kulinseth?

kulinseth commented 2 years ago

> > @thipokKub We have the implementation of aten::index.Tensor_out done; it should be up for review soon and should address this issue. The indexing ops have limited support with the fallback, so we will need to enable them natively anyway.
>
> Do you know when it should appear in the nightly, @kulinseth?

We are in the process of testing it and will update the issue when we have the PR up.

kulinseth commented 2 years ago

index.Tensor_out is now enabled on the MPS backend. Please take a look at the latest nightly.

thipokKub commented 2 years ago

As of 1.13.0.dev20220822 the issue is fixed. Thanks for all the hard work!
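A quick sanity-check sketch of the fix: with a nightly that has the native op, fancy indexing on MPS should match the CPU result. The device choice is guarded so the snippet also runs (trivially, CPU against CPU) on machines without MPS:

```python
import torch

# Compare advanced indexing on the available device against the CPU result.
device = "mps" if torch.backends.mps.is_available() else "cpu"
cpu_out = torch.arange(9).reshape(3, 3)[[1, 2], [2, 1]]
dev_out = torch.arange(9).reshape(3, 3).to(device)[[1, 2], [2, 1]]
assert torch.equal(cpu_out, dev_out.cpu())
print(dev_out.cpu())  # tensor([5, 7])
```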

hamdimina commented 1 year ago

> As of 1.13.0.dev20220822 the issue is fixed. Thanks for all the hard work

How did you fix it? Can you help me, please?