pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

F.one_hot() function on "mps" device crashes on Intel Mac #134951

Open William-Van-BW opened 2 months ago

William-Van-BW commented 2 months ago

🐛 Describe the bug

import torch
import torch.nn.functional as F
# Assume num_anchors is the number of anchors, and topk_idxs is your index tensor
num_anchors = 10  # Example number, adjust according to actual situation
topk_idxs = torch.randint(low=0, high=num_anchors, size=(1, 5, 10))  # Randomly generate some index data

# Set the device to MPS
device = torch.device("mps")
topk_idxs = topk_idxs.to(device)

# Perform one-hot encoding and summation operation
try:
    one_hot = F.one_hot(topk_idxs, num_anchors)
    is_in_topk = one_hot.sum(-2)
    print("Operation successful, result shape:", is_in_topk.shape)
except Exception as e:
    print("Operation failed:", str(e))

Result:

/AppleInternal/Library/BuildRoots/4ff29661-3588-11ef-9513-e2437461156c/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:556: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function reduce_multiple_passes_axes_add.
    Compiler encountered an internal error: (null)
'

Versions

(torch_metal) william@wanbinwangdeMacBook-Pro main % pip show torch
Name: torch
Version: 2.2.2
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /opt/anaconda3/envs/torch_metal/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: accelerate, torchaudio, torchvision
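As a side note, the same environment details can be confirmed from inside the interpreter itself, which helps rule out a mismatched conda environment (a quick sketch, not part of the original report):

```python
import platform

import torch

# Print the torch build and MPS status for the running interpreter
print("torch:", torch.__version__)
print("arch:", platform.machine())                    # x86_64 on Intel Macs
print("mps built:", torch.backends.mps.is_built())    # compiled with MPS support?
print("mps available:", torch.backends.mps.is_available())
```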

cc @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

hvaara commented 2 months ago

I was not able to reproduce with v2.3.0 or on the current main branch:

import torch
import torch.nn.functional as F

# Assume num_anchors is the number of anchors, and topk_idxs is your index tensor
num_anchors = 10  # Example number, adjust according to actual situation
topk_idxs = torch.randint(low=0, high=num_anchors, size=(1, 5, 10))  # Randomly generate some index data

# Set the device to MPS
device = torch.device("mps")
topk_idxs = topk_idxs.to(device)

# Perform one-hot encoding and summation operation
try:
    one_hot = F.one_hot(topk_idxs, num_anchors)
    is_in_topk = one_hot.sum(-2)
    print("Operation successful, result shape:", is_in_topk.shape)
except Exception as e:
    print("Operation failed:", str(e))
# Operation successful, result shape: torch.Size([1, 5, 10])

Are you able to reproduce if you update your PyTorch version?

William-Van-BW commented 2 months ago

Hello,

Thank you for your response.

I tried updating to PyTorch version 2.4.0, but I’m still encountering the same issue with the F.one_hot operation on my setup. Here are the details:

Device: MacBook Pro (2019, 16-inch)
CPU: Intel Core i9 9800H (8 cores, 16 threads)
GPU: AMD Radeon Pro 5500M
PyTorch version: 2.4.0 (compiled for Intel x86_64 architecture)
OS: macOS 14.6.1 (23G93)

The error occurs with the same code snippet you provided:

/AppleInternal/Library/BuildRoots/4ff29661-3588-11ef-9513-e2437461156c/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Utility/MPSLibrary.mm:556: failed assertion `MPSKernel MTLComputePipelineStateCache unable to load function reduce_multiple_passes_axes_add. Compiler encountered an internal error: (null)

It appears this issue still persists on my configuration, even with the latest version of PyTorch. Could you suggest any further troubleshooting steps or potential fixes for this?

Thank you again for your help.

Best regards, William


malfet commented 2 months ago

I see, so it looks like you are hitting a bug specific to Intel Macs/AMD GPUs; I have edited the title to reflect that.
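Until the underlying Metal compiler issue is resolved on these GPUs, one possible workaround (a sketch, not suggested in the thread) is to run the affected ops on CPU and move the result back to the device afterwards; `one_hot` plus a small reduction is cheap enough that this rarely matters for performance:

```python
import torch
import torch.nn.functional as F

num_anchors = 10
topk_idxs = torch.randint(low=0, high=num_anchors, size=(1, 5, 10))

device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

# Workaround: run one_hot and the reduction on CPU, then move the result back
one_hot = F.one_hot(topk_idxs.cpu(), num_anchors)
is_in_topk = one_hot.sum(-2).to(device)
print(is_in_topk.shape)  # torch.Size([1, 5, 10])
```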