Inference on M1 processor with device = MPS : RuntimeError: Placeholder storage has not been allocated on MPS device

iliesaya commented 2 years ago

Prerequisite

[X] I have searched the existing and past issues but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version.

🐞 Describe the bug

Hi, I am trying to run MMDetection inference with MPS acceleration on my M1 MacBook. It fails with this error:

    this_res = inference_detector(model, img)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/apis/inference.py", line 151, in inference_detector
    results = model(return_loss=False, rescale=True, **data)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/aya/git/mmcv/mmcv/runner/fp16_utils.py", line 116, in new_func
    return old_func(*args, **kwargs)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 174, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 147, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/detectors/two_stage.py", line 177, in simple_test
    x = self.extract_feat(img)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/detectors/two_stage.py", line 67, in extract_feat
    x = self.backbone(img)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/backbones/detectors_resnet.py", line 331, in forward
    outs = list(super(DetectoRS_ResNet, self).forward(x))
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/mmdet/models/backbones/resnet.py", line 637, in forward
    x = self.norm1(x)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "/Users/aya/miniforge3/envs/torch-gpu/lib/python3.8/site-packages/torch/nn/functional.py", line 2446, in batch_norm
    return torch.batch_norm(
RuntimeError: Placeholder storage has not been allocated on MPS device!

The same code with device='cpu' works fine but is very slow (7 sec / image). On another computer with an NVIDIA GPU and device='cuda:0', it also works fine (0.5 sec / image).

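import mmcv
from mmcv.runner import load_checkpoint
from mmdet.apis import inference_detector
from mmdet.models import build_detector

device = 'mps'
config = mmcv.Config.fromfile(configPath)
config.model.pretrained = None
model = build_detector(config.model)
# Load the weights, mapping them onto the target device
checkpoint = load_checkpoint(model, checkpointPath, map_location=device)
model.CLASSES = checkpoint['meta']['CLASSES']
model.cfg = config
model.to(device)
model.eval()
...
        for img in dataset:
            this_res = inference_detector(model, img)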

Any ideas?

Thanks.

Environment

MacBook Pro with M1 processor
pytorch-nightly = pytorch-1.13.0.dev20
MMCV = V1.6.1
MMDetection = V2.25.2

Additional information

No response

lixuekai2001 commented 1 year ago

I am trying to install MMDetection on a MacBook Pro with an M1 Pro chip, but without success. The main error message is:

  1 error generated.
  error: command '/usr/bin/clang' failed with exit code 1
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> mmcv-full

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Does anyone have the same problem, and how can it be solved?

Czm369 commented 1 year ago

Maybe you can raise an issue in MMDeploy; they have more experience with deployment: https://github.com/open-mmlab/mmdeploy

ashim-mahara commented 1 year ago

@iliesaya did you move the img to mps? I faced the same issue when I didn't move my input to the GPU.

iliesaya commented 1 year ago

I don't think so. Could you show me how?

ashim-mahara commented 1 year ago

for img in dataset:
    img = img.to('mps') ## or img.to(device) since you have already set device to mps in the previous line
    this_res = inference_detector(model, img)

If you come across an error stating that an operation like aten::cumsum.out is not implemented for the MPS backend, then it's not possible to run the model on the M1 without adding the functionality yourself. This is tracked at https://github.com/pytorch/pytorch/issues/77764.
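For the unimplemented-operator case, the PyTorch error message itself suggests a temporary workaround: setting the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 so that unsupported operators run on the CPU. This does not add MPS support for the operator; it only lets the model run, with that operator executed more slowly on the CPU. A minimal sketch, assuming the variable has to be set before torch is imported (enable_mps_fallback is a hypothetical helper name, not part of PyTorch):

```python
import os


def enable_mps_fallback():
    """Ask PyTorch to run ops that lack an MPS kernel on the CPU instead.

    Hypothetical helper; call it before `import torch`.
    """
    os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
    return os.environ["PYTORCH_ENABLE_MPS_FALLBACK"]


enable_mps_fallback()
# import torch  # import torch only after the variable has been set
```

Exporting the variable in the shell before launching Python should have the same effect.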

Hope this helps. @iliesaya

iliesaya commented 1 year ago

img is a numpy array:

 img = img.to(device)

AttributeError: 'numpy.ndarray' object has no attribute 'to'

If I convert it to a tensor, I get this error:

            t = torch.from_numpy(img)
            t = t.to(device)
            this_res = inference_detector(model, t)

TypeError: expected str, bytes or os.PathLike object, not Tensor

ashim-mahara commented 1 year ago

I don't know what you're doing inside inference_detector() but if it's like

    if next(model.parameters()).is_cuda:
        # scatter to specified GPU
        data = scatter(data, [device])[0]
    else:
        for m in model.modules():
            assert not isinstance(
                m, RoIPool
            ), 'CPU inference with RoIPool is not supported currently.'

(from https://github.com/open-mmlab/mmdetection/blob/master/mmdet/apis/inference.py), then you would have to check whether the device is mps and, if so, move the data onto that device.

Changing data = scatter(data, [device])[0] to data = scatter(data, ['mps'])[0] should work, but be sure to include a block that checks whether the model and the data are both on mps.
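The device check mentioned here could look roughly like the sketch below. It assumes the model is a torch.nn.Module and the input is a torch.Tensor, so both expose a .device with a .type string; model_device_type and on_same_device are hypothetical helper names, not MMDetection or PyTorch APIs.

```python
def model_device_type(model):
    """Return the device type ('cpu', 'cuda', 'mps', ...) of the model's first parameter."""
    return next(model.parameters()).device.type


def on_same_device(model, tensor):
    """True if the tensor lives on the same kind of device as the model's parameters."""
    return model_device_type(model) == tensor.device.type
```

Inside inference_detector, a branch such as `if model_device_type(model) == 'mps':` could then sit next to the existing is_cuda check.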

Good luck.

apatsekin commented 1 year ago

The suggested scatter(data, ['mps'])[0] call throws:

AttributeError: module 'torch._C' has no attribute '_scatter'
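This error presumably occurs because torch._C._scatter only exists in CUDA-enabled builds of PyTorch; that is an assumption, not something confirmed in this thread. One alternative that avoids scatter altogether is to move the tensors onto the device by hand. The sketch below assumes that, at this point in inference_detector, data is a nested structure of dicts, lists, and tuples whose tensors expose a .to(device) method; move_to_device is a hypothetical helper, not an MMDetection or MMCV function.

```python
def move_to_device(obj, device):
    """Recursively move every object that has a .to() method onto `device`.

    Dicts, lists, and tuples are rebuilt with their contents moved;
    anything else (strings, numbers, numpy arrays) is returned unchanged.
    """
    if hasattr(obj, "to") and callable(obj.to):
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, list):
        return [move_to_device(value, device) for value in obj]
    if isinstance(obj, tuple):
        return tuple(move_to_device(value, device) for value in obj)
    return obj
```

It would be used in place of the scatter line, for example `data = move_to_device(data, 'mps')`, and has not been tested against MMDetection itself.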
