OpenTalker / video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
https://opentalker.github.io/video-retalking/
Apache License 2.0

How can I run this project on macOS (M1) and use the GPU via torch "mps"? #30

Open mcj2761358 opened 1 year ago

mcj2761358 commented 1 year ago

How can I run this project on macOS (M1) and use the GPU via torch "mps"?

LittleTerry commented 1 year ago

same question here

foundVanting commented 1 year ago

+1

jdbbjd commented 1 year ago

Have you solved the problem?

HotWordland commented 1 year ago

+1

yosefl20 commented 1 year ago

I did everything I could to port the code to MPS. Sadly, MPS does not implement aten::_fft_r2c as of PyTorch 2.0.1, and I got this from the console:

NotImplementedError: The operator 'aten::_fft_r2c' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable 'PYTORCH_ENABLE_MPS_FALLBACK=1' to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
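
The fallback variable is read when torch initializes, so it has to be in the environment before torch is imported. A minimal sketch, assuming you would rather set it in code than in the shell:

import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"  # set before importing torch

import torch  # unsupported MPS ops (e.g. aten::_fft_r2c) now fall back to the CPU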

aviadavi commented 1 year ago

@yosefl20 thanks for your insights. I'm unable to get it running on my M1. I set this env variable to 1, but I still get the error:

(video_retalking) a@Mac-a video-retalking % python3 inference.py \              
  --face examples/face.mp4 \
  --audio examples/audio.m4a \
  --outfile results/1_1.mp4
/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
[Info] Using cpu for inference.
[Step 0] Number of frames available for inference: 377
[Step 1] Landmarks Extraction in Video.
Traceback (most recent call last):
  File "inference.py", line 345, in <module>
    main()
  File "inference.py", line 81, in main
    kp_extractor = KeypointExtractor()
  File "/Users/a/Git/video-retalking/third_part/face3d/extract_kp_videos.py", line 16, in __init__
    self.detector = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D)   
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/face_alignment/api.py", line 77, in __init__
    self.face_detector = face_detector_module.FaceDetector(device=device, verbose=verbose, **face_detector_kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/face_alignment/detection/sfd/sfd_detector.py", line 31, in __init__
    self.face_detector.to(device)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 820, in _apply
    param_applied = fn(param)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/video_retalking/lib/python3.8/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")

Any idea how to work around this?
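
One likely cause, going by the traceback: extract_kp_videos.py builds FaceAlignment without a device argument, and face_alignment defaults to device='cuda', which trips the "Torch not compiled with CUDA enabled" assertion. A minimal sketch of a patch, assuming your face_alignment version accepts a device string and that torch is imported in that module:

# third_part/face3d/extract_kp_videos.py, in KeypointExtractor.__init__:
# pass an explicit device so face_alignment never takes the CUDA path
self.detector = face_alignment.FaceAlignment(
    face_alignment.LandmarksType._2D,
    device='mps' if torch.backends.mps.is_available() else 'cpu')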

mertuner commented 10 months ago

I was able to run this on MPS. Still slow, but significantly better than CPU. Here is the result for a 5-second video:

MacBook Air 2020 M1 / 16GB RAM / 8-core GPU

MPS ===> 29min 36sec
CPU ===> 50min 4sec

Let me know how long yours takes. If you want to try it yourself on MPS:

1) In face_parsing.py, change line 44 to:

im = cv2.resize(im.astype(np.float32), (self.size, self.size))
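
(An assumption about why this matters: the MPS backend has no float64 support, so casting to float32 before the frames become tensors avoids a dtype error on the device.)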

2) In inference.py:

def main():
    # GPU support on Apple M1/M2/M3 devices: prefer CUDA, then MPS, else CPU
    if torch.cuda.is_available():
        device = 'cuda'
    elif torch.backends.mps.is_available():
        device = 'mps'
    else:
        if torch.backends.mps.is_built():
            print("MPS not available because the current MacOS version is not 12.3+ "
                  "and/or you do not have an MPS-enabled device on this machine.")
        else:
            print("MPS not available because the current PyTorch install was not "
                  "built with MPS enabled.")
        device = 'cpu'

...rest

3) Set the fallback variable in the conda env:

conda env config vars set PYTORCH_ENABLE_MPS_FALLBACK=1
conda activate video_retalking

We do step 3 because one operation (aten::_fft_r2c) is not yet implemented for MPS; with the fallback enabled, PyTorch runs it on the CPU instead.
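
A quick sanity check after step 3, as a minimal sketch (nothing here is project-specific):

import os
import torch

print(torch.backends.mps.is_available())              # expect True on an MPS-enabled Mac
print(os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"))  # expect '1' inside the activated env
print(torch.ones(3, device='mps') * 2)                # trivial op placed on the MPS device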

It should work.

Cheers.

Matrix-X commented 7 months ago

> I was able to run this on MPS. Still slow, but significantly better than CPU. […]

Thank you for the solution, but I have installed PyTorch 1.9.0 and torch.backends does not have "mps". Which PyTorch version do you use, and does it work for this project? The guide says to install PyTorch 1.9.0.