HeliXonProtein / OmegaFold

OmegaFold Release Code
Apache License 2.0

Added support for Apple Silicon chips with optional MPS devices #10

Closed RuiWang1998 closed 1 year ago

RuiWang1998 commented 1 year ago

Some device-related ops had to change order. Also, some ops from torch are not yet well supported, so those had to be rearranged.
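For context, a minimal sketch of the kind of rearrangement described here (not the actual diff in this PR): torch.nn.functional.glu has no MPS kernel yet, but the same math can be expressed with ops that MPS does support.

```python
import torch

def glu_mps_friendly(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # GLU splits x into halves a, b along dim and returns a * sigmoid(b);
    # chunk, mul, and sigmoid all have MPS kernels, unlike aten::glu.out.
    a, b = x.chunk(2, dim=dim)
    return a * torch.sigmoid(b)
```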

S-Shimo commented 1 year ago

Hi @RuiWang1998

I'm interested in protein structure prediction and found this program.

I tried to run OmegaFold on an M2 MacBook Pro. First, when I ran it the usual way (python main.py INPUT.fasta OUTPUT_DIR), the error "AssertionError: Torch not compiled with CUDA enabled" occurred.

error log (some directories replaced with ----):

Traceback (most recent call last):
  File "----/OmegaFold/main.py", line 105, in <module>
    main()
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "----/Desktop/omega fold/OmegaFold/main.py", line 55, in main
    model.to(args.device)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/nn/modules/module.py", line 927, in to
    return self._apply(convert)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/nn/modules/module.py", line 579, in _apply
    module._apply(fn)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/nn/modules/module.py", line 579, in _apply
    module._apply(fn)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/nn/modules/module.py", line 602, in _apply
    param_applied = fn(param)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/nn/modules/module.py", line 925, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "----/.pyenv/versions/3.10.5/lib/python3.10/site-packages/torch/cuda/__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

So I changed args.device to 'mps' (I think this variable is referenced only at lines 55 and 63 of main.py). Then re-running produced the error "The operator 'aten::glu.out' is not currently implemented for the MPS device".

error log:

INFO:root:Failed to generate OUTPUT.pdb due to The operator 'aten::glu.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
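As an aside, a sketch of the temporary fix this log itself suggests: the variable has to be set before torch touches the MPS backend, so either export it in the shell or set it at the very top of the script, before importing torch.

```python
import os

# CPU fallback for ops that have no MPS kernel yet
# (slower than native MPS, per the warning in the log above)
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # imported only after the variable is set
```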

I'm sorry that I'm not familiar with Python, machine learning, GitHub, or English, so please ask for any information needed to solve these problems.

My environment: MacBook Pro (13-inch, M2, 2022), 24 GB memory

Python 3.10.5, PyTorch 1.12.0 (regarding MPS, torch.backends.mps.is_available() is True), Biopython 1.79
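For anyone reproducing this, a quick sanity check of such a setup with the standard PyTorch API:

```python
import torch

print(torch.__version__)                  # e.g. 1.12.0
print(torch.backends.mps.is_built())      # was this torch build compiled with MPS support?
print(torch.backends.mps.is_available())  # can MPS actually be used on this machine?
```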

RuiWang1998 commented 1 year ago

Hi @S-Shimo,

Thanks for reaching out! To run on CPU, you need to add an argument '--device cpu'.

However, to run on MPS you will need this pull request, since MPS support is in its prototype stage and some operators are not supported. Once you check out this pull request, you can similarly add '--device mps' as an argument, and it will run with Metal acceleration.

Or you could wait for this pull request to be merged in a short while!
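For reference, a hypothetical sketch of how a --device flag like the one described above is typically wired up; OmegaFold's actual argument handling in main.py may differ.

```python
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument(
    "--device", default="cpu",
    help="one of: cpu, cuda, mps (mps needs this PR and a recent torch build)",
)
args = parser.parse_args()

# model.to(device) is where the 'Torch not compiled with CUDA enabled'
# assertion fires if a CUDA device is requested on a Mac.
device = torch.device(args.device)
```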

S-Shimo commented 1 year ago

Thank you very much.

I just retried with the merged code.

With "--device cpu", it seems to be run(It takes long time, so calculation is not finished). However, with "--device mps", the same error is occred.

INFO:root:Failed to generate OUTPUT.pdb due to The operator 'aten::glu.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

RuiWang1998 commented 1 year ago

Hi @S-Shimo ,

Just a sec, we'll look into it.

RuiWang1998 commented 1 year ago

Hi @S-Shimo,

It seems that we currently only support the PyTorch nightly build torch-1.13.0.dev20220805, which you can install with

    pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu

or

    conda install pytorch torchvision torchaudio cpuonly -c pytorch-nightly

per the official PyTorch site.

As for 1.12, it might take us some more time to figure it out.

Best, Rui

S-Shimo commented 1 year ago

Hi @RuiWang1998

I got it. That problem was due to my mistake: I had already installed PyTorch 1.12 and tried to update, but I ran the pip3 install command without the -U option.

Now I use PyTorch 1.13.0.dev20220806, and a different error occurs:

Error: buffer is not large enough.

I googled it and found this issue: https://github.com/pytorch/pytorch/issues/77851.

Best, Shimo