Closed weiji14 closed 4 months ago
Test on macos-14 failing at https://github.com/Clay-foundation/model/actions/runs/8042592659/job/21963448881#step:4:79. The main error message is RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 7.93 GB). Might need to try setting PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 as suggested, or xfail test_model.py on macos-14.
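For reference, the xfail option could be gated on a condition like the one sketched below. This is only an illustration: the helper name is_macos_arm64_ci is hypothetical, and it assumes the macos-14 runner reports Darwin/arm64 and that GitHub Actions sets GITHUB_ACTIONS=true in the environment.

```python
import os
import platform


def is_macos_arm64_ci(system=None, machine=None, env=None) -> bool:
    """Return True on an Apple Silicon macOS runner inside GitHub Actions.

    The arguments default to the live platform/environment values and can be
    overridden so the check is easy to exercise on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    env = os.environ if env is None else env
    return (
        system == "Darwin"
        and machine == "arm64"
        and env.get("GITHUB_ACTIONS") == "true"
    )


# Hypothetical usage in test_model.py (requires pytest; the test name is a placeholder):
#
#   import pytest
#
#   @pytest.mark.xfail(
#       condition=is_macos_arm64_ci(),
#       reason="MPS backend out of memory on macos-14 runners",
#   )
#   def test_model():
#       ...
```

Gating on the CI environment variable rather than on the platform alone keeps the test strict on real M1/M2 machines, where it is expected to pass.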
@leothomas, if you have time, could you try installing from the environment.yml/conda-lock.yml file in this branch on your macOS M1 computer, and see if the docs/partial-inputs.ipynb notebook works? The torchvision/torchdata conda-forge incompatibility seems to have gone away today after https://github.com/conda-forge/torchvision-feedstock/pull/89.
Did a fresh install with micromamba and was able to run the test without errors on this branch - although I just realized that I have a Mac M2 and not M1 (not sure if that made an important difference)
Cool, M2 should be fine too (probably more memory than M1). I tried setting PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 in the CI but it still fails. Looking at https://github.com/actions/runner-images/issues/9254#issuecomment-1936326374 and https://discuss.pytorch.org/t/mps-back-end-out-of-memory-on-github-action/189773/2, it seems like the GitHub Actions runners unfortunately don't have access to the underlying Metal Performance Shaders (MPS) hardware, so we might need to fall back to using CPU on the macos-14 CI.
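A minimal sketch of what the CPU fallback could look like, not necessarily how it ends up being wired into the tests. The helper name choose_accelerator is hypothetical, and it assumes GitHub Actions sets GITHUB_ACTIONS=true. The explicit CI check is there because the error above is raised at allocation time, so an availability check alone may not be enough on the runner.

```python
import os


def choose_accelerator(mps_available: bool, env=None) -> str:
    """Pick "mps" on a local Apple Silicon machine, otherwise "cpu".

    On GitHub Actions the function always returns "cpu", even when the MPS
    backend reports itself as available, since allocations fail on the runner.
    """
    env = os.environ if env is None else env
    on_github_actions = env.get("GITHUB_ACTIONS") == "true"
    if mps_available and not on_github_actions:
        return "mps"
    return "cpu"


# Hypothetical usage (requires PyTorch):
#
#   import torch
#
#   device = torch.device(
#       choose_accelerator(torch.backends.mps.is_available())
#   )
```

Passing the availability flag in as an argument keeps the helper free of a hard PyTorch import, so it can be checked on any platform.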
Looks good to me. I'm approving, but I don't know if you want to wait for @leothomas to test using his M1 Mac, since mine is pre-M1.
Thanks @chuckwondo for reviewing. I'll merge this in first so that a Mac M1 user can get the install working on their device (https://github.com/Clay-foundation/model/issues/161#issuecomment-2002602847), and will let Leo test things later once the changes here get incorporated into #166, as mentioned at https://github.com/Clay-foundation/model/pull/164#discussion_r1527612297
Support installation on macOS ARM64 devices (M1 chips) by:
- Adding osx-arm64 as another platform in the environment.yml file
- Running conda-lock lock --mamba --file environment.yml --with-cuda=12.0
- Adding macos-14 to the .github/workflows/test.yml GitHub Actions workflow

References:
Addresses https://github.com/Clay-foundation/model/issues/161 and extends #162.