[Bug]: MPS out of memory error when using cpu device

Email (Optional)

No response

Version

v0.3.8

Which OS(es) are you using?

[X] MacOS
[ ] Windows
[ ] Linux

What happened?

I am attempting to run calculations, such as single-point energies, using the CHGNetCalculator ASE calculator on macOS via GitHub Actions: https://github.com/stfc/janus-core/actions/runs/9943461072/job/27469030924?pr=214

However, I get a RuntimeError, which I believe occurs because CHGNet.load is not passed device / use_device (https://github.com/CederGroupHub/chgnet/blob/main/chgnet/model/dynamics.py#L92).

This means that, within CHGNet.load, determine_device initially attempts to move the model to the "mps" device, even though "cpu" was requested. That move fails because no MPS memory can be allocated on the runner (the log below reports 0 bytes allocated).
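To illustrate the failure mode, here is a minimal sketch of a probe that only trusts a device once a tiny allocation on it succeeds. The helper name first_usable_device is hypothetical and is not part of chgnet or PyTorch; the only PyTorch call used is torch.zeros, and the import is lazy so the sketch also runs where PyTorch is not installed.

```python
def first_usable_device(preferred="mps", fallback="cpu"):
    """Return `preferred` only if a tiny tensor can actually be created on it.

    Hypothetical helper for illustration; not part of chgnet or PyTorch.
    """
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return fallback
    try:
        # A one-element allocation is enough to surface errors such as
        # "MPS backend out of memory" before a whole model is moved.
        torch.zeros(1, device=preferred)
    except (RuntimeError, AssertionError):
        return fallback
    return preferred


device = first_usable_device("mps")
print(device)
```

On a runner like the one in the log, I would expect this to fall back to "cpu", since the traceback shows even a 512-byte allocation raising RuntimeError.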

Two potential solutions are checking whether MPS memory is actually available or, preferably, passing the requested device to CHGNet.load, which would also avoid an unnecessary transfer between devices.
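The second option can be sketched without PyTorch. Everything below is a stand-in written for illustration (FakeModel, resolve_device and load_model are hypothetical names, not chgnet's API); it only shows why forwarding the device to the loader avoids the intermediate "mps" move.

```python
from typing import Optional


def resolve_device(requested: Optional[str], mps_available: bool) -> str:
    """Hypothetical helper: honour an explicit request, otherwise auto-detect."""
    if requested is not None:
        return requested
    return "mps" if mps_available else "cpu"


class FakeModel:
    """Stand-in for a model; records every device it is moved to."""

    def __init__(self) -> None:
        self.visited = []

    def to(self, device: str) -> "FakeModel":
        self.visited.append(device)
        return self


def load_model(device: Optional[str] = None, *, mps_available: bool = True) -> FakeModel:
    """Hypothetical loader that moves the model to a device before returning it."""
    return FakeModel().to(resolve_device(device, mps_available))


# Pattern from the traceback: the loader is called without a device,
# and the result is only moved to the requested device afterwards.
reported = load_model().to("cpu")

# Suggested pattern: forward the requested device to the loader.
proposed = load_model(device="cpu").to("cpu")

print(reported.visited)  # ['mps', 'cpu']
print(proposed.visited)  # ['cpu', 'cpu']
```

In the first call the model touches "mps" before reaching "cpu", which is the step that fails in the log; in the second it never leaves the requested device.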

Code snippet

from chgnet.model.dynamics import CHGNetCalculator
CHGNetCalculator(use_device="cpu")

Log output

tests/test_mlip_calculators.py:26: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
janus_core/helpers/mlip_calculators.py:104: in choose_calculator
    calculator = CHGNetCalculator(use_device=device, **kwargs)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/chgnet/model/dynamics.py:91: in __init__
    self.model = (model or CHGNet.load(verbose=False)).to(self.device)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/chgnet/model/model.py:722: in load
    model = model.to(device)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/torch/nn/modules/module.py:1152: in to
    return self._apply(convert)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/torch/nn/modules/module.py:802: in _apply
    module._apply(fn)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/torch/nn/modules/module.py:802: in _apply
    module._apply(fn)
../../../Library/Caches/pypoetry/virtualenvs/janus-core-2KE8lRKs-py3.11/lib/python3.11/site-packages/torch/nn/modules/module.py:825: in _apply
    param_applied = fn(param)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

t = Parameter containing:
tensor([[ -3.4431,  -0.1279,  -2.8300,  -3.4737,  -7.4946,  -8.2354,  -8.1611,
          -8.3861...          -0.3448,  -0.4364,  -0.1661,  -0.3680,  -4.1869,  -8.4233, -10.0467,
         -12.0953, -12.5228, -14.2530]])

    def convert(t):
        if convert_to_format is not None and t.dim() in (4, 5):
            return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
                        non_blocking, memory_format=convert_to_format)
>       return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
E       RuntimeError: MPS backend out of memory (MPS allocated: 0 bytes, other allocations: 0 bytes, max allowed: 7.93 GB). Tried to allocate 512 bytes on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

Code of Conduct

[X] I agree to follow this project's Code of Conduct

CederGroupHub / chgnet

[Bug]: MPS out of memory error when using cpu device #181