atomistic-machine-learning / schnetpack

SchNetPack - Deep Neural Networks for Atomistic Systems

Support to MPS Framework #561

Closed viniavila closed 1 year ago

viniavila commented 1 year ago

I know that SchNetPack was designed to run on Linux and isn't intended to support other platforms by default, but I'm trying to run it on my Mac M1 with macOS (using micromamba, installed manually from pip rather than from the conda-forge package). I'm getting the following error:

Error executing job with overrides: ['experiment=qm9_atomwise']
Traceback (most recent call last):
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/bin/spktrain", line 5, in <module>
    cli.train()
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/schnetpack/cli.py", line 158, in train
    trainer.fit(model=task, datamodule=datamodule, ckpt_path=config.run.ckpt_path)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 529, in fit
    call._call_and_handle_interrupt(
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 568, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 949, in _run
    self.strategy.setup(self)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/strategies/single_device.py", line 74, in setup
    self.model_to_device()
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/pytorch_lightning/strategies/single_device.py", line 71, in model_to_device
    self.model.to(self.root_device)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/lightning_fabric/utilities/device_dtype_mixin.py", line 54, in to
    return super().to(*args, **kwargs)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1145, in to
    return self._apply(convert)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
    module._apply(fn)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 844, in _apply
    self._buffers[key] = fn(buf)
  File "/Users/vinicius/.local/opt/micromamba/envs/ufabc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
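The root cause is that the MPS backend has no float64 support, while some tensors registered on the model (buffers, per the traceback's `self._buffers[key] = fn(buf)` frame) are float64, so `model.to("mps")` fails. A minimal sketch of the failure mode and a possible workaround; `TinyModel` and its `mean` buffer are hypothetical stand-ins, not SchNetPack's actual modules:

```python
import torch

class TinyModel(torch.nn.Module):
    """Hypothetical module mimicking a model with a float64 buffer."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 1)
        # A float64 buffer: moving this to MPS raises the TypeError above.
        self.register_buffer("mean", torch.zeros(1, dtype=torch.float64))

model = TinyModel()
# Module.float() casts all floating-point parameters AND buffers to float32,
# so nothing float64 is left when the model is moved to the device.
model = model.float()
device = "mps" if torch.backends.mps.is_available() else "cpu"
model = model.to(device)
```

Whether casting everything to float32 is numerically acceptable for training depends on the model; it is a workaround, not a fix in SchNetPack itself.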

Steps to reproduce:

  1. Install schnetpack manually as described in README.md
  2. Run export HYDRA_FULL_ERROR=1
  3. Run mkdir spk_workdir && cd spk_workdir
  4. Run spktrain experiment=qm9_atomwise
jnsLs commented 1 year ago

Hi @viniavila, have you tried using CPU only? Or, if that does not help, using float32 tensors only? Both options are far from optimal, but they would help for debugging.

Also, it seems like this issue might be a more general one and not necessarily related to schnetpack: https://github.com/facebookresearch/segment-anything/issues/94

Best, Jonas

viniavila commented 1 year ago

Hi @jnsLs!! How can I switch to CPU only? Is there a setting I can pass on the command line, or do I have to edit the code or a config file? Thanks for the response.

jnsLs commented 1 year ago

spktrain experiment=qm9_atomwise trainer.accelerator=cpu should do the job
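For context, the Hydra override `trainer.accelerator=cpu` ends up as the `accelerator` argument of PyTorch Lightning's `Trainer`. A hedged sketch (not SchNetPack's own config handling) of picking the accelerator programmatically, falling back to CPU when MPS is unavailable:

```python
import torch

# Choose MPS only when the backend is actually available; otherwise use CPU.
# (On MPS, all model tensors must be float32, as discussed above.)
accelerator = "mps" if torch.backends.mps.is_available() else "cpu"
print(accelerator)

# With pytorch_lightning installed, this would feed into the trainer:
# trainer = pytorch_lightning.Trainer(accelerator=accelerator, devices=1)
```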

viniavila commented 1 year ago

Thanks!