Closed GUANGZChen closed 3 weeks ago
I use CPU-only pytorch all the time, so the problem is not inherent to mace. Could it be that you saved a CUDA model instead of the CPU version (mace_run_train --save_cpu, I believe)?
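For reference, a training invocation with that flag might look like the sketch below. The file paths and model name are placeholders, not taken from this thread; only the --save_cpu flag is the point here.

```shell
# Hypothetical invocation: --name and --train_file values are placeholders.
# --save_cpu moves the final model to the CPU before saving, so it can be
# loaded later on a CPU-only machine.
mace_run_train \
    --name="MACE_model" \
    --train_file="train.xyz" \
    --save_cpu
```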
Thanks. I think it might be the problem since I could load mace_mp on my device but not the trained mace potential.
With a tiny bit of torch you can load the model and save it as a cpu model, but only on a GPU machine. There may be some script in mace that can do that already, but if not, I think it'd be a nearly trivial but useful addition.
Hi Bernstei, thank you for your reply. Could you please let me know which code I should use to convert the saved model to a CPU version?
Hey @GUANGZChen, You need to use --save_cpu in your input script.
@ilyes - would you be interested in a PR that adds a gpu_to_cpu script? So people don't have to get into the torch code, or rerun their fit (even if it's fast, because it's from a checkpoint)?
Yes that would be useful as we keep copying it to people. I really need to change the default in main, that's top of my list.
Do you want it branched from develop or main?
@ilyes319 The patch is ready - it's nearly trivial, maybe 10-15 lines of code, including all the argument parsing overhead. I just need to know what branch to create the PR relative to
Thanks! The develop branch please
Oops - just pushed into develop by mistake. Do you want me to revert and do a proper PR?
I think it is fine as it is rather standalone, thank you!
OK, I'll leave it alone. If it's useful to do floating-point precision conversions, that could be easy to add, but I think most of the code does its own conversion if needed now.
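For context, such a precision conversion is essentially a one-liner in torch. A minimal, self-contained sketch (the file names are placeholders, and a small nn.Linear stands in for a full MACE model saved with torch.save):

```python
import torch

# Hypothetical sketch: save a float64 model, reload it, and cast to float32.
# "model_f64.pt" / "model_f32.pt" are placeholder file names.
model = torch.nn.Linear(4, 4).double()   # stand-in for a float64 model
torch.save(model, "model_f64.pt")

# weights_only=False is needed on torch >= 2.6, where it defaults to True
# and refuses to unpickle full nn.Module objects.
model = torch.load("model_f64.pt", map_location="cpu", weights_only=False)
model = model.float()                    # cast all parameters/buffers to float32
torch.save(model, "model_f32.pt")

print(next(model.parameters()).dtype)    # torch.float32
```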
@GUANGZChen Please add --save_cpu to your input file for now. I will close that.
I am currently encountering the same problem and do not wish to retrain. I tried writing a Python script to move the model to the CPU, but without success, even though I am running the script on a CUDA-enabled machine. If you could guide me on how to save these models it would be great. Thanks!
```python
import torch

def load_and_convert_model(model_path, output_path):
    # map_location remaps the saved storages to CPU at load time
    cpu_data = torch.load(model_path, map_location=torch.device("cpu"))
    torch.save(cpu_data, output_path)
    print(f"Model successfully converted and saved to {output_path}")

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Convert a PyTorch model saved on GPU to CPU")
    parser.add_argument("input_model_path", type=str, help="Path to the GPU model file")
    parser.add_argument("output_model_path", type=str, help="Path to save the converted CPU model file")
    args = parser.parse_args()
    load_and_convert_model(args.input_model_path, args.output_model_path)
```
If you are on a CUDA machine you need to do:

```python
import torch

def load_and_convert_model(model_path, output_path):
    cuda_model = torch.load(model_path, map_location=torch.device("cuda"))
    cpu_model = cuda_model.cpu()  # move all parameters and buffers to CPU
    torch.save(cpu_model, output_path)
    print(f"Model successfully converted and saved to {output_path}")
```
If you know that you are going to use your model on CPU, please use the --save_cpu flag while training.
Description

I'm encountering an issue when using mace_mp and MACECalculator from the MACE library on a CPU-only machine. Running the model in a CPU-only environment raises a NotImplementedError related to the aten::empty_strided operation. It appears that the code is attempting to execute a CUDA-specific operation on a CPU, despite specifying device='cpu'.
```python
from pathlib import Path

from mace.calculators import MACECalculator, mace_mp

# Define the model path
model = Path("./potential/MACE_model_swa.model").expanduser()

# Initialize mace_mp with CPU-only setup
calculator = mace_mp(model=model, device="cpu")

# Alternatively, using MACECalculator
calculator = MACECalculator(model_paths=['./potential/MACE_model_swa.model'], device='cpu')
```

Steps Taken to Resolve

- Installed the CPU-only version of PyTorch (pip install torch --index-url https://download.pytorch.org/whl/cpu).
- Verified that device='cpu' was explicitly set in both mace_mp and MACECalculator.
- Removed all CUDA-related environment variables to prevent CUDA library loading (unset CUDA_HOME, unset CUDA_PATH, unset LD_LIBRARY_PATH if it included CUDA paths).
- Tested basic CPU-only tensor operations in PyTorch, which work correctly outside of MACE.

Expected Behavior

With device='cpu' set, mace_mp and MACECalculator should avoid all CUDA dependencies and run exclusively on the CPU.
Actual Behavior

```
NotImplementedError: Could not run 'aten::empty_strided' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty_strided' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
```
Environment

- Python version: 3.9
- PyTorch version: CPU-only installation
Additional Context
I believe this issue may be due to a hardcoded CUDA dependency within mace_mp or MACECalculator. Is there a way to enforce CPU-only execution, or could there be a potential fix to handle CPU-only environments more gracefully?
Thank you for your assistance!