Project-MONAI / model-zoo

MONAI Model Zoo that hosts models in the MONAI Bundle format.
Apache License 2.0

Model weights should be stored to CPU for all bundles #518

Open ericspod opened 9 months ago

ericspod commented 9 months ago

To avoid issues when running bundles in CPU mode, such as the one described below, all bundle weights should be stored on the CPU. The alternative is to ensure that any CheckpointLoader objects used have map_location set to something usable when CUDA is not present.
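A minimal sketch of both options, assuming a checkpoint saved from CUDA tensors; the paths and the placeholder network are hypothetical:

import torch
from monai.handlers import CheckpointLoader

# Option 1: re-save the checkpoint with CPU-mapped tensors so the bundle
# itself ships weights that deserialize on any host
state = torch.load("models/model.pt", map_location="cpu")
torch.save(state, "models/model_cpu.pt")

# Option 2: give the CheckpointLoader an explicit map_location so the
# load works on CPU-only hosts; the network here is only a stand-in
network = torch.nn.Linear(1, 1)  # placeholder for the bundle's real network
loader = CheckpointLoader(
    load_path="models/model.pt",
    load_dict={"model": network},
    map_location="cpu",
)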

For bundles that support CPU-only operation, some way of testing them without CUDA present would also be useful.
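One way to approximate such a test, as a sketch: hide all GPUs from PyTorch before it initializes CUDA, then assert that CUDA is unavailable while exercising the bundle.

import os

# hide all GPUs before torch initializes CUDA, emulating a CPU-only machine
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

assert not torch.cuda.is_available()
# ... load and run the bundle here; any hidden CUDA dependency will now fail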

Discussed in https://github.com/Project-MONAI/model-zoo/discussions/516

Originally posted by **mpsampat** October 10, 2023

Hello! I am trying to run the inference.json for the wholeBody_ct_segmentation bundle. The inference.json file is here: https://github.com/Project-MONAI/model-zoo/blob/dev/models/wholeBody_ct_segmentation/configs/inference.json

When I run on a CPU with 128 GB of memory I get this error:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

I tried to change line 15 of the inference.json file (https://github.com/Project-MONAI/model-zoo/blob/dev/models/wholeBody_ct_segmentation/configs/inference.json#L15) from

"device": "$torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')",

to

"device": "cpu",

but I still get the same error as above.

1. Is it possible to run MONAI bundle inference on a CPU?
2. If yes, could you tell me what I am doing incorrectly?

Thanks,
Mehul
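For context on why changing the "device" entry alone does not help: the error comes from torch.load inside the bundle's checkpoint loading, which tries to restore CUDA-tagged storages before the "device" setting is ever applied. A minimal reproduction (the checkpoint path is hypothetical):

import torch

# a checkpoint saved from CUDA tensors fails to deserialize on a CPU-only
# machine unless map_location redirects its storages
state = torch.load("model.pt")                      # raises the RuntimeError above
state = torch.load("model.pt", map_location="cpu")  # loads cleanly without CUDA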
mpsampat commented 8 months ago

Thank you @ericspod for creating this post! cc: @dbericat

ArthurRomansini commented 8 months ago

I've solved this issue by adding map_location=torch.device('cpu') when loading the model state. I think that using

"device": "$torch.device('cpu')",

in your inference.json may also work.

import torch
from monai.bundle import ConfigParser

# parse the bundle's inference config
configPath = "./models/yourmodelhere/configs/inference.yaml"
config = ConfigParser()
config.read_config(configPath)

# instantiate the network defined in the config, then load the
# checkpoint with its tensors mapped onto the CPU
modelPath = "./models/yourmodelhere/models/model.pt"  # example path
model = config.get_parsed_content("network")
model.load_state_dict(torch.load(modelPath, map_location=torch.device('cpu')))
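As a quick sanity check after loading (a sketch):

# confirm every parameter now lives on the CPU
assert all(p.device.type == "cpu" for p in model.parameters())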