facebookresearch / fastMRI

A large-scale dataset of both raw MRI measurements and clinical MRI images.
https://fastmri.org
MIT License

Running small dataset on CPU #40

Closed asaksena98 closed 4 years ago

asaksena98 commented 4 years ago

Hello, I am running into some issues when I try to run the train_unet script on a small dataset in my local environment, which has no GPU.

When I run the script with the --gpus argument set to 0, I get the following error message:

    File "/Users/abhinavsaksena/mri/fastMRI/env/lib/python3.7/site-packages/torch/utils/tensorboard/summary.py", line 156, in hparams
      raise ValueError('value should be one of int, float, str, bool, or torch.Tensor')
    ValueError: value should be one of int, float, str, bool, or torch.Tensor

If I change the --gpus argument to 1, I get the following message:

    File "/Users/abhinavsaksena/mri/fastMRI/env/lib/python3.7/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 533, in sanitize_gpu_ids
      raise MisconfigurationException(message)
    pytorch_lightning.utilities.debugging.MisconfigurationException: You requested GPUs: [0] But your machine only has: []

How might I go about fixing these issues and running the code on my CPU? Here are some similar issues I found online for reference:

https://github.com/PyTorchLightning/pytorch-lightning/pull/609
https://github.com/PyTorchLightning/pytorch-lightning/issues/899

anuroopsriram commented 4 years ago

Hi: You would have to remove this line to run on the CPU: https://github.com/facebookresearch/fastMRI/blob/master/models/unet/train_unet.py#L186

You can also change the line to the following and then use --gpus 0:

    gpus=(args.gpus if args.gpus > 0 else None)
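As a rough illustration of the second suggestion, the sketch below maps a --gpus count of 0 to None before it would be handed to the trainer. The helper name resolve_gpus and the standalone parser are assumptions made for this example; they are not taken from train_unet.py, and the trainer call itself is left out.

```python
import argparse


def resolve_gpus(requested):
    """Return the GPU count unchanged, or None when no GPU was requested.

    Hypothetical helper: it wraps the one-line expression suggested above.
    """
    return requested if requested > 0 else None


# Stand-in parser for this example only; the real script defines more options.
parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=0)

# Pass the arguments explicitly so the snippet runs without a command line.
args = parser.parse_args(["--gpus", "0"])
trainer_gpus = resolve_gpus(args.gpus)
print(trainer_gpus)  # None
```

With --gpus 1 the helper would return 1 unchanged, so a machine without a GPU would presumably still hit the MisconfigurationException shown in the original report.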

asaksena98 commented 4 years ago

Hello, thank you for your response. I had already tried the first suggestion, but it didn't work. The second one doesn't seem to work either. I believe it might be an issue with the version of pytorch-lightning being used (0.6.0).

mmuckley commented 4 years ago

This is confusing. argparse should cast it to an int, so there should be no problem. I just ran it on my system with the first option and saw no error related to tensorboard. Let us know if you are able to fix this.
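One way to narrow this down might be to list which parsed arguments fall outside the types named in the ValueError. The sketch below is only a diagnostic under assumptions: the function find_unloggable and the example Namespace are made up for illustration, torch.Tensor is left out of the allowed types so the snippet runs without PyTorch, and nothing in this thread confirms which argument caused the error.

```python
import argparse

# Types named in the ValueError message, minus torch.Tensor (omitted so that
# this snippet has no PyTorch dependency).
ALLOWED = (int, float, str, bool)


def find_unloggable(hparams):
    """Return {name: type name} for every value outside ALLOWED.

    Hypothetical helper for debugging; not part of fastMRI or pytorch-lightning.
    """
    return {
        name: type(value).__name__
        for name, value in vars(hparams).items()
        if not isinstance(value, ALLOWED)
    }


# Made-up arguments for illustration only.
example = argparse.Namespace(gpus=None, lr=0.001, exp="unet")
print(find_unloggable(example))  # {'gpus': 'NoneType'}
```

Running the same check on the real parsed arguments just before training starts would show whether any value has a type the hparams logger is likely to reject.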

asaksena98 commented 4 years ago

I updated the pytorch-lightning version to 0.7.6 and that seemed to fix it.
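For anyone checking their own environment, a minimal sketch of a version comparison follows. It assumes Python 3.8 or newer for importlib.metadata, and it treats 0.7.6 only as the version that worked here, not as a confirmed minimum. The helper version_tuple is hypothetical.

```python
from importlib import metadata  # available from Python 3.8
from itertools import takewhile


def version_tuple(text):
    """Turn '0.7.6' into (0, 7, 6), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


# Version reported as working in this thread; an assumption, not a documented minimum.
KNOWN_GOOD = version_tuple("0.7.6")

try:
    installed = metadata.version("pytorch-lightning")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("pytorch-lightning is not installed")
elif version_tuple(installed) < KNOWN_GOOD:
    print("installed version", installed, "is older than 0.7.6")
else:
    print("installed version", installed, "is 0.7.6 or newer")
```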

mmuckley commented 4 years ago

Closing this as the problem is fixed.