facebookresearch / fastMRI

A large-scale dataset of both raw MRI measurements and clinical MRI images.
https://fastmri.org
MIT License
1.28k stars 371 forks source link

Potential PyTorch Lightning dependency incompatibilities #251

Closed mmuckley closed 1 year ago

mmuckley commented 2 years ago

Some users have reported that the fastMRI from PyPI will install a newer version of PyTorch Lightning than is listed in requirements.txt. This version of PyTorch Lightning seems to have some bugs that throw exceptions when training for DDP. Resolution will likely require testing some new versions of PyTorch Lightning and potentially tweaking the modules.

If anyone has PyTorch Lightning dependency issues, it would help to share them here along with the versions of PTL, fastMRI and PyTorch that you're using so that we can get a better picture of what's going on.

hansenms commented 2 years ago

Actually the problem is the version from PyPI pulls in an older version of pytorch lightning. It is this commit https://github.com/facebookresearch/fastMRI/tree/c2432ff892b1af2e0520727c6913db0f39e23f8c that corresponds to fastmri==0.1.1 and that version will only work with older versions of PyTorch (I think 1.8 or older). I think the main issue is really the instructions, it says install PyTorch according to your platform and points to the PyTorch get started site. Those instructions will probably get you PyTorch 1.10 or 1.11, which will not work if you subsequently do pip install fastmri. It will work if you install from source pip install -e . since the requirements are a higher version of pytorch lightning.

There is an additional issue, if you do go with the pypi package and an older version of pytorch, you probably end up trying to train with a container image like pytorch/pytorch:1.8.0-cuda11.1-cudnn8-devel. I have not dug in, but you will get an error when trying to import torchtext in that image, so that package either has to be removed or you need a later image (which then exposes you to the problem above).

So I don't think there are major issues, but the instructions on the README are likely to lead you to a state where you can't train.

I think a simple fix might be to release a new pypi package (0.1.2 or some such thing) from the latest version of the code.

mmuckley commented 2 years ago

Very nice find @hansenms. I think we can probably do a new release given the merge of PR #205.

mmuckley commented 2 years ago

Okay @hansenms I just did a new release of the package. Installation should work better now. Let me know if you encounter any other issues.

mmuckley commented 1 year ago

Closing as the issue seems to be resolved.