Closed mmuckley closed 1 year ago
Actually the problem is the version from PyPI pulls in an older version of pytorch lightning. It is this commit https://github.com/facebookresearch/fastMRI/tree/c2432ff892b1af2e0520727c6913db0f39e23f8c that corresponds to fastmri==0.1.1
and that version will only work with older versions of PyTorch (I think 1.8 or older). I think the main issue is really the instructions, it says install PyTorch according to your platform and points to the PyTorch get started site. Those instructions will probably get you PyTorch 1.10 or 1.11, which will not work if you subsequently do pip install fastmri
. It will work if you install from source pip install -e .
since the requirements are a higher version of pytorch lightning.
There is an additional issue, if you do go with the pypi package and an older version of pytorch, you probably end up trying to train with a container image like pytorch/pytorch:1.8.0-cuda11.1-cudnn8-devel
. I have not dug in, but you will get an error when trying to import torchtext in that image, so that package either has to be removed or you need a later image (which then exposes you to the problem above).
So I don't think there are major issues, but the instructions on the README are likely to lead you to a state where you can't train.
I think a simple fix might be to release a new pypi package (0.1.2 or some such thing) from the latest version of the code.
Very nice find @hansenms. I think we can probably do a new release given the merge of PR #205.
Okay @hansenms I just did a new release of the package. Installation should work better now. Let me know if you encounter any other issues.
Closing as the issue seems to be resolved.
Some users have reported that the fastMRI from PyPI will install a newer version of PyTorch Lightning than is listed in requirements.txt. This version of PyTorch Lightning seems to have some bugs that throw exceptions when training for DDP. Resolution will likely require testing some new versions of PyTorch Lightning and potentially tweaking the modules.
If anyone has PyTorch Lightning dependency issues, it would help to share them here along with the versions of PTL, fastMRI and PyTorch that you're using so that we can get a better picture of what's going on.