bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev

Error with PyTorch 2.3.0: Missing '_refresh_per_optimizer_state' in 'torch.cuda.amp.grad_scaler' #576

Closed: Priyanshupareek closed this issue 2 months ago

Priyanshupareek commented 2 months ago

Problem Description

After the recent update to PyTorch 2.3.0, Petals fails with an import error when importing from the torch.cuda.amp.grad_scaler module. The specific error message is: ImportError: cannot import name '_refresh_per_optimizer_state' from 'torch.cuda.amp.grad_scaler' (site-packages/torch/cuda/amp/grad_scaler.py)

The error comes from a change in PyTorch 2.3.0: the GradScaler internals were refactored into torch.amp.grad_scaler, so the private helper _refresh_per_optimizer_state is no longer importable from the torch.cuda.amp.grad_scaler path that the current codebase relies on.
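
A minimal sketch of a version-agnostic import, assuming the helper now lives in torch.amp.grad_scaler on PyTorch 2.3.0+ (this is my reading of the refactor, not a confirmed fix):

```python
# Hypothetical compatibility shim: try the old import path first (PyTorch <= 2.2.x)
# and fall back to the new location of the GradScaler internals (PyTorch >= 2.3.0).
try:
    from torch.cuda.amp.grad_scaler import _refresh_per_optimizer_state
except ImportError:
    from torch.amp.grad_scaler import _refresh_per_optimizer_state
```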

Quick Workaround

This issue can be worked around by pinning the PyTorch version in setup.cfg: change 'torch>=1.12' to 'torch==2.2.2', the last version known to work without this problem. That keeps installs stable while I investigate the changes in the new PyTorch release and make the codebase compatible with PyTorch 2.3.0 or later.
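
For reference, the pinned requirement in setup.cfg would look roughly like this (only the torch entry is shown; the surrounding dependencies stay as they are):

```ini
# setup.cfg (excerpt): pin torch to the last known-good release instead of a floor
[options]
install_requires =
    torch==2.2.2
```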

Steps to Reproduce

  1. Install Petals with pip as suggested in the README.
  2. Start the server (see the command sketch below).
  3. Observe the import error.
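
The reproduction roughly looks like this; the model name is just the example from the project README, and any supported model should hit the same import path:

```sh
# Install Petals from PyPI; this pulls in the latest PyTorch (currently 2.3.0)
pip install petals

# Start a server; the ImportError is raised during startup
python -m petals.cli.run_server petals-team/StableBeluga2
```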


Action

I will submit a pull request modifying the install_requires entry in setup.cfg as described above, pending team feedback on this issue.