If SWA makes no progress for whatever reason, the run ends with this error:
/home/Software/python/system/torch/2.0.1/gpu/lib64/python3.9/site-packages/torch/jit/_check.py:172: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in `__init__`. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in `torch.jit.Attribute`.
  warnings.warn("The TorchScript type system doesn't support "
Traceback (most recent call last):
  File "/home/cluster2/bernstei/.local/bin/mace_run_train", line 8, in <module>
    sys.exit(main())
  File "/home/cluster2/bernstei/src/work/MACE/mace_github/mace/cli/run_train.py", line 594, in main
    epoch = checkpoint_handler.load_latest(
  File "/home/cluster2/bernstei/src/work/MACE/mace_github/mace/tools/checkpoint.py", line 210, in load_latest
    result = self.io.load_latest(swa=swa, device=device)
  File "/home/cluster2/bernstei/src/work/MACE/mace_github/mace/tools/checkpoint.py", line 171, in load_latest
    path = self._get_latest_checkpoint_path(swa=swa)
  File "/home/cluster2/bernstei/src/work/MACE/mace_github/mace/tools/checkpoint.py", line 152, in _get_latest_checkpoint_path
    return latest_checkpoint_info.path
UnboundLocalError: local variable 'latest_checkpoint_info' referenced before assignment
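The run got to the end, but it seems to crash afterwards while handling the best regular and SWA checkpoints? Judging by the traceback, the failure is in loading the latest checkpoint (load_latest), not in writing one.
I'm guessing that latest_checkpoint_info is never assigned because SWA never made progress, so no checkpoints were written during the SWA phase and _get_latest_checkpoint_path had nothing to select.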
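
To illustrate the guess above, here is a minimal sketch of how this kind of UnboundLocalError can arise and one way to guard against it. It is not the actual MACE code: CheckpointInfo, latest_path_unguarded, latest_path_guarded, and the file names are all hypothetical, and I'm assuming the real function only assigns latest_checkpoint_info when at least one matching checkpoint exists.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckpointInfo:
    # Hypothetical stand-in for the per-checkpoint record; not MACE's class.
    path: str
    epochs: int
    swa: bool


def latest_path_unguarded(infos: List[CheckpointInfo], swa: bool) -> str:
    # Assumed failure pattern: the local name is bound only inside the branch,
    # so with no matching checkpoint the return raises UnboundLocalError.
    selected = [info for info in infos if info.swa == swa]
    if selected:
        latest_checkpoint_info = max(selected, key=lambda info: info.epochs)
    return latest_checkpoint_info.path


def latest_path_guarded(infos: List[CheckpointInfo], swa: bool) -> Optional[str]:
    # One possible guard: report "nothing found" explicitly and let the caller
    # decide what to do (for example, fall back to the non-SWA checkpoint).
    selected = [info for info in infos if info.swa == swa]
    if not selected:
        return None
    return max(selected, key=lambda info: info.epochs).path


if __name__ == "__main__":
    # Only regular checkpoints exist, as if the SWA phase never wrote one.
    infos = [
        CheckpointInfo("run_epoch-5.pt", 5, False),
        CheckpointInfo("run_epoch-10.pt", 10, False),
    ]
    print(latest_path_guarded(infos, swa=False))
    print(latest_path_guarded(infos, swa=True))
    try:
        latest_path_unguarded(infos, swa=True)
    except UnboundLocalError as err:
        print("unguarded version failed:", err)
```

If the real cause matches this pattern, returning None (or raising a clearer error that says no SWA checkpoint was found) would at least make the failure easier to diagnose.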