The old implicit auto-resume now fails when no checkpoint exists.
This is caused by a behavior change in PyTorch Lightning (PL).
Error
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/rnnms/main_train.py", line 37, in <module>
    main_train()
  File "/usr/local/lib/python3.7/dist-packages/rnnms/main_train.py", line 33, in main_train
    train(args_scpt, datamodule)
  File "/usr/local/lib/python3.7/dist-packages/rnnms/train.py", line 43, in train
    trainer.fit(model, datamodule=datamodule)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 550, in fit
    self.checkpoint_connector.resume_start()
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 68, in resume_start
    raise FileNotFoundError(f"Checkpoint at {checkpoint_path} not found. Aborting training.")
FileNotFoundError: Checkpoint at gdrive/MyDrive/ML_results/rnnms/2021/version_1/checkpoints/last.ckpt not found. Aborting training.
Expected cause
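Training tries to resume from a checkpoint and fails because the checkpoint file is missing.
PL changed its behavior: as the traceback shows, `resume_start` now raises `FileNotFoundError` when the checkpoint path does not exist.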
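One possible workaround, sketched here as an assumption rather than the fix adopted in this repository, is to check for the checkpoint file before asking PL to resume. The helper name `resolve_resume_path` and the placeholder path are hypothetical; the commented-out line assumes the `resume_from_checkpoint` argument of `Trainer` available in the PL version in use.

```python
import os
from typing import Optional


def resolve_resume_path(ckpt_path: str) -> Optional[str]:
    """Return ckpt_path if the file exists, otherwise None.

    Hypothetical helper: None is meant to signal "start training from scratch".
    """
    return ckpt_path if os.path.isfile(ckpt_path) else None


# Placeholder path for illustration only.
resume_path = resolve_resume_path("checkpoints/last.ckpt")

# Assumed usage with PL (not executed here):
# trainer = pl.Trainer(resume_from_checkpoint=resume_path)
```

With this guard, a first run without `last.ckpt` passes `None` and trains from scratch, while later runs pick up the existing checkpoint.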