tarepan / UniversalVocoding

A PyTorch implementation of "Robust Universal Neural Vocoding"
https://tarepan.github.io/UniversalVocoding
MIT License

Auto-resume failed when no checkpoint #5

Closed tarepan closed 3 years ago

tarepan commented 3 years ago

Summary

The old implicit auto-resume fails when no checkpoint exists yet.
This is due to a behavior change in PyTorch Lightning (PL).

Error

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/rnnms/main_train.py", line 37, in <module>
    main_train()
  File "/usr/local/lib/python3.7/dist-packages/rnnms/main_train.py", line 33, in main_train
    train(args_scpt, datamodule)
  File "/usr/local/lib/python3.7/dist-packages/rnnms/train.py", line 43, in train
    trainer.fit(model, datamodule=datamodule)
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 550, in fit
    self.checkpoint_connector.resume_start()
  File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/connectors/checkpoint_connector.py", line 68, in resume_start
    raise FileNotFoundError(f"Checkpoint at {checkpoint_path} not found. Aborting training.")
FileNotFoundError: Checkpoint at gdrive/MyDrive/ML_results/rnnms/2021/version_1/checkpoints/last.ckpt not found. Aborting training.

Expected cause

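Training tries to resume from last.ckpt, but fails because that checkpoint does not exist.
PL changed its behavior: it now raises FileNotFoundError when the resume checkpoint is missing.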
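
One possible workaround, sketched here under assumptions rather than taken from the actual fix, is to check for the checkpoint file before asking PL to resume, and to fall back to a fresh run when it is missing. The helper name `resolve_resume_path` is hypothetical and not part of rnnms or PL; it only uses the standard library and assumes the checkpoint lives on a locally mounted path.

```python
import os
from typing import Optional


def resolve_resume_path(ckpt_path: str) -> Optional[str]:
    """Return ckpt_path if the checkpoint file exists, otherwise None.

    Hypothetical helper (not from rnnms or PL): None is meant to signal
    "start training from scratch" instead of resuming.
    """
    return ckpt_path if os.path.isfile(ckpt_path) else None


# Assumed usage with PL releases of that era, where Trainer accepted
# a `resume_from_checkpoint` argument (None meant no resume):
#
#   ckpt = resolve_resume_path(".../checkpoints/last.ckpt")
#   trainer = pl.Trainer(resume_from_checkpoint=ckpt)
```

With this guard, the first run (no last.ckpt yet) starts from scratch and later runs resume, which approximates the old implicit auto-resume behavior.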

tarepan commented 3 years ago

Fixed.