I got this error when the model saver tried to clean the checkpoint directory:
Traceback (most recent call last):
  File "/usr/local/bin/eole", line 33, in <module>
    sys.exit(load_entry_point('EOLE', 'console_scripts', 'eole')())
  File "wokdir/eole/eole/bin/main.py", line 39, in main
    bin_cls.run(args)
  File "/wokdir/eole/eole/bin/run/train.py", line 68, in run
    train(config)
  File "/wokdir//eole/eole/bin/run/train.py", line 55, in train
    train_process(config, device_id=0)
  File "/wokdir//eole/eole/train_single.py", line 248, in main
    trainer.train(
  File "/wokdir/eole/eole/trainer.py", line 363, in train
    self.model_saver.save(step, moving_average=self.moving_average)
  File "/wokdir//eole/eole/models/model_saver.py", line 319, in save
    self._save(step)
  File "/wokdir/eole/eole/models/model_saver.py", line 298, in _save
    self.cleanup()
  File "/wokdir//eole/eole/models/model_saver.py", line 135, in cleanup
    shutil.rmtree(step_dir_to_delete)
  File "/usr/lib/python3.10/shutil.py", line 715, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.10/shutil.py", line 713, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: 'step_500'
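
I have not confirmed the cause, but the path in the error is the bare relative name 'step_500', so the directory may be looked up relative to the current working directory instead of the checkpoint directory. Here is a minimal sketch of a cleanup that would tolerate this, assuming the step directories live under a single checkpoint directory. `safe_cleanup`, `base_dir` and `step_dir_name` are hypothetical names for illustration, not EOLE's actual code.

```python
import os
import shutil
import tempfile


def safe_cleanup(base_dir, step_dir_name):
    """Delete base_dir/step_dir_name, tolerating a directory that is already gone.

    Hypothetical helper, not EOLE's actual cleanup(). Returns True if a
    directory was removed, False if there was nothing to remove.
    """
    # Resolve against the checkpoint directory rather than the current
    # working directory (the traceback shows a bare relative 'step_500').
    path = os.path.join(base_dir, step_dir_name)
    try:
        shutil.rmtree(path)
    except FileNotFoundError:
        # Nothing to delete; do not abort training over a missing checkpoint.
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as ckpt_dir:
        os.makedirs(os.path.join(ckpt_dir, "step_500"))
        print(safe_cleanup(ckpt_dir, "step_500"))  # directory existed
        print(safe_cleanup(ckpt_dir, "step_500"))  # already gone, no exception
```

Catching only `FileNotFoundError` (rather than passing `ignore_errors=True` to `shutil.rmtree`) keeps other failures, such as permission errors, visible.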