Closed tztechno closed 11 months ago
Most likely the file is not present at the given location.
Another possibility: if your xxxxx corresponds to a specific experiment, its path may contain a timestamp sub-folder. If you trained the model with the trainer, the right way to get the checkpoint path is:
checkpoint_path=os.path.join(trainer.checkpoints_dir_path, "average_model.pth")
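If the trainer object is no longer available (for example in a fresh session), one fallback is to search the experiment directory for the checkpoint file, since the timestamp sub-folder name is not known in advance. The following is a minimal sketch using only the standard library; `find_checkpoint` is a hypothetical helper, not part of super-gradients, and it assumes the most recently modified match is the one you want.

```python
from pathlib import Path
from typing import Optional


def find_checkpoint(experiment_dir: str, filename: str = "average_model.pth") -> Optional[Path]:
    """Return the newest `filename` found anywhere under `experiment_dir`, or None.

    Hypothetical helper: it only walks the file system and does not rely on
    any super-gradients API.
    """
    root = Path(experiment_dir)
    if not root.is_dir():
        return None
    # rglob searches all sub-folders (e.g. a timestamped run folder)
    # as well as experiment_dir itself.
    candidates = [p for p in root.rglob(filename) if p.is_file()]
    if not candidates:
        return None
    # Assumption: the most recently modified file belongs to the latest run.
    return max(candidates, key=lambda p: p.stat().st_mtime).resolve()
```

For example, `find_checkpoint("/kaggle/working/checkpoints/xxxxx")` would return an absolute path to `average_model.pth` inside whichever sub-folder was written last, or `None` if no such file exists.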
Thank you. As you pointed out, a timestamp sub-folder was being generated; previously there was none. Since the timestamp is not a fixed value, using trainer.checkpoints_dir_path is essential, and it would have been difficult to find on my own. Now I can run custom training for YOLO-NAS with "mixed_precision": True set, even on the CPU. Thank you again.
Hello, I came across this thread by chance. I am following the documentation at "https://docs.deci.ai/super-gradients/latest/documentation/source/Example_Classification.html#5-training-checkpointing-and-transfer-learning" and had the same error. It would be worth correcting the following paragraph. Thank you very much for your work. Have a good weekend. Greetings from Munich ;)
import os
model = models.get(model_name=Models.RESNET18, num_classes=10, checkpoint_path=os.path.join(CHECKPOINT_DIR, experiment_name, 'ckpt_latest.pth'))
training_params["resume"] = True
training_params["max_epochs"] = 25
trainer.train(model=model, training_params=training_params, train_loader=train_dataloader, valid_loader=valid_dataloader)
💡 Your Question
While training the model, a FileNotFoundError occurs even though the checkpoint path is correct. Is there any way to avoid this error?
FileNotFoundError: Incorrect Checkpoint path: /kaggle/working/checkpoints/xxxxx/average_model.pth (This should be an absolute path)
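One way to narrow down an error like this is to check what actually exists on disk near the reported path. The sketch below uses only the standard library; `diagnose_path` is a hypothetical helper, not a super-gradients function. It resolves the path to an absolute one, reports whether the file exists, and lists the contents of the nearest existing parent directory, which would reveal any unexpected sub-folder.

```python
from pathlib import Path


def diagnose_path(path: str) -> dict:
    """Report whether `path` exists and what sits in its nearest existing parent.

    Hypothetical diagnostic helper; it only inspects the file system.
    """
    target = Path(path).expanduser().resolve()
    # Walk upwards until we reach a directory that exists on disk.
    ancestor = target.parent
    while not ancestor.is_dir() and ancestor != ancestor.parent:
        ancestor = ancestor.parent
    entries = sorted(p.name for p in ancestor.iterdir()) if ancestor.is_dir() else []
    return {
        "absolute_path": str(target),
        "exists": target.is_file(),
        "nearest_existing_dir": str(ancestor),
        "entries": entries,
    }
```

Calling `diagnose_path("/kaggle/working/checkpoints/xxxxx/average_model.pth")` and printing the result shows whether the file is really there and which files or folders the experiment directory contains instead.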
Versions
No response