Closed WryingY closed 2 months ago
Hey @WryingY ! Thanks for your feedback =)
Regarding your question:
Each folder is saved after every epoch and contains a set of files, structured as follows:
- optimizer.bin: the state of torch's optimizer
- randomstates*: the associated random states of the process
- scheduler.bin: the LR scheduler's state
- vgg19v4-64p-tiny-imagenet: the model's state dict
So if you need to load the trained model, it could be done simply with this code snippet:
model = ... # Build the model here
state_dict = torch.load(path_to_vgg19v4_64p_tiny_imagenet_file, map_location=torch.device('cpu'))
model.load_state_dict(state_dict)
If you need to resume training from the full state, you'll need to add manual loading of the optimizer/scheduler/random-state files to the trainer.py file, right after the optimizer, scheduler, and dataloaders are created.
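A minimal sketch of what that manual loading could look like, assuming the file names from the checkpoint listing above (optimizer.bin, scheduler.bin, and the random-state file, which is modeled here as a single random_states.bin; the actual file names and the model/optimizer/scheduler below are stand-ins, not the repository's real training setup):

```python
import os
import random
import tempfile

import numpy as np
import torch

# Stand-ins for the objects trainer.py builds before the training loop.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# Stand-in for an existing checkpoint folder; here we first write the
# files ourselves so the sketch is self-contained and runnable.
ckpt_dir = tempfile.mkdtemp()
torch.save(optimizer.state_dict(), os.path.join(ckpt_dir, "optimizer.bin"))
torch.save(scheduler.state_dict(), os.path.join(ckpt_dir, "scheduler.bin"))
torch.save(
    {
        "python": random.getstate(),
        "numpy": np.random.get_state(),
        "torch": torch.get_rng_state(),
    },
    os.path.join(ckpt_dir, "random_states.bin"),
)

# --- the resuming part: run this right after creating optimizer,
# --- scheduler and dataloaders.
optimizer.load_state_dict(
    torch.load(os.path.join(ckpt_dir, "optimizer.bin"), map_location="cpu")
)
scheduler.load_state_dict(torch.load(os.path.join(ckpt_dir, "scheduler.bin")))

# weights_only=False because the random-state dict contains non-tensor
# objects (plain Python and numpy state), which strict loading rejects.
rng = torch.load(os.path.join(ckpt_dir, "random_states.bin"), weights_only=False)
random.setstate(rng["python"])
np.random.set_state(rng["numpy"])
torch.set_rng_state(rng["torch"])
```

Restoring the random states is what makes the resumed run reproduce the data-shuffling and augmentation sequence it would have seen without the interruption.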
Please let me know if you need more help or clarification on this topic.
Thanks again for your speedy reply! Now I can resume my training easily with your solution. Looking forward to your fantastic work on Conv KAN in the future!
Thank you for putting forward this great project! When training models with your framework, I got some satisfying results (checkpoint folders like Fig. 1) and would like to fine-tune those models by loading weights from the checkpoint folders. However, it seems that trainer.py doesn't include a part for loading a pretrained model, and the weight files are confusing too (I'm not sure whether the file in Fig. 2 with no suffix can be loaded). It would be much appreciated if you could give some insight into how the checkpoint weights are organized and how to load them in your trainer.py code :) Best Regards.