libffcv / ffcv-imagenet

Train ImageNet *fast* in 500 lines of code with FFCV
Apache License 2.0

PR for resuming training from a checkpoint and validating a model loaded from disk #6

Open aniketrege opened 2 years ago

aniketrege commented 2 years ago

Hello! I have a modified FFCV branch that I have been using to resume training from a checkpoint (to recover from mid-training disconnects or random failures), as well as to validate the final_weights.pt loaded from disk (e.g., if we wish to evaluate it on another val.ffcv file).

I would gladly open a PR for review if you'd like this feature to be added.
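
For illustration only, here is a minimal sketch of the two pieces described above, written against plain PyTorch. It is not the code from the branch, and it does not use any ffcv-imagenet internals: the helper names (`save_checkpoint`, `load_checkpoint`, `validate`) and the checkpoint dictionary keys are hypothetical, and `loader` is assumed to be any iterable that yields `(inputs, labels)` batches.

```python
import os

import torch


def save_checkpoint(path, model, optimizer, epoch):
    """Save what is needed to resume: weights, optimizer state, last finished epoch."""
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "epoch": epoch,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the epoch to resume from."""
    if not os.path.exists(path):
        return 0  # no checkpoint yet, so training starts from scratch
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1


@torch.no_grad()
def validate(model, loader):
    """Top-1 accuracy over an iterable of (inputs, labels) batches."""
    model.eval()
    correct = total = 0
    for inputs, labels in loader:
        predictions = model(inputs).argmax(dim=1)
        correct += (predictions == labels).sum().item()
        total += labels.numel()
    return correct / max(total, 1)
```

In a training script, `load_checkpoint` would be called once before the epoch loop and `save_checkpoint` at the end of each epoch. Validating weights from disk would then amount to loading them into a freshly built model (assuming the file holds a `state_dict`) and passing a loader built from the desired validation file to `validate`.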