Open antimora opened 8 months ago
I am inviting @laggui, @ashdtu, @nathanielsimard, @louisfd, @Luni-4, and others to share their input.
Funny you mention that, I was just working on adding automatic loading of pre-trained weights to the ResNet models 😄 So great timing!
Since I haven't pushed any of my changes yet (PR should come soon), I'll summarize the way I am currently approaching this.
By default, the models support `no_std`, and I've added a `pretrained` feature flag that requires `std` and adds optional dependencies such as the `burn-import` crate to use the `PyTorchFileRecorder` and `burn/network` (new since this PR) to use the `download_file_as_bytes` function with a download progress bar.
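As a rough illustration of the feature gating described above, here is a minimal sketch that assumes a Cargo feature literally named `pretrained`. The function and module names (`build_description`, `pretrained::requires_std`) are hypothetical and are not taken from the actual ResNet crate.

```rust
/// Reports which flavour of the crate was compiled.
/// Hypothetical helper: only the `pretrained` feature name comes from the discussion.
pub fn build_description() -> &'static str {
    if cfg!(feature = "pretrained") {
        "std build with pretrained weight loading"
    } else {
        "core build (no_std compatible), no pretrained weights"
    }
}

/// Items in this module only exist when the `pretrained` feature is enabled,
/// so builds without the feature never pull in the download or import dependencies.
#[cfg(feature = "pretrained")]
pub mod pretrained {
    /// Placeholder for the download and load logic (assumption, not the real API).
    pub fn requires_std() -> bool {
        true
    }
}
```

With this layout, code that needs pretrained weights opts in through the feature, while everything else keeps compiling without it.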
Regarding your specific points:
- `~/.cache` directory under the model name (e.g., `~/.cache/resnet-burn`).
- `resnet*_pretrained` methods that do exactly as you described: download the `.pth` checkpoint and use the `PyTorchFileRecorder` to load them.

Something that I would also like to see is exporting models without a specified backend, so users can choose the backend.
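The cache directory and download-on-first-use behaviour mentioned above could look roughly like the following. This is a std-only sketch under stated assumptions: `cache_path`, `load_or_download`, and `home_dir` are hypothetical names, and the `download` closure merely stands in for a function such as `download_file_as_bytes`. Loading the bytes with `PyTorchFileRecorder` is left out.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds `<home>/.cache/<model_name>/<file_name>`.
/// `home` is passed in explicitly so the function is easy to test.
pub fn cache_path(home: &Path, model_name: &str, file_name: &str) -> PathBuf {
    home.join(".cache").join(model_name).join(file_name)
}

/// Resolves the home directory from the `HOME` variable (Unix-style assumption).
pub fn home_dir() -> Option<PathBuf> {
    env::var_os("HOME").map(PathBuf::from)
}

/// Returns the checkpoint bytes, calling `download` only on a cache miss.
/// `download` is a hypothetical stand-in for the real download function.
pub fn load_or_download<F>(path: &Path, download: F) -> io::Result<Vec<u8>>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    // Cache hit: read the file that an earlier call stored.
    if path.exists() {
        return fs::read(path);
    }
    // Cache miss: fetch the bytes, then persist them for the next call.
    let bytes = download()?;
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::write(path, &bytes)?;
    Ok(bytes)
}
```

A `resnet*_pretrained`-style method would then call `load_or_download` with a path from `cache_path(&home, "resnet-burn", ...)` and hand the resulting file to the recorder. The checkpoint file name is left open here because the thread does not specify it.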
This ticket is a two-fold request:
Now that we are adding popular models to the burn-model repo, we should consider the end-user experience and come up with some basic top-level requirements for what is expected when a user adopts or uses a migrated model. This can evolve into a standard across other models.
Here is my proposal: