Hi Niels,

Thank you for the good suggestion. I totally agree that HF is better than a Google Drive link. I will try it with `PyTorchModelHubMixin` and ask you if I run into any problems.

Best,
Heo
Thank you! Btw, it would be great to push them to https://huggingface.co/naver
Hi @NielsRogge
I have an issue while trying to convert my model to safetensors. Here's the code I'm using:
```python
import huggingface_hub
huggingface_hub.login("")

import torch
from models import vit_rope

model = vit_rope.rope_axial_deit_small_patch16_LS()

# load checkpoint -> push to hub
checkpoint = torch.load("/mnt/tmp/weights/rope-vit/rope_axial_deit_small_patch16_LS.pth")
model.load_state_dict(checkpoint['model'])
model.save_pretrained("rope_axial_deit_small_patch16_LS")
# model.push_to_hub("bhheo/rope_axial_deit_small_patch16_LS")

# hub to model
model.from_pretrained("rope_axial_deit_small_patch16_LS")
```
But I keep getting the following error at the last line. Could you advise on what might be causing this and how to fix it?
File "/mnt/image-net-full/bhheo/rope-vit/test.py", line 16, in <module>
model.from_pretrained("rope_axial_deit_small_patch16_LS")
File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 570, in from_pretrained
instance = cls._from_pretrained(
File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 794, in _from_pretrained
return cls._load_as_safetensor(model, model_file, map_location, strict)
File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 843, in _load_as_safetensor
safetensors.torch.load_model(model, model_file, strict=strict, device=map_location) # type: ignore [arg-type]
File "/home/nsml/.local/lib/python3.8/site-packages/safetensors/torch.py", line 202, in load_model
state_dict = load_file(filename, device=device)
File "/home/nsml/.local/lib/python3.8/site-packages/safetensors/torch.py", line 315, in load_file
result[k] = f.get_tensor(k)
RuntimeError: Viewing a tensor as a new dtype with a different number of bytes per element is not supported.
Hi,

This is probably because the model takes various arguments in its `__init__` method which aren't JSON serializable (for example, classes of type `nn.Module`), so they cannot be serialized to a `config.json`.

One could add a custom `from_pretrained` method which implements something similar to this: https://github.com/facebookresearch/vggsfm/blob/573fdff3f1c730d6be5ffa45c92b1487c4bdb658/vggsfm/models/vggsfm.py#L37.
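For illustration, a rough, untested sketch of that pattern, loosely following the linked vggsfm code. It assumes the repo stores a hand-written `config.json` plus the raw state dict under `pytorch_model.pth`; those filenames and the use of the `rope_axial_deit_small_patch16_LS` factory are assumptions, not something from this thread:

```python
# Hypothetical sketch of a custom from_pretrained; filenames, repo layout,
# and the model factory are assumptions for illustration only.
import json

import torch
from huggingface_hub import hf_hub_download

from models import vit_rope


def from_pretrained(repo_id: str):
    # Download a hand-written config and the raw checkpoint from the Hub.
    config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
    ckpt_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.pth")
    with open(config_path) as f:
        config = json.load(f)
    # Rebuild the model from JSON-serializable config values only, then
    # load the state dict (assumed to be stored directly, not nested).
    model = vit_rope.rope_axial_deit_small_patch16_LS(**config)
    model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
    return model
```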
Alternatively, one could define a class which only takes JSON-serializable arguments and inherits from `PyTorchModelHubMixin`.
Let me know if you need any help!
Hi @NielsRogge
I have uploaded my models to the Hugging Face Hub, and my code now downloads the pre-trained models from the Hub. Thank you for your recommendation and help; my repo looks much better than before. Please let me know if you find anything wrong in my update.
That's really cool, looking good! The model cards look great.
We could also add the relevant `pipeline_tag` for each model by including it at the top of the model card, like so:

```yaml
---
pipeline_tag: image-classification
---
```
This way, people can find them from https://huggingface.co/models?pipeline_tag=image-classification.
Next to that, to get download metrics working for each model, you can push an empty `config.json` to each model repo. See https://huggingface.co/docs/hub/en/models-download-stats for details on how to enable them.
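For example, one way to push an empty `config.json` is a short script like the following (a minimal sketch; the `repo_id` is a placeholder, not one of the actual model repos):

```python
# Minimal sketch: upload an empty config.json so download stats are tracked.
import json

from huggingface_hub import HfApi

# Write an empty JSON object to a local file.
with open("config.json", "w") as f:
    json.dump({}, f)

api = HfApi()
api.upload_file(
    path_or_fileobj="config.json",
    path_in_repo="config.json",
    repo_id="your-username/your-model",  # placeholder repo id
    repo_type="model",
)
```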
@NielsRogge
Thanks for guiding me. I will add them to the models soon.

I have a question about `config.json`. As I understand it, an empty `config.json` is enough to enable download metrics. But if I want to write model details to `config.json`, is there a format or rule for this? I searched for references, but these two are different and I'm confused:
https://huggingface.co/timm/deit3_base_patch16_224.fb_in1k/blob/main/config.json
https://huggingface.co/facebook/deit-base-patch16-224/blob/main/config.json
Hi,

The first one is meant to be used with the timm library, the other one with the Transformers library. There is no general rule for custom models; the timm and Transformers libraries simply each follow their own standardized format.

Regarding your models, you could create a custom `config.json` which stores all parameters of the model. If you leverage the `PyTorchModelHubMixin` class, it will automatically serialize all the arguments of the `__init__` method into a `config.json` for you. Do note that for this class to work, all the arguments of the `__init__` method need to be JSON serializable (hence an argument can't be an `nn.Module` instance, for example).
Edit: I looked at your class and it seems like it takes various arguments which aren't JSON serializable: https://github.com/naver-ai/rope-vit/blob/6150a2ddacd1c937f501c80754939a2bb9c18ed0/deit/models_v2.py#L176-L183. Hence the Mixin cannot be leveraged here.
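For future reference, a minimal sketch of what the mixin pattern looks like when all `__init__` arguments are JSON serializable (the class and its arguments below are illustrative toys, not this repo's model):

```python
# Illustrative toy model, not rope-vit's actual class: every __init__
# argument is JSON serializable, so PyTorchModelHubMixin can write config.json.
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class TinyClassifier(nn.Module, PyTorchModelHubMixin):
    def __init__(self, in_features: int = 784, hidden_dim: int = 128, num_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        return self.net(x)


# The mixin provides save_pretrained / push_to_hub / from_pretrained:
model = TinyClassifier(hidden_dim=256)
model.save_pretrained("tiny-classifier")
reloaded = TinyClassifier.from_pretrained("tiny-classifier")
```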
Now I understand how it works. Thank you for the explanation.

I have added `pipeline_tag: image-classification` and an empty `config.json` as you recommended. I will try `PyTorchModelHubMixin` in my next projects; its implementation is really cool.

Thank you for teaching me how to use the Hugging Face platform. It has a lot of good features for ML researchers.
Thanks a lot @bhheo! https://huggingface.co/papers/2403.13298 looks great now 👍
Hi @bhheo,
Niels here from the open-source team at Hugging Face. I found your work through ECCV and indexed your paper here: https://huggingface.co/papers/2403.13298. Congrats on getting it accepted (feel free to claim authorship with your HF account)! I work together with AK on improving the visibility of researchers' work on the hub.
I was wondering whether you'd be up for collaborating on releasing the checkpoints on the 🤗 hub, rather than Google Drive, to improve their discoverability. We can add tags so that people can find them when filtering https://huggingface.co/models.
Uploading models
See here for a guide: https://huggingface.co/docs/hub/models-uploading. In case the models are custom PyTorch models, we could probably leverage the `PyTorchModelHubMixin` class, which adds `from_pretrained` and `push_to_hub` to each model. Alternatively, one can leverage the `hf_hub_download` one-liner to download a checkpoint from the hub, as sketched below.

We encourage researchers to push each model checkpoint to a separate model repository, so that things like download stats also work. Moreover, we can then link the checkpoints to the paper page, improving their visibility.
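For instance, the download one-liner looks roughly like this (the `repo_id` and `filename` are placeholders, not actual repos from this thread):

```python
# Sketch of the hf_hub_download one-liner; repo_id and filename are placeholders.
from huggingface_hub import hf_hub_download

# Downloads the file from the Hub (with local caching) and returns its path.
checkpoint_path = hf_hub_download(
    repo_id="your-username/your-model",
    filename="checkpoint.pth",
)
```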
Let me know if you're interested or need any help regarding this!

Cheers,
Niels
ML Engineer @ HF 🤗