naver-ai / rope-vit

[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
https://arxiv.org/abs/2403.13298

Host models on HF #9

Closed · NielsRogge closed this 2 weeks ago

NielsRogge commented 1 month ago

Hi @bhheo,

Niels here from the open-source team at Hugging Face. I found your work through ECCV (congrats on getting it accepted!) and indexed your paper here: https://huggingface.co/papers/2403.13298 (feel free to claim authorship with your HF account). I work together with AK on improving the visibility of researchers' work on the hub.

I was wondering if you'd be up for collaborating on releasing the checkpoints on the 🤗 hub, rather than Google Drive, to improve their discoverability. We can add tags so that people can find them when filtering https://huggingface.co/models.

Uploading models

See here for a guide: https://huggingface.co/docs/hub/models-uploading. In case the models are custom PyTorch models, we could probably leverage the PyTorchModelHubMixin class, which adds from_pretrained and push_to_hub to each model. Alternatively, one can leverage the hf_hub_download one-liner to download a checkpoint from the hub.
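
For reference, the download side can be as small as this (the repo id and filename below are placeholders, not the actual release):

import torch
from huggingface_hub import hf_hub_download

# fetch a single checkpoint file from the Hub (cached locally after the first call)
ckpt_path = hf_hub_download(repo_id="<org>/<model-repo>", filename="model.pth")
checkpoint = torch.load(ckpt_path, map_location="cpu")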

We encourage researchers to push each model checkpoint to a separate model repository, so that things like download stats also work. Moreover, we can then link the checkpoints to the paper page, improving their visibility.

Let me know if you're interested/need any help regarding this!

Cheers,

Niels
ML Engineer @ HF 🤗

bhheo commented 1 month ago

Hi Niels

Thank you for the good suggestion. I totally agree that HF is better than a Google Drive link. I will try it with PyTorchModelHubMixin and ask you if I run into any problems.

Best,
Heo

NielsRogge commented 1 month ago

Thank you! Btw, it would be great to push them to https://huggingface.co/naver

bhheo commented 1 month ago

Hi @NielsRogge

I have an issue while trying to convert my model to safetensors. Here's the code I'm using:

import huggingface_hub
huggingface_hub.login("")

import torch
from models import vit_rope

model = vit_rope.rope_axial_deit_small_patch16_LS()

# load checkpoint -> push to hub
checkpoint = torch.load("/mnt/tmp/weights/rope-vit/rope_axial_deit_small_patch16_LS.pth")
model.load_state_dict(checkpoint['model'])
model.save_pretrained("rope_axial_deit_small_patch16_LS")
# model.push_to_hub("bhheo/rope_axial_deit_small_patch16_LS")

# hub to model
model.from_pretrained("rope_axial_deit_small_patch16_LS")

But I keep getting the following error at the last line. Could you advise on what might be causing this and how to fix it?

  File "/mnt/image-net-full/bhheo/rope-vit/test.py", line 16, in <module>
    model.from_pretrained("rope_axial_deit_small_patch16_LS")
  File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 570, in from_pretrained
    instance = cls._from_pretrained(
  File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 794, in _from_pretrained
    return cls._load_as_safetensor(model, model_file, map_location, strict)
  File "/home/nsml/.local/lib/python3.8/site-packages/huggingface_hub/hub_mixin.py", line 843, in _load_as_safetensor
    safetensors.torch.load_model(model, model_file, strict=strict, device=map_location)  # type: ignore [arg-type]
  File "/home/nsml/.local/lib/python3.8/site-packages/safetensors/torch.py", line 202, in load_model
    state_dict = load_file(filename, device=device)
  File "/home/nsml/.local/lib/python3.8/site-packages/safetensors/torch.py", line 315, in load_file
    result[k] = f.get_tensor(k)
RuntimeError: Viewing a tensor as a new dtype with a different number of bytes per element is not supported.

NielsRogge commented 1 month ago

Hi,

This is probably because the model takes various arguments in its init method which aren't JSON serializable (like classes of type nn.Module), so the mixin cannot serialize them to a config.json.

One could probably add a custom from_pretrained method that does something similar to this: https://github.com/facebookresearch/vggsfm/blob/573fdff3f1c730d6be5ffa45c92b1487c4bdb658/vggsfm/models/vggsfm.py#L37.
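
Roughly something like this (the model body below is just a stand-in, not your actual architecture):

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 8)  # stand-in body

    @classmethod
    def from_pretrained(cls, repo_id: str, filename: str = "model.pth"):
        # download the raw checkpoint and rebuild the model manually,
        # bypassing the mixin's config.json round-trip entirely
        path = hf_hub_download(repo_id=repo_id, filename=filename)
        model = cls()
        model.load_state_dict(torch.load(path, map_location="cpu"))
        return model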

Alternatively, a class could be defined which only takes in arguments which are serializable, which can inherit from PyTorchModelHubMixin.
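
A minimal sketch of that second option (class name and arguments are illustrative):

import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class RoPEViTForHub(nn.Module, PyTorchModelHubMixin):
    # every init argument is a plain int, so the mixin can serialize
    # them to config.json and restore them in from_pretrained
    def __init__(self, embed_dim: int = 384, num_classes: int = 1000):
        super().__init__()
        self.head = nn.Linear(embed_dim, num_classes)  # stand-in body

    def forward(self, x):
        return self.head(x)

model = RoPEViTForHub()
model.save_pretrained("rope_vit_demo")  # writes config.json + model.safetensors
reloaded = RoPEViTForHub.from_pretrained("rope_vit_demo")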

Let me know if you need any help!

bhheo commented 2 weeks ago

Hi @NielsRogge

I have uploaded my models to the Hugging Face Hub, and my code now downloads the pre-trained weights from the Hub.

Thank you for your recommendation and help. My repo looks much better than before.

Please let me know if you find anything wrong in my update.

NielsRogge commented 2 weeks ago

That's really cool, looking good! The model cards look great.

We could also add the relevant pipeline_tag to each model by including it at the top of the model card, like so:

---
pipeline_tag: image-classification
---

This way, people can find them from https://huggingface.co/models?pipeline_tag=image-classification.

Next to that, to get download metrics working for each model, you can push an empty config.json to each model repo. See https://huggingface.co/docs/hub/en/models-download-stats for details on how to enable them.
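
Something along these lines should do it (the repo id is a placeholder):

import json
from huggingface_hub import upload_file

# create an empty JSON config locally and push it to the model repo
with open("config.json", "w") as f:
    json.dump({}, f)

upload_file(
    path_or_fileobj="config.json",
    path_in_repo="config.json",
    repo_id="<org>/<model-repo>",  # placeholder
    repo_type="model",
)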

bhheo commented 2 weeks ago

@NielsRogge

Thanks for guiding me. I will add them to the models soon.

I have a question about config.json. As I understand it, an empty config.json is enough to enable download metrics.

But if I want to write model details to config.json, is there a format or rule for this?

I searched for references, but these two are different and I'm confused:
https://huggingface.co/timm/deit3_base_patch16_224.fb_in1k/blob/main/config.json
https://huggingface.co/facebook/deit-base-patch16-224/blob/main/config.json

NielsRogge commented 2 weeks ago

Hi,

The first one is meant to be used with the timm library, the other one with the Transformers library. There are no general rules for a custom config.json, though the timm and Transformers libraries each follow their own standardized format.

Regarding your models, you could create a custom config.json which stores all parameters of the model. If you leverage the PyTorchModelHubMixin class, it will automatically serialize all the arguments of the init method into a config.json for you. Do note that for this class to work, all the arguments in your init method need to be JSON serializable (hence none of them can be an nn.Module, for instance).
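
If you write it by hand instead, any JSON-serializable dict works; for example (the keys and values here are just illustrative):

import json

config = {
    "architecture": "rope_axial_deit_small_patch16_LS",
    "img_size": 224,
    "patch_size": 16,
    "embed_dim": 384,
}
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)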

Edit: I looked at your class and it seems like it takes various arguments which aren't JSON serializable: https://github.com/naver-ai/rope-vit/blob/6150a2ddacd1c937f501c80754939a2bb9c18ed0/deit/models_v2.py#L176-L183. Hence the Mixin cannot be leveraged here.

bhheo commented 2 weeks ago

Now I understand how it works. Thank you for the explanation.

I have added pipeline_tag: image-classification and an empty config.json as you recommended. I will try PyTorchModelHubMixin in my next projects; its implementation is really cool.

Thank you for teaching me how to use the Hugging Face platform. It has a lot of good features for ML researchers.

NielsRogge commented 2 weeks ago

Thanks a lot @bhheo! https://huggingface.co/papers/2403.13298 looks great now 👍