Closed FransHk closed 6 months ago
Hi @FransHk , thanks for your attention to our work!
Were you referring to integrating TinyCLIP into Hugging Face, like this example? https://huggingface.co/openai/clip-vit-large-patch14
Hi, thanks for your reply. You are right. I am looking to export a TinyCLIP model in the Hugging Face `CLIPConfig` format so that I can load it into an existing codebase that expects this format.
For example, the codebase I am integrating TinyCLIP into loads vanilla CLIP like this:
```python
from transformers import CLIPModel, CLIPConfig

# Load the config and weights from a pretrained HF checkpoint.
# Note: from_pretrained is a classmethod, so there is no need to
# instantiate CLIPConfig first.
configuration = CLIPConfig.from_pretrained(pretrained_model)
clip_model = CLIPModel.from_pretrained(pretrained_model, config=configuration)
```
The model that integrates CLIP for AR (action recognition) tasks is built entirely around the HF-formatted CLIP, which is why I'm asking rather than re-implementing it for the OpenCLIP framework. Does this answer your question?
I see. Thanks @FransHk !
I am trying to integrate TinyCLIP into HF. Since I rarely use the `transformers` library and have run into issues with `CLIPConfig`, the integration will take some time :)
Thank you @wkcn, looking forward to the results!
Hi @FransHk, I have integrated some TinyCLIP-ViT models into HF.
https://huggingface.co/collections/wkcn/tinyclip-model-zoo-6581aa105311fe07be88cb0d
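For anyone landing here later, a minimal sketch of the `CLIPConfig`/`CLIPModel` pattern these checkpoints are meant to slot into. To keep it runnable offline, it builds a deliberately tiny CLIP config locally as a stand-in for a TinyCLIP checkpoint; the repo id in the comment is a placeholder, not a real model name:

```python
import torch
from transformers import (
    CLIPConfig,
    CLIPModel,
    CLIPTextConfig,
    CLIPVisionConfig,
)

# Stand-in for a TinyCLIP checkpoint: a deliberately small CLIP config
# built locally so the sketch runs without downloading weights.
# In real use you would instead load one of the repos from the
# collection linked above:
#   config = CLIPConfig.from_pretrained("<tinyclip-repo-id>")  # placeholder id
text_cfg = CLIPTextConfig(
    hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=2,
)
vision_cfg = CLIPVisionConfig(
    hidden_size=64, intermediate_size=128,
    num_hidden_layers=2, num_attention_heads=2,
)
config = CLIPConfig.from_text_vision_configs(text_cfg, vision_cfg)
model = CLIPModel(config)

# Same entry point an HF-based AR codebase would call:
texts = torch.randint(0, config.text_config.vocab_size, (2, 8))
with torch.no_grad():
    text_feats = model.get_text_features(input_ids=texts)
print(text_feats.shape)  # (batch, projection_dim)
```

Since the checkpoints expose standard HF CLIP weights, an AR codebase written against `CLIPModel` should not need any code changes beyond swapping the checkpoint name.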
Works like a charm, thanks!
Hi team,
I am integrating and comparing the brilliant set of TinyCLIP (ViT-based) architectures against the vanilla CLIP model in several language-enabled action recognition frameworks. Some of these AR models rely on HF model configurations; are there plans to release the TinyCLIP family on HF? Thanks!