saskiabosma closed 5 months ago
Hi!
Good question. I will have to take a look.
Not sure if this could be the issue, but:
1) From a quick look, it seems like sentence-transformers supports OpenAI-derived models. Not sure if the fact that our model backbone is LAION CLIP makes a difference.
2) I wonder if the repo should be organized in a way that is similar to the one here https://huggingface.co/sentence-transformers/clip-ViT-B-32.
Just speculating, I'll try to give it a shot over the weekend! Let me know if you solve this!
Ok take a look at this: https://colab.research.google.com/drive/1BHNgrjax-4r7DH3sZU56yROJVrKZ7y3c?usp=sharing
I downloaded both models (sentence-transformers' CLIP and FashionCLIP), then replicated the directory structure of the sentence-transformers CLIP repo for FashionCLIP. With that layout I was able to load fashion-clip into sentence-transformers.
Please double-check what I did, because I don't know if this actually makes sense (it might be a good idea to open an issue on sentence-transformers to see if someone can confirm this is the correct procedure).
If this works we can probably push the new folder structure online into another repo (ping @patrickjohncyh)
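For anyone who can't open the Colab, here is a minimal sketch of the idea of replicating the directory structure, using only the standard library. The function name and the three directory arguments are hypothetical, and it assumes the reference repo's modules.json is a JSON list whose first entry has a "path" field naming the model subfolder. It is not a copy of the notebook.

```python
import json
import shutil
from pathlib import Path


def mirror_st_layout(reference_dir, model_dir, out_dir):
    """Lay out model_dir's files the way the reference sentence-transformers repo does.

    All three arguments are local directories (hypothetical names):
    reference_dir holds a downloaded sentence-transformers CLIP repo,
    model_dir holds the downloaded FashionCLIP artifacts,
    out_dir is where the new layout is written.
    """
    reference_dir, model_dir, out_dir = Path(reference_dir), Path(model_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Reuse the reference repo's top-level sentence-transformers metadata files.
    for name in ("modules.json", "config_sentence_transformers.json"):
        src = reference_dir / name
        if src.exists():
            shutil.copy2(src, out_dir / name)

    # Assumption: the first module entry's "path" names the model subfolder.
    modules = json.loads((reference_dir / "modules.json").read_text())
    subfolder = out_dir / modules[0]["path"]
    subfolder.mkdir(parents=True, exist_ok=True)

    # Copy the model artifacts (config, weights, tokenizer files) into that subfolder.
    for item in model_dir.iterdir():
        if item.is_file():
            shutil.copy2(item, subfolder / item.name)
    return subfolder
```

The subfolder name is read from the reference modules.json rather than hardcoded, so the sketch follows whatever layout the reference repo actually uses.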
It works on my side too! You don't even need to change the repo structure: if we replace the subfolder reference with "path": "." in modules.json (and copy modules.json and config_sentence_transformers.json into the main directory where the FashionCLIP artifacts are downloaded), it loads correctly. I also checked that it works with more recent versions of PyTorch and Transformers (2.0.1 and 4.26.1 respectively).
I will check sentence-transformers for official guidance on this, thanks for finding the fix!
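A minimal sketch of that in-place variant, in case it helps someone reproduce it. The function name and directory arguments are hypothetical, and it assumes the reference modules.json is a JSON list of entries that each carry a "path" field. It only rewrites and copies the two metadata files and does not load any model.

```python
import json
import shutil
from pathlib import Path


def add_st_metadata_in_place(reference_dir, model_dir):
    """Copy the sentence-transformers metadata next to the model files.

    reference_dir: local copy of a sentence-transformers CLIP repo (hypothetical name).
    model_dir: directory where the FashionCLIP artifacts were downloaded.
    """
    reference_dir, model_dir = Path(reference_dir), Path(model_dir)

    # Point every module at the main directory instead of a subfolder.
    modules = json.loads((reference_dir / "modules.json").read_text())
    for module in modules:
        module["path"] = "."
    (model_dir / "modules.json").write_text(json.dumps(modules, indent=2))

    # Copy the second metadata file unchanged.
    shutil.copy2(
        reference_dir / "config_sentence_transformers.json",
        model_dir / "config_sentence_transformers.json",
    )
    return modules
```

After this, passing the model directory to SentenceTransformer(...) is the step that loaded correctly for me; whether this is the officially recommended procedure is still unconfirmed.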
ETA: Issue link.
Nice! Let's wait for confirmation on their side but I think this is the correct approach!
Hello!
I'm using this model as a text encoder. Since it is based on a CLIP-ViT-B-32 model, I was hoping to use the sentence_transformers library (which explicitly supports this model family) to run it in OpenSearch for scalable vector search and/or to export the model to ONNX or TorchScript.
(I need my ONNX/TorchScript model to take raw text as input, not pre-processed tokens, so the usual export doesn't fit my use case.)
When I try to load the FashionCLIP model in sentence_transformers, I get the error "'CLIPConfig' object has no attribute 'hidden_size'" at the config.json load step. I checked the config.json of the CLIP-ViT-B-32 model and, like FashionCLIP's, it has a "hidden_size" field at the same nesting level. Does anyone have an idea of what could be blocking the load? Any other suggestions for exporting a FashionCLIP text embedder to ONNX/TorchScript would also be very welcome!