Closed: NielsRogge closed this issue 2 years ago
Hi Niels,
I'm glad you like our method and thanks for integrating it into HF Transformers! I guess providing the weights under your username is fine for now.
I've created a CIDAS organization on the hub, feel free to join :)
CLIPSeg is now available here: https://huggingface.co/docs/transformers/main/en/model_doc/clipseg.
It would be highly appreciated if you added a mention of it in the README 🙏
@NielsRogge Hey, could you clarify one minor thing for me? This repo's Quickstart.ipynb
uses normalization with the ImageNet mean/std, while the HF version uses normalization to [-1, 1] according to this. Which normalization is correct for the images?
I've seen that you've ported the weights, but did you account for the fact that a different normalization was used?
Also, it's confusing that here you use the ImageNet mean/std for the tests.
Update: after a few tests it seems this really is a bug and the default processor params at HF are incorrect.
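For anyone else running into this, here is a minimal sketch of how one could force the original repo's ImageNet normalization onto the HF processor while the defaults are being fixed. The checkpoint name is only illustrative, and depending on your Transformers version the attribute may be called `feature_extractor` instead of `image_processor`:

```python
from transformers import CLIPSegProcessor

# Illustrative checkpoint name; at the time of this thread the weights were
# hosted under Niels' username on the hub.
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")

# Override the default normalization with the ImageNet mean/std used in the
# original Quickstart.ipynb (older Transformers versions expose this object
# as `processor.feature_extractor` instead of `processor.image_processor`).
processor.image_processor.image_mean = [0.485, 0.456, 0.406]
processor.image_processor.image_std = [0.229, 0.224, 0.225]
```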
Hi,
Thanks for flagging. I'll update the mean and std of the checkpoints on the hub. Visually I don't see any differences though.
The difference is minor because the models are generally very robust to things like slightly scaled inputs, but it's still better to use the correct normalization :) Thanks for the quick fix.
Hi,
Thanks for this awesome work. As I really liked the approach of adapting CLIP for zero- and one-shot image segmentation, I implemented your model as a branch of 🤗 Transformers.
The model is soon going to be added to the main library (see https://github.com/huggingface/transformers/pull/20066). Here's a Colab notebook to showcase usage: https://colab.research.google.com/drive/1ijnW67ac6bMnda4D_XkdUbZfilyVZDOh?usp=sharing.
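Roughly, usage looks like the following (a minimal sketch along the lines of the notebook, not a verbatim copy; the checkpoint name is illustrative, see the link below for where the weights are actually hosted):

```python
import torch
import requests
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

# Illustrative checkpoint name; see the model links below for the hosted weights.
ckpt = "CIDAS/clipseg-rd64-refined"
processor = CLIPSegProcessor.from_pretrained(ckpt)
model = CLIPSegForImageSegmentation.from_pretrained(ckpt)

# A sample image and a couple of text prompts to segment.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompts = ["a cat", "a remote control"]

# One copy of the image per text prompt.
inputs = processor(text=prompts, images=[image] * len(prompts),
                   padding=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Low-resolution segmentation logits, one map per prompt: (num_prompts, 352, 352).
masks = torch.sigmoid(outputs.logits)
```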
Would you like to create an organization on the hub, under which the checkpoints can be hosted?
Currently I host them under my own username: https://huggingface.co/models?other=clipseg.
Thanks!
Niels, ML Engineer @ HF