Hi. GroupViT is excellent work.
I wonder whether GroupViT has any open-vocabulary capability. For example, if we want to segment a cat, instead of inputting the label "cat", could we input words like "pet" or "furry", as with CLIP? Can GroupViT work like that?
Thanks!