wangf3014 / SCLIP

Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
110 stars 9 forks

Run inference on a single image and output any category #11

Open WuHenry-0609 opened 7 months ago

WuHenry-0609 commented 7 months ago

Hello, and thank you for your hard work! Does SCLIP support inference on a single image? Also, can the model output any category, that is, without being limited to the class names listed in a txt category file? I look forward to your reply.