pharmapsychotic / clip-interrogator-ext

Stable Diffusion WebUI extension for CLIP Interrogator
MIT License

Widely differing results between clip-interrogator-ext vs other Clip Interrogators #70

Open Aamir3d opened 1 year ago

Aamir3d commented 1 year ago

I was experimenting with this image.

  1. The result I got with clip-interrogator-ext in Automatic1111 was: the jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle, amazing exquisite matte painting, connected with hanging bridge!!, beautiful render of a fairytale, highly detailed 4 k art, fractal baroque, wooden bridge, enchanting, beautiful art uhd 4 k, asgard, romantic greenery

  2. This result is similar across a couple of models, including ViT-L-14/openai and ViT-H-14/laion2b_s32b_b79k.

  3. Using this image with the CLIP button in image2image, I get the following result: a bridge over a river with a waterfall and a bridge in the middle of it with a bridge over it, Chris LaBrooy, kinkade, a detailed matte painting, fantasy art

  4. Using this image on the Hugging Face space https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2, I get: a waterfall flowing through a lush green forest, a matte painting by Raymond Han, cg society contest winner, fantasy art, impressive fantasy landscape, 4k fantasy art, fantasy scenic

Results 3 and 4 are similar, but 1 (the plugin for Automatic1111) gives very odd results, and this also happens with different photos. Could someone shed some light on this? Is it a bug? The results should be similar, given that the CLIP models are mostly the same.

cpietsch commented 1 year ago

Have you tried the different modes, like classic, fast, etc.?

Aamir3d commented 1 year ago

> have you tried the different modes like classic, fast, etc ?

I tried this again just now, with the same results. The image is the same as above.

Clip Interrogator Ext : Model ViT-L-14/openai

Best : the jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle, amazing exquisite matte painting, connected with hanging bridge!!, beautiful render of a fairytale, highly detailed 4 k art, fractal baroque, wooden bridge, enchanting, beautiful art uhd 4 k, asgard, romantic greenery

Classic : the jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle, a detailed matte painting by Michael James Smith, cgsociety, fantasy art, fantasy matte painting,cute, very beautiful matte painting, intricate matte painting

Fast : the jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle jungle, fantasy matte painting,cute, by Michael James Smith, very beautiful matte painting, intricate matte painting, beautiful matte painting, beautiful oil matte painting, stunning matte painting, a beuatiful matte painting, exquisite matte painting, epic rivendell fantasy

HuggingFace Clip Interrogator 2 : https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2

Best : a waterfall flowing through a lush green forest, by Raymond Han, fantasy art, mobile wallpaper, soaring towers and bridges, ultra hd wallpaper, mountain fortress city

Classic : a waterfall flowing through a lush green forest, a matte painting by Raymond Han, cg society contest winner, fantasy art, impressive fantasy landscape, 4k fantasy art, fantasy scenic

Fast : a waterfall flowing through a lush green forest, impressive fantasy landscape, 4k fantasy art, fantasy scenic, most epic landscape, epic fantasy landscape, 4 k hd wallpaper very detailed, detailed fantasy digital art, 4k highly detailed digital art, fantasy overgrown world, high fantasy landscape, intricate scenery, ancient ruins and waterfalls, sci-fi fantasy wallpaper
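One pattern in the outputs above: within each tool, the first comma-separated phrase is identical across Best, Classic, and Fast, and only the trailing phrases change. Below is a minimal sketch for splitting a prompt into that leading phrase and the rest, to make the comparison easier. `split_prompt` is a hypothetical helper, not part of either tool, and the sample strings are shortened from the results above.

```python
from typing import List, Tuple


def split_prompt(prompt: str) -> Tuple[str, List[str]]:
    """Split a prompt into its leading phrase and the remaining phrases.

    Assumption: the leading phrase is everything before the first comma.
    """
    caption, _, rest = prompt.partition(",")
    phrases = [p.strip() for p in rest.split(",") if p.strip()]
    return caption.strip(), phrases


# Shortened samples from the thread (not full outputs).
ext_fast = "the jungle jungle jungle, fantasy matte painting,cute, by Michael James Smith"
hf_fast = "a waterfall flowing through a lush green forest, impressive fantasy landscape, 4k fantasy art"

for name, prompt in [("ext", ext_fast), ("hf", hf_fast)]:
    caption, phrases = split_prompt(prompt)
    print(f"{name}: leading phrase = {caption!r}, {len(phrases)} trailing phrases")
```

If the leading phrase is the only part that looks broken, that would suggest comparing how each tool produces it, rather than the CLIP model choice.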

Cheesper commented 5 months ago

Have you found a solution? I also very often get a result with repetitions in the first, main phrase.
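Until the cause is found, one possible workaround is to post-process the returned prompt and collapse runs of the same word. This is only a sketch that hides the symptom, not a fix for the extension; `collapse_repeats` is a hypothetical helper and the sample string is shortened from the output earlier in the thread.

```python
import re


def collapse_repeats(prompt: str) -> str:
    """Collapse runs of an identical word, e.g. "jungle jungle jungle" -> "jungle"."""
    # A whole word followed by one or more whitespace-separated copies of itself.
    return re.sub(r"\b(\w+)(?:\s+\1\b)+", r"\1", prompt, flags=re.IGNORECASE)


raw = "the jungle jungle jungle jungle, amazing exquisite matte painting, wooden bridge"
print(collapse_repeats(raw))
```

Note that this also collapses intentional repeats such as "very very", and the shortened phrase ("the jungle") is still a much weaker description than the one the other tools return.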

Cheesper commented 5 months ago

HuggingFace Clip Interrogator 2 : https://huggingface.co/spaces/fffiloni/CLIP-Interrogator-2

Maybe the problem is the clip-interrogator version: on Hugging Face it's 0.5.4, here it's 0.6.0.

For example, the prompt for this image: a teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage teenage, leonardo calamati, by Ross Tran, wielding longsword, by Wendell Minor, green and blue color scheme, aged turtle, without text, jose miguel roman frances, desenho lgBD0yrA2f5TY2sy
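To check the version theory, it may help to confirm which clip-interrogator version the WebUI environment actually has installed. A minimal sketch, assuming the package is installed under the distribution name `clip-interrogator` and that this runs with the same Python environment the WebUI uses; `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "clip-interrogator") -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is missing.
print(installed_version())
```

If this reports 0.6.0, installing 0.5.4 in a separate environment and interrogating the same image would show whether the repetition depends on the version.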

Aamir3d commented 5 months ago

> Have you found a solution? I also very often get an answer with repetitions in the first and main phrase

Unfortunately not. I started using other apps, and there are now other image interrogators based on LLaVA or other VLMs that do a better descriptive job than the basic LLMs.

Cheesper commented 5 months ago

> VLMs

Thanks for the answer, I'll have to look for a solution too. I liked this extension for its API, but with this bug it's useless.

Aamir3d commented 5 months ago

> > VLMs
>
> Thanks for the answer, I'll have to look for a solution too. I liked this extension for API, but with this bug it's useless.

Please check https://github.com/DEVAIEXP/image-interrogator for VLM support.