I noticed that the BLIP captioning here is not as good as the BLIP captioning in A1111. Would it be possible to add a better BLIP checkpoint, or something along those lines?
For example, when I caption a picture of a bald man, I get captions like "a bald bald bald bald bald bald bald bald bald bald bald"
or "a man with bald hair". With captions like these, the models still output a man with hair rather than a bald man.