castorini / daam

Diffusion attentive attribution maps for interpreting Stable Diffusion.
MIT License
689 stars, 63 forks

This is not an issue but a request. Can we run this on any image with generated captions? #29

Open runner22k opened 1 year ago

runner22k commented 1 year ago

This would help us write better image descriptions when training textual inversion (embeddings) and LoRA. How can we use this on an arbitrary image?

Does it work the other way around too? Can we feed an image to DAAM and get a text prompt with heat maps?

BingliangLi commented 1 year ago

I have the same question: does the model accept a non-generated image together with a given caption? I would like to use it for zero-shot object localization.

daemon commented 1 year ago

In theory, yes. I'll look into implementing this further when I have time.

BingliangLi commented 1 year ago

I managed to achieve this using this plugin: link. Use img2img, set the step count to 1 and the denoising strength to 0, and you are all set!
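
For the zero-shot localization use case raised above, one possible follow-up step is turning a per-word heat map into a bounding box. The snippet below is only a minimal sketch under stated assumptions: it assumes you already have a 2D heat map for one caption word as an array (however it was obtained), the function name `heat_map_to_box` and the `rel_threshold` parameter are hypothetical and not part of DAAM or the plugin, and the input array is synthetic rather than real DAAM output.

```python
import numpy as np


def heat_map_to_box(heat_map, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around the hot region of a 2D heat map.

    Pixels count as "hot" when their value is at least rel_threshold * max.
    Returns None when the map has no positive values.
    Hypothetical helper for illustration; not part of DAAM.
    """
    heat_map = np.asarray(heat_map, dtype=float)
    peak = heat_map.max()
    if peak <= 0:
        return None
    mask = heat_map >= rel_threshold * peak
    ys, xs = np.nonzero(mask)  # row indices are y, column indices are x
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Synthetic stand-in for a per-word heat map (assumed data, not DAAM output).
demo = np.zeros((8, 8))
demo[2:5, 3:7] = 1.0  # rows 2..4, columns 3..6 are "hot"

box = heat_map_to_box(demo)
print(box)  # (3, 2, 6, 4)
```

A relative threshold is used here so the same setting works regardless of the heat map's absolute scale; a real heat map is usually smoother than this toy array, so the threshold would likely need tuning per image.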