Hi,
Thanks for this great paper+codebase! A deserved best paper award.
I am thinking of applying the DAAM idea to image editing models (e.g. InstructPix2Pix: https://github.com/timothybrooks/instruct-pix2pix) and was wondering how hard you think it would be to get the same logic running in a non-diffusers codebase, such as the linked InstructPix2Pix repo, which is built on the older CompVis code for SD1.5. For example, which key parts of your code would need to be reimplemented, and how would I reproduce the evaluation from the paper (IoU on COCO-Gen with a 0.4 threshold)?
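To make the evaluation part of the question concrete, here is a minimal sketch of how I currently understand the metric: binarize a per-word heat map at 0.4 and compute IoU against a ground-truth mask. The function name `thresholded_iou`, the min-max normalization, and the `>=` comparison are my own assumptions for illustration, not taken from your code, so please correct me if the paper does it differently.

```python
import numpy as np


def thresholded_iou(heat_map, gt_mask, threshold=0.4):
    """IoU between a binarized heat map and a ground-truth mask.

    Assumption: the heat map is min-max normalized to [0, 1] before
    thresholding; the actual DAAM evaluation may normalize differently.
    """
    heat = np.asarray(heat_map, dtype=np.float64)
    gt = np.asarray(gt_mask, dtype=bool)

    # Min-max normalize; a constant map carries no signal, so treat it as all zeros.
    span = heat.max() - heat.min()
    norm = (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)

    # Binarize at the threshold (0.4 is the value mentioned in the question).
    pred = norm >= threshold

    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; define IoU as 0.0 to avoid dividing by zero.
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)


if __name__ == "__main__":
    heat = [[0.0, 0.2], [0.5, 1.0]]
    gt = [[False, False], [False, True]]
    print(thresholded_iou(heat, gt))
```

In this toy case two pixels survive the threshold and one of them overlaps the ground truth, so the intersection is 1 pixel and the union is 2.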
Let me know what you think! Best, Benno