Right now, the blank space is filled with a single color.
As a result, a lot of uniform color data is passed to the diffusion model, which I found can degrade the generated data in extreme cases where the dead space is significant.
There are at least two ways to fix this:
1. Hijack the diffusion model's attention so it only considers the meaningful data.
2. Fill the dead space with something other than a uniform color.
I'm not sure how easy it would be to hijack the attention mechanism of the diffusion model, nor do I know if it would work at all.
Therefore, I was thinking I could add the option of filling dead space with cv2.inpaint or a similar method. Maybe that would help?
Another cool thing I could do is fill the dead space by expanding the masks to take in more pixels from the original image.
Actually, now that I think about it, this is what I should have done at first!
Tl;dr: Add 3 modes in total to fill dead space:
1. Uniform color (current behavior).
2. Inpainting (e.g. cv2.inpaint).
3. Mask expansion, taking more pixels from the original image.