ysy31415 / unipaint

Code Implementation of "Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model"
Apache License 2.0
107 stars, 4 forks

About the detail of Masked attention control #2

Open siqi0905 opened 1 year ago

siqi0905 commented 1 year ago

In equation (9), how is I obtained? Is the dimension of I different in each attention layer? How are they obtained to generate M_attn[i, j]? Thank you very much for your response!

ysy31415 commented 1 year ago

Hi, since we have the binary mask for the input image, we know the indices of the known/unknown pixels. And yes, for layers with different resolutions (e.g., 64, 32, 16, 8), I has different sizes; the mask should be resized to match the corresponding resolution.
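As a minimal sketch of the point about indices, the snippet below flattens a binary mask that is already at a layer's resolution and lists the positions of unknown and known pixels. The helper name `token_indices`, the use of NumPy, and the convention that 1 marks an unknown (to-be-inpainted) pixel are assumptions for illustration; none of them is confirmed by the thread or taken from the repository.

```python
import numpy as np

def token_indices(mask_2d):
    """Return (unknown, known) flat indices for a binary (H, W) mask.

    Assumption: 1 marks an unknown pixel and 0 a known pixel.
    """
    # Row-major flattening, (H, W) -> (H*W,), matching the usual token order
    # of a flattened feature map.
    flat = np.asarray(mask_2d).reshape(-1)
    unknown = np.flatnonzero(flat > 0.5)
    known = np.flatnonzero(flat <= 0.5)
    return unknown, known

# Toy 4x4 mask with a 2x2 unknown region in the centre.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

unknown, known = token_indices(mask)
print(unknown.tolist())  # [5, 6, 9, 10]
print(len(known))        # 12
```

The same helper could be applied to the mask at each resolution, which would give an index set of a different size per attention layer.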

siqi0905 commented 1 year ago

If the size of the binary mask for the input image is (1, 512, 512), do you mean we just need to resize it to (1, 64, 64), (1, 32, 32), ..., (1, 8, 8)?

ysy31415 commented 1 year ago

Yes, that's right.
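The resizing confirmed here could look roughly like the following sketch, which downsamples a (1, 512, 512) mask to the four resolutions mentioned in the thread. The helper `resize_mask`, the choice of nearest-neighbour sampling, and the example mask are assumptions; the thread only says the mask is "resized" and does not specify the interpolation mode.

```python
import numpy as np

def resize_mask(mask, size):
    """Nearest-neighbour downsample of a (1, H, W) binary mask to (1, size, size)."""
    _, h, w = mask.shape
    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return mask[:, rows[:, None], cols[None, :]]

# Hypothetical input: a centred 256x256 unknown region in a 512x512 image.
mask = np.zeros((1, 512, 512), dtype=np.uint8)
mask[:, 128:384, 128:384] = 1

masks = {s: resize_mask(mask, s) for s in (64, 32, 16, 8)}
for s, m in masks.items():
    print(s, m.shape, int(m.sum()))
```

In a PyTorch pipeline, `torch.nn.functional.interpolate` with `mode="nearest"` on a 4D `(N, C, H, W)` tensor would be a likely equivalent, though whether the repository does it that way is not stated in this thread.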