essunny310 / FreestyleNet

[CVPR 2023 Highlight] Freestyle Layout-to-Image Synthesis
https://essunny310.github.io/FreestyleNet/
MIT License
149 stars, 3 forks

problem of the attention #10

Closed by liuxingbin 11 months ago

liuxingbin commented 11 months ago

Hi, FreestyleNet is great work. However, I am confused about the attention mechanism. During inference, we use attention_FLIS.py, which already handles words with more than one token. Why, then, do we need attention modules designed specifically for COCO and ADE20K?

I am looking forward to your reply.

essunny310 commented 11 months ago

Hi there, great question! You can indeed use attention_FLIS.py for COCO and ADE20K during inference, since these lines of code handle words with more than one token. We opted to use attention_ADE20K.py and attention_COCO.py because this kind of "hard" coding saves time, especially during training.
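To illustrate the trade-off described above, here is a minimal sketch. It is not the repository's actual code. It assumes that the general path has to work out, for every prompt, which token positions belong to each word, while the hard-coded path looks up token counts fixed in advance for a known label set. The names `toy_tokenize`, `spans_general`, `spans_fixed`, `FIXED_CLASSES`, and `TOKEN_COUNT` are hypothetical, and the toy tokenizer only stands in for a real subword tokenizer.

```python
def toy_tokenize(word):
    # Hypothetical stand-in for a real subword tokenizer:
    # splits a word into chunks of at most 4 characters.
    return [word[i:i + 4] for i in range(0, len(word), 4)]


def spans_general(words):
    """General path: tokenize every word on every call to find its token span."""
    spans, pos = {}, 0
    for w in words:
        n = len(toy_tokenize(w))      # tokenizer is invoked per word, per call
        spans[w] = (pos, pos + n)     # half-open range of token positions
        pos += n
    return spans


# Hard-coded path: token counts for a fixed label set, computed once up front.
# (Assumption: the dataset's class names are known ahead of time.)
FIXED_CLASSES = ["sky", "tree", "skateboard"]
TOKEN_COUNT = {c: len(toy_tokenize(c)) for c in FIXED_CLASSES}


def spans_fixed(words):
    """Dataset-specific path: a table lookup replaces per-call tokenization."""
    spans, pos = {}, 0
    for w in words:
        n = TOKEN_COUNT[w]            # raises KeyError for words outside the label set
        spans[w] = (pos, pos + n)
        pos += n
    return spans


if __name__ == "__main__":
    prompt = ["sky", "skateboard", "tree"]  # each class assumed to appear once
    print(spans_general(prompt))
    print(spans_fixed(prompt))
```

Both functions return the same spans for words in the fixed label set. The lookup version skips the repeated tokenization work, which matters most when the function is called on every training step, but it fails on any word outside the precomputed table.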

Feel free to ask if you have any remaining questions.

liuxingbin commented 11 months ago

That helped me a lot. Thanks for your reply.