CyberAgentAILab / layout-dm

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation [Inoue+, CVPR2023]
https://cyberagentailab.github.io/layout-dm
Apache License 2.0

Use Transformer encoder-decoder backbone #25

Closed · lifengheng closed this 1 year ago

lifengheng commented 1 year ago

Hi, I noticed in your paper that LayoutDM uses a Transformer encoder-only backbone for faster generation at the sacrifice of quality. I would like to know how much improvement could be achieved if LayoutDM used a Transformer encoder-decoder backbone. Did you run any similar experiments before? I tried adding torch.nn.TransformerDecoder to LayoutDM, but the loss went wrong. Is LayoutDM's decoder different from nn.TransformerDecoder?

naoto0804 commented 1 year ago

Hi, the Transformer encoder is not used for faster generation. It is because the number of fields to be generated is fixed (cf. a BERT-like model), in which case it is natural to use a Transformer encoder.
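For illustration, here is a minimal sketch (not the repository's actual code; class name, dimensions, and hyperparameters are made up) of what a BERT-like encoder-only backbone over a fixed number of fields could look like. Since every layout is serialized to a fixed-length token sequence, the model predicts logits for all positions in a single forward pass, so no autoregressive decoder (and no `memory`/`tgt` split as in `nn.TransformerDecoder`) is needed:

```python
import torch
import torch.nn as nn

class EncoderOnlyLayoutBackbone(nn.Module):
    """Illustrative BERT-like encoder-only backbone.

    The layout is serialized to a fixed number of fields (seq_len),
    so all token logits can be predicted in parallel.
    """

    def __init__(self, vocab_size: int, seq_len: int, d_model: int = 256,
                 nhead: int = 8, num_layers: int = 4):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(seq_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids
        # returns: (batch, seq_len, vocab_size) logits for every position
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(positions)
        return self.head(self.encoder(x))


# All positions are predicted in one pass, BERT-style
# (vocab_size/seq_len below are arbitrary example values).
model = EncoderOnlyLayoutBackbone(vocab_size=160, seq_len=125)
logits = model(torch.randint(0, 160, (2, 125)))  # -> (2, 125, 160)
```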

lifengheng commented 1 year ago

Sorry, I misunderstood the role of the encoder. Thanks for your reply.

naoto0804 commented 1 year ago

Happy to hear that you got it right!