kampta / DeepLayout

PyTorch implementation of "LayoutTransformer: Layout Generation and Completion with Self-attention" to appear in ICCV 2021
https://kampta.github.io/layout/
Apache License 2.0

Questions related to your paper #5

Open hbell99 opened 2 years ago

hbell99 commented 2 years ago

Hi~ I've read your paper and the code, and I'm a little confused about some details.

Firstly, the coordinates of bounding boxes are discretized, and the vocabulary for layouts contains the 8-bit uniform quantization tokens (whose token indexes are 0-127 in your code), category tokens, a padding token, and bos and eos tokens. But what if, during inference, the trained model predicts a coordinate that doesn't lie in the 0-127 token range? Do you do any post-processing to correct these mis-predicted layouts?
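For context, a minimal sketch of what the uniform quantization in question might look like (this is an illustrative assumption, not code from the repo; `NUM_BINS`, `quantize`, and `dequantize` are hypothetical names). One common remedy for out-of-range predictions is to clamp to the valid coordinate range, or to mask the logits at sampling time so only coordinate tokens can be produced at coordinate positions:

```python
# Hypothetical sketch: uniform quantization of normalized coordinates into
# token indexes 0..127, with clamping as one possible post-processing step.
import numpy as np

NUM_BINS = 128  # token indexes 0..127

def quantize(coord):
    """Map a normalized coordinate in [0, 1] to a token index in [0, 127].

    np.clip also handles slightly out-of-range inputs/predictions.
    """
    return int(np.clip(round(coord * (NUM_BINS - 1)), 0, NUM_BINS - 1))

def dequantize(token):
    """Map a token index back to a normalized coordinate in [0, 1]."""
    return token / (NUM_BINS - 1)

# Round-trip example: corners of the canvas map to the extreme tokens.
print(quantize(0.0), quantize(1.0))  # 0 127
```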

Secondly, your paper says "we minimize KL-Divergence between soft-max predictions and output one-hot distribution with Label Smoothing", but why does the code still use a cross-entropy loss?
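One possible resolution (an observation about the math, not a statement of what the authors intended): with a fixed label-smoothed target t, KL(t || p) and the cross entropy H(t, p) differ only by the target entropy H(t), which is constant in the model parameters, so minimizing either yields identical gradients. A small numerical check with hypothetical helper names:

```python
# Numerical check: H(t, p) = KL(t || p) + H(t), where H(t) does not depend
# on the logits, so cross entropy and KL give the same gradients.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def smooth_onehot(label, num_classes, eps=0.1):
    """Label-smoothed target: 1 - eps on the true class, eps spread elsewhere."""
    t = np.full(num_classes, eps / (num_classes - 1))
    t[label] = 1.0 - eps
    return t

logits = np.array([2.0, 0.5, -1.0, 0.0])
p = softmax(logits)
t = smooth_onehot(0, 4)

ce = -(t * np.log(p)).sum()     # cross entropy H(t, p)
kl = (t * np.log(t / p)).sum()  # KL(t || p)
ht = -(t * np.log(t)).sum()     # entropy H(t), constant w.r.t. the logits

print(np.isclose(ce, kl + ht))  # True
```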

duzhenjiang113 commented 2 years ago

I also have a question. The paper describes the input order of the elements as a random sequence, but from the code it looks like they are fed in a specific sorted order.
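To make the ordering question concrete, here is a sketch of one common convention for serializing layout elements deterministically, raster order (top-to-bottom, then left-to-right); this is an assumption for illustration, and the repo's actual sort key may differ:

```python
# Hypothetical sketch: sorting layout elements into raster order
# (by y, then x) before serializing them into a token sequence.
boxes = [
    {"cat": "image", "x": 10, "y": 40},
    {"cat": "text",  "x": 5,  "y": 10},
    {"cat": "text",  "x": 60, "y": 10},
]

# Deterministic ordering: top-to-bottom, ties broken left-to-right.
ordered = sorted(boxes, key=lambda b: (b["y"], b["x"]))
print([b["cat"] for b in ordered])  # ['text', 'text', 'image']
```

Whether training uses such a fixed order or a random permutation changes what the model learns about element ordering, which is presumably the point of the question.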