LayoutReader - Decoder

Model I am using: LayoutReader (is this a good place to ask questions about papers?)

In Figure 3 of the paper, I am having trouble understanding the decoder. If the encoder alone can give us the probability that token x_k belongs at index i of the reading order, can't we treat reading-order prediction as a classification problem, mapping each token to its order index (e.g. x_1 -> 1, x_2 -> 3, x_3 -> 2)?
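
To make the idea concrete, here is a minimal sketch of what I mean by "classification". It is not the paper's method. The `scores` matrix is made up for illustration and stands in for whatever per-token position probabilities an encoder might produce; `classify_positions` and `is_permutation` are hypothetical helper names.

```python
# Hypothetical scores: scores[k][i] is the score that token x_{k+1}
# belongs at reading position i+1. The numbers are invented for illustration.
scores = [
    [0.90, 0.05, 0.05],  # x_1
    [0.10, 0.20, 0.70],  # x_2
    [0.10, 0.80, 0.10],  # x_3
]


def classify_positions(scores):
    """Independently pick, for each token, the 1-based position with the highest score."""
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in scores]


def is_permutation(order):
    """Check that every position 1..n is used exactly once."""
    return sorted(order) == list(range(1, len(order) + 1))


order = classify_positions(scores)
for k, pos in enumerate(order, start=1):
    print(f"x_{k} -> {pos}")
print("valid reading order:", is_permutation(order))
```

With these scores the result is the mapping from my example (x_1 -> 1, x_2 -> 3, x_3 -> 2). One thing I notice is that, because each token is classified independently, nothing in this sketch forces the chosen indices to form a valid permutation, so two tokens could end up with the same index.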