The original paper describes the position embedding as a 5-dimensional vector containing the coordinates of the top-left and bottom-right corners plus a relative area. But in the data preprocessing script (seq2seq_loader.py), the variable _vispe, which I assume stands for the position embedding, has a dimension of 6. What causes this difference, and what is the extra value used for?
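For reference, here is a minimal sketch of the 5-dimensional position feature as the paper describes it (normalized top-left and bottom-right coordinates plus relative area). The function name and the normalization by image width/height are my assumptions, not taken from seq2seq_loader.py, so this does not account for the sixth value I am asking about:

```python
def position_feature(x1, y1, x2, y2, img_w, img_h):
    """Sketch of the paper's 5-d position vector for a region box.

    Returns [x1/W, y1/H, x2/W, y2/H, relative_area] -- the normalized
    top-left corner, bottom-right corner, and box area relative to the
    full image. (The actual _vispe in the repo has one more dimension.)
    """
    rel_area = ((x2 - x1) * (y2 - y1)) / (img_w * img_h)
    return [x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h, rel_area]

feat = position_feature(10, 20, 110, 220, 640, 480)
print(len(feat))  # 5 values, whereas _vispe has 6
```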