Transformer: Why is the transformer applied after the image and text embeddings, unlike in the paper?

wenwenyu / PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

https://arxiv.org/abs/2004.07464

MIT License

559 stars, 193 forks

Transformer: Why is the transformer applied after the image and text embeddings, unlike in the paper? #73

Open jalola opened 3 years ago

jalola commented 3 years ago

Hi,

Thanks for the great work.

I saw the comment here and wonder why the transformer is placed after the image and text embeddings. Is this better than applying it only to the text segments? If so, why not update the paper as well?