[2020] LayoutLM: Pre-training of Text and Layout for Document Image Understanding
#36 Open
osuossu8 opened 1 year ago
osuossu8 commented 1 year ago
https://arxiv.org/pdf/1912.13318.pdf
osuossu8 commented 1 year ago
Pre-training
Dataset: IIT-CDIP Test Collection 1.0
more than 6 million documents
more than 11 million scanned document images
each document's text and metadata are stored in XML files
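
Since the notes say the text and metadata live in XML files, here is a minimal sketch of splitting one such record into metadata and text. The tag names (`record`, `docid`, `title`, `pages`, `text`), the sample record, and the helper `parse_record` are all hypothetical, made up for illustration; the actual IIT-CDIP schema is not described in these notes and may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: these tag names are assumptions for
# illustration only, not the real IIT-CDIP schema.
SAMPLE_XML = """<record>
  <docid>example-0001</docid>
  <title>Example memo</title>
  <pages>2</pages>
  <text>First page text. Second page text.</text>
</record>"""


def parse_record(xml_string):
    """Split one XML record into (metadata dict, text string)."""
    root = ET.fromstring(xml_string)
    # The document body is assumed to sit in a single <text> element.
    text = (root.findtext("text") or "").strip()
    # Every other direct child is treated as a metadata field.
    metadata = {
        child.tag: (child.text or "").strip()
        for child in root
        if child.tag != "text"
    }
    return metadata, text


metadata, text = parse_record(SAMPLE_XML)
print(metadata)
print(text)
```

With a real dump, the same idea would apply per file, but the element names and nesting would need to be adapted to whatever the XML actually contains.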