yanndebray / programming-GPTs

Book in writing ... 🦜
https://yanndebray.github.io/programming-GPTs/
MIT License

Chap 7 - Investigate LayoutLM alternative to GPT-4V #12

Open yanndebray opened 1 month ago

yanndebray commented 1 month ago

LayoutLM is another language model, much smaller than the GPT models, that extends the BERT architecture to incorporate a document's layout information: the bounding boxes of its text segments, and therefore their sizes and positions on the page. The model encodes both the textual and the visual (layout) features of the document and can be used for tasks such as document classification, form understanding, and entity extraction.
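To make the "layout information" concrete, here is a minimal sketch of how word-level bounding boxes are typically prepared for a LayoutLM-style model. It assumes the boxes come from an OCR step as pixel coordinates `(x0, y0, x1, y1)` and rescales them to the 0-1000 range that LayoutLM's position embeddings expect. The helper names `normalize_box` and `build_layout_inputs`, and the sample words and coordinates, are made up for illustration and are not part of any library.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in page pixels


def normalize_box(box: Box, page_width: int, page_height: int) -> List[int]:
    """Rescale a pixel box to the 0-1000 coordinate range used by LayoutLM."""
    x0, y0, x1, y1 = box
    return [
        (1000 * x0) // page_width,
        (1000 * y0) // page_height,
        (1000 * x1) // page_width,
        (1000 * y1) // page_height,
    ]


def build_layout_inputs(
    words: Sequence[str],
    boxes: Sequence[Box],
    page_width: int,
    page_height: int,
) -> Dict[str, list]:
    """Pair each word with its normalized box (hypothetical helper)."""
    if len(words) != len(boxes):
        raise ValueError("each word needs exactly one bounding box")
    return {
        "words": list(words),
        "boxes": [normalize_box(b, page_width, page_height) for b in boxes],
    }


# Illustrative OCR output for a 1000 x 2000 pixel page (invented values).
example = build_layout_inputs(
    words=["Invoice", "Total:"],
    boxes=[(50, 100, 150, 200), (500, 1000, 1000, 2000)],
    page_width=1000,
    page_height=2000,
)
print(example["boxes"])
```

The resulting words and boxes would then still need to go through a tokenizer, with each word's box repeated for every sub-token, before being passed to the model.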