Hey @NielsRogge!
There is a nice DocumentAI model called GeoLayoutLM by Alibaba Research. According to their benchmarks, it is superior to LayoutLMv3 while being roughly the same size, and they released their code here (Apache license). IMHO the license and the good performance are really appealing.
Since you already added many DocumentAI/multi-modal models, I thought this one could be interesting as well.
And as always: Thanks for your amazing work! 😄