More recent successors to the LayoutLM model used in this sample (e.g. LayoutLMv2 and LayoutXLM) make more extensive use of visual embeddings of the page image to boost performance. To get the most out of a possible model architecture upgrade, this sample should probably aim to integrate page image pixel analysis.
Page thumbnails are an additional output alongside the annotation-oriented images, because annotation needs good DPI, whereas model embedding thumbnails are typically low-resolution (224px square in standard LayoutLMv2).
Processing should be configurable to output either or both of the image types, since annotation images may be required for a smaller subset of the corpus than page thumbnails.
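As a rough illustration of the "either or both" requirement, the sketch below plans which image types to produce for a page and at what pixel size. Everything in it is an assumption made for illustration: the `ImageOutputConfig` and `plan_page_outputs` names, the 150 DPI default, and the choice to express page size in PDF points are hypothetical and not part of this sample. Only the 224px square thumbnail comes from the note above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ImageOutputConfig:
    """Hypothetical settings; names and defaults are illustrative only."""

    annotation_images: bool = True
    thumbnails: bool = True
    annotation_dpi: int = 150  # assumed value, pick whatever annotators need
    thumbnail_size: int = 224  # square size noted above for standard LayoutLMv2


def plan_page_outputs(
    page_width_pt: float, page_height_pt: float, config: ImageOutputConfig
) -> Dict[str, Tuple[int, int]]:
    """Return the pixel (width, height) of each image type enabled for a page.

    Page dimensions are given in PDF points (1/72 inch).
    """
    outputs: Dict[str, Tuple[int, int]] = {}
    if config.annotation_images:
        # Annotation images keep the page aspect ratio at the requested DPI
        scale = config.annotation_dpi / 72.0
        outputs["annotation"] = (
            round(page_width_pt * scale),
            round(page_height_pt * scale),
        )
    if config.thumbnails:
        # Assumes thumbnails are resized to a fixed square for the model
        outputs["thumbnail"] = (config.thumbnail_size, config.thumbnail_size)
    return outputs
```

With this shape, a job over the whole corpus could set `annotation_images=False` to emit thumbnails only, while a smaller annotation subset could enable both outputs.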
Tentative items/components: