Closed by kdutia 2 years ago
Clustering VGG16 embeddings (VGG16 being a general-purpose image model) hasn't been successful: the clusters produced by both k-means and Gaussian mixture models don't separate layouts, even at a column level.
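For anyone wanting to reproduce this kind of check, here is a minimal sketch of clustering embeddings with both methods using scikit-learn. It is not the code from the notebook: the `cluster_embeddings` helper, the synthetic 8-dimensional "embeddings", and the `layout_labels` array are all assumptions made for illustration. Scoring with the adjusted Rand index also assumes you have at least a small hand-labelled set of layouts to compare against.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture


def cluster_embeddings(embeddings, n_clusters, seed=0):
    """Cluster row-wise embeddings with k-means and a Gaussian mixture model.

    Hypothetical helper for illustration; returns one label array per method.
    """
    kmeans_labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(embeddings)
    # Diagonal covariances keep the mixture tractable for high-dimensional embeddings.
    gmm_labels = GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=seed
    ).fit_predict(embeddings)
    return kmeans_labels, gmm_labels


# Synthetic stand-in for real image embeddings (assumption): two well-separated
# groups of 50 points each, playing the role of two page layouts.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 8, [10.0] * 8])
layout_labels = np.repeat([0, 1], 50)
embeddings = centers[layout_labels] + rng.normal(scale=0.5, size=(100, 8))

kmeans_labels, gmm_labels = cluster_embeddings(embeddings, n_clusters=2)

# Adjusted Rand index: 1.0 means the clusters match the known layouts exactly,
# values near 0 mean the clustering is no better than chance.
kmeans_ari = adjusted_rand_score(layout_labels, kmeans_labels)
gmm_ari = adjusted_rand_score(layout_labels, gmm_labels)
print(f"k-means ARI: {kmeans_ari:.2f}, GMM ARI: {gmm_ari:.2f}")
```

On real embeddings, a low score from both methods would be consistent with the result described above, where neither approach separates layouts.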
See notebook here.
I can't find a fine-tuned document image classification model anywhere, so my next step is to try clustering embeddings from LayoutLMv2, which is fine-tuned on documents and includes positional text embeddings, as follows:
How does the corpus break down in terms of: