microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Table Reconstruction #106

Open skwskwskwskw opened 1 year ago

skwskwskwskw commented 1 year ago

I am trying the pre-trained table structure recognition (TSR) model. The correct number of rows and columns is detected, but the split positions are less than optimal. I'm not sure what the issue could be or how to improve it.

Here's a sample of the header: [image]

Here's a sample of the splits: [image]

bsmock commented 1 year ago

Hi, sometimes the padding around the table can affect the pre-trained model we released. But in that case only the edge rows and edge columns are usually affected. That's fixed by adding more padding around the table.
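As a rough illustration of the padding idea, here is a minimal sketch that adds a uniform border around a cropped table image with Pillow before it is passed to the structure model. The helper name `pad_table_crop`, the 40-pixel default, and the white fill are assumptions chosen for illustration, not values taken from this repo; the right amount likely depends on your crops.

```python
from PIL import Image


def pad_table_crop(image: Image.Image, padding: int = 40,
                   fill=(255, 255, 255)) -> Image.Image:
    """Return a copy of `image` with `padding` pixels of `fill` on every side.

    `padding` and `fill` are illustrative defaults (assumptions), not
    values recommended by the Table Transformer authors.
    """
    if padding < 0:
        raise ValueError("padding must be non-negative")
    rgb = image.convert("RGB")
    # Create a larger canvas and paste the original crop in the middle.
    padded = Image.new(
        "RGB",
        (rgb.width + 2 * padding, rgb.height + 2 * padding),
        fill,
    )
    padded.paste(rgb, (padding, padding))
    return padded


# Stand-in for a real table crop: a 200x100 black image.
crop = Image.new("RGB", (200, 100), (0, 0, 0))
padded = pad_table_crop(crop, padding=40)
```

Any boxes predicted on the padded image are offset by `padding` in both x and y, so subtract it before mapping them back onto the original crop.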

In your case, based on what I'm seeing, you'll probably need to fine-tune the model on a small number of cases like the one here, if your cases are all visually similar to this one. The pre-trained model has seen many table layouts but hasn't seen many examples that look like this one visually.

Some options you have are:

1) Training with additional data augmentation on PubTables-1M to make the model generalize better to your cases
2) Fine-tuning the pre-trained model on FinTabNet using the scripts in this repo
3) Labeling your own small dataset and fine-tuning the model on it
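For the data augmentation option, one simple starting point is photometric augmentation, which changes how a table image looks without moving any pixels, so existing row and column box annotations stay valid. The sketch below uses Pillow only; the function name `photometric_augment` and the jitter ranges are assumptions for illustration and are not the augmentation pipeline used in this repo.

```python
import random

from PIL import Image, ImageEnhance, ImageFilter


def photometric_augment(image: Image.Image, rng: random.Random) -> Image.Image:
    """Randomly jitter brightness and contrast, and sometimes blur slightly.

    Geometry is untouched, so bounding-box labels remain correct.
    The ranges below are illustrative assumptions, not tuned values.
    """
    out = image.convert("RGB")
    out = ImageEnhance.Brightness(out).enhance(rng.uniform(0.8, 1.2))
    out = ImageEnhance.Contrast(out).enhance(rng.uniform(0.8, 1.2))
    if rng.random() < 0.5:
        # Mild blur to mimic lower-quality scans.
        out = out.filter(ImageFilter.GaussianBlur(radius=rng.uniform(0.3, 1.0)))
    return out


# Stand-in for a real training image: a 320x240 light-gray image.
sample = Image.new("RGB", (320, 240), (200, 200, 200))
augmented = photometric_augment(sample, random.Random(0))
```

Geometric augmentations (scaling, cropping, rotation) would also require transforming the box annotations, which this sketch deliberately avoids.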

Best, Brandon

skwskwskwskw commented 1 year ago

Hi,

Is there a recommended amount of padding or resizing to apply before running TSR with the model?

Thanks