microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

unstable table structure recognition #115

Open SteveVu2212 opened 1 year ago

SteveVu2212 commented 1 year ago

Hi team, I am using Table Transformer to recognize the structure of my tables, specifically the table headers. However, the outputs are unstable and the model sometimes fails. As you can see below, the header prediction is incorrect. Do you have any suggestions for improving the model's performance on this specific task?

[Image: table0-header]

bsmock commented 1 year ago

The current pre-trained weights are for TATR trained on PubTables-1M. The PubTables-1M dataset covers a wide variety of table structures, but it does not cover every possible variation in appearance, such as color. When we trained TATR on PubTables-1M, we did not attempt to optimize it for performance on tables outside of the PubTables-1M dataset.

Hopefully soon we can release our model trained jointly on both PubTables-1M and FinTabNet.c, from our most recent paper.

But in the meantime, to handle tables like the one above, you will likely need to fine-tune the PubTables-1M model a little, either on additional data (like FinTabNet.c, which you can create using the script in this repo) or on additional augmentations of the PubTables-1M dataset.

Hope that helps!

Best, Brandon
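
To make the augmentation suggestion above more concrete, here is a minimal, dependency-free sketch of a color augmentation that could be applied to training images before fine-tuning. It is not code from this repository: the names `jitter_pixel` and `augment_colors`, the nested-list image representation, and all default parameter values are assumptions chosen for illustration. In a real pipeline an image library transform (for example `torchvision.transforms.ColorJitter`) would likely be a better fit.

```python
import colorsys
import random

# Hypothetical image representation for this sketch:
# a list of rows, each row a list of (R, G, B) tuples with values 0..255.
Pixel = tuple[int, int, int]


def jitter_pixel(pixel: Pixel, hue_shift: float,
                 brightness_scale: float, invert: bool) -> Pixel:
    """Apply optional inversion, a hue shift, and a brightness scale to one pixel."""
    r, g, b = (c / 255.0 for c in pixel)
    if invert:
        # Simulates light-on-dark tables (e.g. colored header bands).
        r, g, b = 1.0 - r, 1.0 - g, 1.0 - b
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0                      # hue wraps around
    v = min(1.0, max(0.0, v * brightness_scale))   # clamp brightness to [0, 1]
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return (round(r * 255), round(g * 255), round(b * 255))


def augment_colors(image: list[list[Pixel]], rng: random.Random,
                   max_hue_shift: float = 0.5,
                   brightness_range: tuple[float, float] = (0.7, 1.3),
                   invert_prob: float = 0.2) -> list[list[Pixel]]:
    """Return a recolored copy of `image`.

    One set of random parameters is drawn per image, so the whole table
    is recolored consistently. Default values are illustrative assumptions.
    """
    hue_shift = rng.uniform(-max_hue_shift, max_hue_shift)
    brightness_scale = rng.uniform(*brightness_range)
    invert = rng.random() < invert_prob
    return [
        [jitter_pixel(p, hue_shift, brightness_scale, invert) for p in row]
        for row in image
    ]


if __name__ == "__main__":
    tiny = [[(255, 255, 255), (0, 0, 0)], [(128, 128, 128), (200, 30, 30)]]
    print(augment_colors(tiny, random.Random(0)))
```

Because this transform changes only pixel colors and leaves the image geometry untouched, existing bounding-box annotations (including header boxes) remain valid for the augmented copies.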

Saeed11b95 commented 1 year ago

@bsmock Is there any estimate of the timeline for releasing the pre-trained weights trained on FinTabNet? Thanks and regards.

karthikgali commented 1 year ago

Hi @bsmock, please let us know if you have a timeline for releasing the pre-trained weights trained on FinTabNet.

Thanks,