microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Does process_fintabnet.py not output labels for fine-tuning table detection? #123

Open lionely opened 1 year ago

lionely commented 1 year ago

With this script, can we only fine-tune for table structure recognition, or does fine-tuning for table structure recognition also improve table detection for financial tables?

Once again, thank you for your work!

bsmock commented 11 months ago

We probably should have included extra code to create table detection data, but I believe this was left out because we only do table structure recognition in our latest paper. It shouldn't be hard to add extra code for that if you're willing to look at the other scripts to see where in the code they create data for table detection.
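For illustration only, here is a minimal sketch of what generating a table detection label for one page could look like. It is not code from this repository. It assumes you already have each table's bounding box in page-image pixel coordinates and that the detection labels are written as PASCAL VOC style XML. The function name `detection_annotation_xml`, its parameters, and the `"table"` class label are hypothetical, so check them against what the existing scripts actually produce.

```python
import xml.etree.ElementTree as ET


def detection_annotation_xml(image_filename, page_width, page_height,
                             table_bboxes, label="table"):
    """Build a PASCAL VOC style annotation string for table detection.

    Hypothetical helper: table_bboxes is a list of (xmin, ymin, xmax, ymax)
    tuples in page-image pixel coordinates. The class label is an assumption.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(page_width)
    ET.SubElement(size, "height").text = str(page_height)
    ET.SubElement(size, "depth").text = "3"

    for xmin, ymin, xmax, ymax in table_bboxes:
        # Clamp each box to the page so labels never fall outside the image.
        xmin = max(0, min(xmin, page_width))
        xmax = max(0, min(xmax, page_width))
        ymin = max(0, min(ymin, page_height))
        ymax = max(0, min(ymax, page_height))
        if xmax <= xmin or ymax <= ymin:
            continue  # skip boxes that are empty after clamping

        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = f"{value:.2f}"

    return ET.tostring(root, encoding="unicode")
```

Calling this once per page image with that page's table boxes and writing the returned string to an `.xml` file next to the image would give one detection label per page.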

Best, Brandon