microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.

BBox squeezed during inference #31

Open sanjaychamlagain123 opened 2 years ago

sanjaychamlagain123 commented 2 years ago

I tried inference on the images provided in your repo as well as on other images. It works perfectly for the samples you provide, but the bounding boxes appear squeezed on my samples.

(screenshot: Screen Shot 2022-03-29 at 17 14 01)
yang-chenyu104 commented 2 years ago

Can you please tell me whether there is val data in the PubTables1M-Structure-PASCAL-VOC/ folder? I don't have val or test data in my PubTables1M-Structure-PASCAL-VOC/ folder.

fullpro commented 2 years ago

> I tried inference on the images provided in your repo as well as on other images. It works perfectly for the samples you provide, but the bounding boxes appear squeezed on my samples. (screenshot: Screen Shot 2022-03-29 at 17 14 01)

Can you please provide the code for inference?

finnthedawg commented 2 years ago

You can try image = cv2.copyMakeBorder(image, 40, 40, 40, 40, cv2.BORDER_CONSTANT, None, value=(255, 255, 255)) to add a margin before passing the image to the model. The corresponding word bounding boxes also need to be shifted by the same offset for the post-processing step. It works perfectly on new tables when I add a ~40px margin.

I believe this is because the tables in the training dataset have roughly a 40px margin around them, so the model expects similar padding at inference time.
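A minimal sketch of this workaround, assuming an OpenCV BGR image and word boxes given as [xmin, ymin, xmax, ymax] pixel coordinates (the helper name pad_image_and_boxes is hypothetical, not part of the repo):

```python
import cv2

def pad_image_and_boxes(image, word_boxes, pad=40, color=(255, 255, 255)):
    """Add a white margin around a tightly cropped table and shift word boxes to match.

    image: numpy array as loaded by cv2.imread
    word_boxes: list of [xmin, ymin, xmax, ymax] boxes in the original image coordinates
    """
    # Pad the image on all four sides with a constant white border.
    padded = cv2.copyMakeBorder(
        image, pad, pad, pad, pad, cv2.BORDER_CONSTANT, value=color
    )
    # Shift every word box by the same offset so post-processing stays aligned.
    shifted = [
        [xmin + pad, ymin + pad, xmax + pad, ymax + pad]
        for xmin, ymin, xmax, ymax in word_boxes
    ]
    return padded, shifted
```

The padded image is what gets passed to the model; the shifted boxes replace the original word boxes in the post-processing step.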

@bsmock is this a good hotfix for tightly cropped tables when using the pre-trained weights you provided?

SamSamhuns commented 2 years ago

@finnthedawg, thanks a lot for this; it solved the squeezing issue for me.

zanvari commented 1 year ago

Hi @sanjaychamlagain123, can you please share your inference code? Thanks in advance!