microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Simple inference code #44

Closed: mzhadigerov closed this issue 1 year ago

mzhadigerov commented 2 years ago

Hi! I have an image containing a table and I want to try the pretrained model for table structure recognition. I can't download the whole PubTables-1M dataset because it is too big. How can I run simple inference on a single image?

hannody commented 2 years ago

Same here! But I found this: https://github.com/microsoft/table-transformer/issues/22

peetio commented 2 years ago

Look at my fork. It contains an example of how to extract the table content and turn it into a pandas DataFrame.

mzhadigerov commented 2 years ago

Thanks @thibaultvt. Suppose I downloaded the pre-trained model for table structure recognition and put it in the root of your repository. What script should I run to launch the pipeline? Something like python main.py with some arguments? I suppose it is just python main.py, isn't it?

mzhadigerov commented 2 years ago

Also @thibaultvt, I don't want to perform OCR and convert everything into a pandas DataFrame. All I want is the bounding boxes of the table structure. Could you guide me through your code so I can understand which parts I should modify?

peetio commented 2 years ago

@mzhadigerov, to run the pipeline you can just use python main.py, since I didn't add any command-line arguments.

If you just want the bounding boxes, you can use the output of this line: objs = predictions_to_objects(results, threshold, get_class_map(key="index")).

To see the exact output format, I suggest you print objs: print("Objects:", objs). This format made post-processing easier.

Alternatively, you can use the results variable directly: results = self.postprocessors["bbox"](outputs, image_size)[0].
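As a rough illustration of the second option, here is a minimal sketch of turning results into a list of labelled boxes. It assumes results is a dict with "scores", "labels" and "boxes" entries, which is what the DETR-style bbox postprocessor returns as far as I know, and that boxes are [xmin, ymin, xmax, ymax] in pixels. The function name results_to_boxes, the CLASS_MAP values and the fake_results data are made up for this example. This is not the fork's predictions_to_objects.

```python
from typing import Dict, List

# Hypothetical index-to-name map for illustration only; in the fork the real
# map comes from get_class_map(key="index").
CLASS_MAP: Dict[int, str] = {0: "table", 1: "table column", 2: "table row"}


def results_to_boxes(results, threshold: float = 0.5,
                     class_map: Dict[int, str] = CLASS_MAP) -> List[dict]:
    """Keep detections scoring at least `threshold` and return plain dicts.

    Works with torch tensors as well as plain Python lists, because every
    value is converted with float() / int() before use.
    """
    objs = []
    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
        score, label = float(score), int(label)
        # Skip low-confidence detections and classes missing from the map
        # (for example a "no object" class).
        if score < threshold or label not in class_map:
            continue
        objs.append({
            "label": class_map[label],
            "score": score,
            "bbox": [float(v) for v in box],  # assumed [xmin, ymin, xmax, ymax]
        })
    return objs


# Fake postprocessor output (assumed format), so the sketch runs without a model.
fake_results = {
    "scores": [0.98, 0.30, 0.91],
    "labels": [0, 2, 1],
    "boxes": [[10, 20, 400, 300], [10, 40, 400, 60], [10, 20, 120, 300]],
}

for obj in results_to_boxes(fake_results, threshold=0.5):
    print("Object:", obj)
```

With a real model you would pass the results variable from the line above instead of fake_results and tune the threshold to taste.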

Note that my fork is not well structured at the moment, and I plan on refactoring it. I just needed it to experiment before implementing it in another project. If you have any other questions, I suggest you open an issue in the fork, or wait 1 to 2 days and I will update the repository so that the code is more intuitive.