microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Can anyone provide a demo that runs the exported ONNX model? #120

Open jprorikon opened 1 year ago

jprorikon commented 1 year ago

Hi, I exported the official table-transformer-structure-recognition checkpoint to ONNX format using optimum-cli with the following command:

optimum-cli export onnx --model table-transformer-structure-recognition/ to_onnx/ --task object-detection

That gives me a single model.onnx file without any error output.

I noticed that the shape of the "pixel_mask" input is [1, 64, 64] in the ONNX model, whereas in the original model it has the same height and width as the "pixel_values" input.
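To check which input shapes the exported graph actually declares, the session can be asked directly. The following is a minimal sketch, assuming the exported file is named model.onnx as above; `describe_inputs` and `load_session` are hypothetical helper names, and only the standard onnxruntime `get_inputs()` accessor is used.

```python
def describe_inputs(session):
    """Return {input_name: (shape, type)} for an InferenceSession-like object.

    Dynamic dimensions show up as strings (or None) inside the shape list.
    """
    return {i.name: (list(i.shape), i.type) for i in session.get_inputs()}


def load_session(path="model.onnx"):
    # Imported lazily so describe_inputs can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])


def print_inputs(path="model.onnx"):
    # Hypothetical convenience wrapper: print every declared input.
    for name, (shape, dtype) in describe_inputs(load_session(path)).items():
        print(f"{name}: shape={shape}, type={dtype}")
```

Calling `print_inputs("model.onnx")` shows whether "pixel_mask" was exported with a fixed shape such as [1, 64, 64] or with dynamic axes.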

I am not sure whether something went wrong during the export, so I pass an all-ones matrix as the "pixel_mask" input and run the ONNX model.

The model runs without any errors, but gives me a different result from the original model.
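To put a number on "different", the two sets of outputs can be compared element-wise. This is a small sketch in plain Python, not part of the original report: `max_abs_diff` and `outputs_match` are hypothetical helper names, and the tolerance of 1e-4 is an assumed value rather than an established threshold for this model.

```python
def flatten(values):
    """Yield every number in an arbitrarily nested list/tuple as a float."""
    if isinstance(values, (list, tuple)):
        for item in values:
            yield from flatten(item)
    else:
        yield float(values)


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two nested lists."""
    flat_a, flat_b = list(flatten(a)), list(flatten(b))
    if len(flat_a) != len(flat_b):
        raise ValueError(f"size mismatch: {len(flat_a)} vs {len(flat_b)}")
    return max((abs(x - y) for x, y in zip(flat_a, flat_b)), default=0.0)


def outputs_match(a, b, atol=1e-4):
    # atol is an assumed tolerance; small float drift after export is expected.
    return max_abs_diff(a, b) <= atol
```

NumPy arrays and PyTorch tensors can be passed after converting them with `.tolist()`. A large difference on identical inputs would point at the export, while a small one suggests the preprocessing is the place to look.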

Code I used to test the ONNX model:

import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
img = cv2.imread("./table.png")  # OpenCV loads images in BGR channel order
org_img = img  # keep a reference to the original image

# resize so that the height is 800, preserving the aspect ratio
h, w = img.shape[:2]
width = round(w * (800 / h))
img = cv2.resize(img, (width, 800))

# normalize with the ImageNet mean/std, which are listed in RGB order,
# so convert from BGR to RGB first
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
image = (img.astype(np.float32) / 255 - mean) / std

# HWC -> CHW, then add the batch dimension: [1, 3, H, W]
image = np.transpose(image, [2, 0, 1])[np.newaxis, :, :, :]

# all-ones mask in the shape the exported model declares
mask = np.ones([1, 64, 64], dtype=np.int64)

results = session.run(["logits", "pred_boxes"], {"pixel_values": image, "pixel_mask": mask})

print("logits: ", results[0][0][0])
print("pred_boxes: ", tuple(results[1].shape))
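The raw "logits" and "pred_boxes" outputs are hard to compare by eye, so it may help to decode them into labelled boxes first. The sketch below assumes the usual DETR-style output convention (the last logit is the "no object" class, and boxes are normalized center-x, center-y, width, height). `decode_detections` is a hypothetical helper, and the 0.5 score threshold is an assumed default.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one query's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


def decode_detections(logits, boxes, img_w, img_h, threshold=0.5):
    """Turn per-query logits and boxes into (label, score, xyxy) tuples.

    logits: num_queries rows of num_classes + 1 values (last = "no object").
    boxes:  num_queries rows of 4 normalized values.
    """
    detections = []
    for query_logits, box in zip(logits, boxes):
        probs = softmax(list(query_logits))
        # Pick the best real class, ignoring the trailing "no object" entry.
        label = max(range(len(probs) - 1), key=probs.__getitem__)
        score = probs[label]
        if score >= threshold:
            detections.append((label, score, cxcywh_to_xyxy(box, img_w, img_h)))
    return detections
```

With the variables from the script above, this would be called as `decode_detections(results[0][0], results[1][0], org_img.shape[1], org_img.shape[0])`. The integer labels map to class names through the id2label entry of the checkpoint's config.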
nissansz commented 8 months ago

Any update?

nissansz commented 8 months ago

Can you share working ONNX models for table detection and recognition, along with sample code to run them?