microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Pre-trained model gives poor results #173

Open PhamMinhTien05102001 opened 3 months ago

PhamMinhTien05102001 commented 3 months ago

I have an image that contains a basic table: 3

I have this simple code:

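from PIL import Image

image = Image.open('./imgs_test/3.jpg').convert("RGB")
width, height = image.size
# Note: resize() returns a new image; the result is not assigned here,
# so `image` keeps its original size.
image.resize((int(width * 0.5), int(height * 0.5)))

from transformers import DetrFeatureExtractor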

feature_extractor = DetrFeatureExtractor()
encoding = feature_extractor(image, return_tensors="pt")
encoding.keys()

from transformers import TableTransformerForObjectDetection

model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")

import torch

with torch.no_grad():
    outputs = model(**encoding)

import matplotlib.pyplot as plt

# colors for visualization

COLORS = [[0.000, 0.447, 0.741], [0.850, 0.325, 0.098], [0.929, 0.694, 0.125],
          [0.494, 0.184, 0.556], [0.466, 0.674, 0.188], [0.301, 0.745, 0.933]]

def plot_results(pil_img, scores, labels, boxes):
    plt.figure(figsize=(16, 10))
    plt.imshow(pil_img)
    ax = plt.gca()
    colors = COLORS * 100
    for score, label, (xmin, ymin, xmax, ymax), c in zip(scores.tolist(), labels.tolist(), boxes.tolist(), colors):
        ax.add_patch(plt.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                   fill=False, color=c, linewidth=3))
        text = f'{model.config.id2label[label]}: {score:0.2f}'
        ax.text(xmin, ymin, text, fontsize=15,
                bbox=dict(facecolor='yellow', alpha=0.5))

# rescale bounding boxes

width, height = image.size
results = feature_extractor.post_process_object_detection(outputs, threshold=0, target_sizes=[(height, width)])[0]
plot_results(image, results['scores'], results['labels'], results['boxes'])
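For context on the rescaling step: DETR-style models such as Table Transformer predict boxes as normalized (center x, center y, width, height) values, and post-processing converts them to absolute corner coordinates using the target size. Below is a minimal pure-Python sketch of that conversion. The helper name `cxcywh_to_xyxy` is hypothetical and this is not the library's actual implementation, only an illustration of the arithmetic.

```python
def cxcywh_to_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to absolute (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    xmin = (cx - w / 2) * width
    ymin = (cy - h / 2) * height
    xmax = (cx + w / 2) * width
    ymax = (cy + h / 2) * height
    return (xmin, ymin, xmax, ymax)

# Illustrative values only: a box centered in an 800x400 image.
print(cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.25), width=800, height=400))
# → (200.0, 150.0, 600.0, 250.0)
```

If the size passed as `target_sizes` does not match the image being plotted, the boxes will appear shifted or scaled in the figure.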

This is the result: a367c55d-99d1-4afe-ba3f-4a6f5397c79b

I set threshold=0, but none of the detected tables is correct. Please help. Is the pre-trained model the reason for the poor results?
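One thing that may be worth checking: with `threshold=0`, the post-processing step keeps every predicted box, including very low-confidence ones, so the plot can look wrong even if the top prediction is reasonable. The sketch below shows the effect of score filtering on made-up values. The helper `filter_detections` and the cutoff of 0.9 are assumptions for illustration, not part of the library or a recommended setting.

```python
def filter_detections(scores, labels, boxes, threshold=0.9):
    """Keep only detections whose score is strictly above `threshold`."""
    return [
        (score, label, box)
        for score, label, box in zip(scores, labels, boxes)
        if score > threshold
    ]

# Made-up example values, for illustration only (not real model output).
scores = [0.98, 0.12, 0.03]
labels = [0, 0, 1]
boxes = [(50, 40, 600, 300), (10, 10, 30, 30), (0, 0, 5, 5)]

print(len(filter_detections(scores, labels, boxes, threshold=0.0)))  # all three kept
print(len(filter_detections(scores, labels, boxes, threshold=0.9)))  # only the first kept
```

In the code above, the equivalent change would be passing a higher `threshold` value to `post_process_object_detection` and comparing the plots.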