microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Table transformer crop issue #174

Open tzktz opened 3 months ago

tzktz commented 3 months ago

Table detection crops the table exactly at its edges, but I need some gap on the left and right sides, because when we pass the cropped table image to OCR, the values at the edges are misread.

How do I adjust the resize value? I have set the max resize to 800:

from torchvision import transforms

class MaxResize(object):
    def __init__(self, max_size=800):
        self.max_size = max_size

    def __call__(self, image):
        width, height = image.size
        current_max_size = max(width, height)
        scale = self.max_size / current_max_size
        resized_image = image.resize((int(round(scale*width)), int(round(scale*height))))

        return resized_image

detection_transform = transforms.Compose([
    MaxResize(800),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

@bsmock @msftgits @themanojkumar @NielsRogge

abielr commented 2 months ago

@tzktz, MaxResize only controls the size of the image fed to the detection model, so changing it won't add margin to the crop. What you need to do is take the detected table and expand its bounding box when you crop. See the objects_to_crops function as an example; it has a padding argument that you can use to crop out the table with additional space around it.
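As a rough illustration of the idea (not the repository's objects_to_crops implementation), the sketch below expands a detected box by a margin and clamps it to the image bounds before cropping. The helper name crop_with_padding and the pad_x / pad_y parameters are hypothetical, and it assumes the box is already in pixel coordinates of the original image as (xmin, ymin, xmax, ymax).

```python
import math

from PIL import Image


def crop_with_padding(image, bbox, pad_x=20, pad_y=0):
    """Crop bbox from image after expanding it by pad_x / pad_y pixels.

    Hypothetical helper: bbox is assumed to be (xmin, ymin, xmax, ymax)
    in pixel coordinates of the ORIGINAL image, not the resized one.
    """
    width, height = image.size
    xmin, ymin, xmax, ymax = bbox

    # Expand the box, then clamp so it never leaves the page.
    left = max(0, math.floor(xmin - pad_x))
    top = max(0, math.floor(ymin - pad_y))
    right = min(width, math.ceil(xmax + pad_x))
    bottom = min(height, math.ceil(ymax + pad_y))

    return image.crop((left, top, right, bottom))


# Stand-in for a document page; a real page would come from Image.open(...).
page = Image.new("RGB", (1000, 800), "white")

# Example detected table box with extra room on the left and right for OCR.
table_crop = crop_with_padding(page, (100.0, 200.0, 900.0, 600.0), pad_x=30, pad_y=10)
print(table_crop.size)  # (860, 420)
```

Using separate horizontal and vertical margins lets you add space only on the left and right, as asked in the question. Clamping matters for tables near the page edge, where a fixed margin would otherwise push the crop box outside the image.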