Can Table Transformer be used to detect multiple tables in an image?

For this image, when I run Table Transformer on the whole page, I get only one prediction. If I crop the image and run the model on the individual crops, the results are as expected. Is it possible to pass the whole image and get multiple predictions?
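For reference, the cropping workaround needs one extra step: boxes predicted on a crop are relative to that crop, so they have to be shifted back into full-image coordinates. A minimal sketch of that step follows; the helper name `to_full_image_coords` and the example numbers are hypothetical and not part of the library.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def to_full_image_coords(
    boxes: Sequence[Box], crop_origin: Tuple[float, float]
) -> List[Box]:
    """Shift boxes predicted on a crop back into full-image coordinates.

    crop_origin is the (x, y) position of the crop's top-left corner
    inside the full image.
    """
    off_x, off_y = crop_origin
    return [
        (x0 + off_x, y0 + off_y, x1 + off_x, y1 + off_y)
        for x0, y0, x1, y1 in boxes
    ]


# Hypothetical example: the second table was cropped starting at y=400.
crop_boxes = [(10.0, 20.0, 300.0, 180.0)]
full_boxes = to_full_image_coords(crop_boxes, crop_origin=(0, 400))
print(full_boxes)
```

This only covers the coordinate bookkeeping. Choosing the crop regions is still a manual step here.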

Code used:

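import torch
from transformers import DetrFeatureExtractor, TableTransformerForObjectDetection

# `image` is the full page as a NumPy-style array of shape (H, W, C), loaded elsewhere
model = TableTransformerForObjectDetection.from_pretrained("microsoft/table-transformer-detection")
feature_extractor = DetrFeatureExtractor()
encoding = feature_extractor(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**encoding)

height, width = image.shape[:2]  # image is in HWC layout
# Keep detections scoring above 0.4 and rescale boxes to the original image size
results = feature_extractor.post_process_object_detection(outputs, threshold=0.4, target_sizes=[(height, width)])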
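To check how many tables were actually returned, it helps to print every entry in the post-processed output rather than only the first. The sketch below assumes each element of `results` is a dict with `scores`, `labels` and `boxes`, which is what `post_process_object_detection` returns. The helper `summarize_detections` and the sample values are hypothetical; with the real model, pass `results[0]` and `model.config.id2label`.

```python
from typing import Any, Dict, List


def _as_list(values: Any) -> list:
    """Accept either torch tensors (via .tolist()) or plain Python sequences."""
    return values.tolist() if hasattr(values, "tolist") else list(values)


def summarize_detections(result: Dict[str, Any], id2label: Dict[int, str]) -> List[str]:
    """Return one human-readable line per detection in a post-processed result."""
    scores = _as_list(result["scores"])
    labels = _as_list(result["labels"])
    boxes = _as_list(result["boxes"])
    lines = []
    for score, label, box in zip(scores, labels, boxes):
        name = id2label.get(int(label), str(label))
        coords = [round(float(v), 1) for v in box]  # (x_min, y_min, x_max, y_max)
        lines.append(f"{name}: score={score:.2f}, box={coords}")
    return lines


# Hypothetical stand-in for results[0]; the values are made up for illustration.
fake_result = {
    "scores": [0.97, 0.55],
    "labels": [0, 0],
    "boxes": [[12.0, 30.5, 410.2, 220.0], [15.0, 260.0, 405.0, 480.0]],
}
for line in summarize_detections(fake_result, {0: "table"}):
    print(line)
```

Since `post_process_object_detection` drops everything below `threshold`, calling it again with a lower value and re-running this summary shows whether the other tables were detected with low scores or not detected at all.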

Also, the results from uploading the image to the hosted widget at https://huggingface.co/microsoft/table-transformer-detection differ from the results I get when running the model locally. Does anyone know why?

