VikParuchuri / surya

OCR, layout analysis, reading order, table recognition in 90+ languages
https://www.datalab.to
GNU General Public License v3.0

Bounding boxes in table recognitions are shifted #238

Open meugeny opened 2 weeks ago

meugeny commented 2 weeks ago

Hello, can you help me with table recognition?

I detect tables with the command surya_table <path_to_jpg>. When I draw the resulting bounding boxes on the image, they are shifted from the correct positions.
I guess I need to add the coordinates of the table's starting point, but I can't find the table coordinates in the JSON. I also tried the tabled repo from Python, and the result is the same. How can I fix this?
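To make the guess above concrete, here is a minimal sketch of what "adding the table's starting point" would look like. It assumes the cell, row, and column boxes are relative to a table crop and that the table's top-left corner in the full image is known from some other step. The helper names `shift_bbox` and `shift_table` and the `table_origin` value are hypothetical, not part of surya or tabled.

```python
from typing import Dict, List, Sequence

BBox = List[float]  # [x1, y1, x2, y2]


def shift_bbox(bbox: Sequence[float], dx: float, dy: float) -> BBox:
    """Translate one [x1, y1, x2, y2] box by (dx, dy)."""
    x1, y1, x2, y2 = bbox
    return [x1 + dx, y1 + dy, x2 + dx, y2 + dy]


def shift_table(table: Dict, table_origin: Sequence[float]) -> Dict:
    """Return a copy of one table entry with cells, rows, and cols shifted.

    table_origin is the assumed (x, y) of the table's top-left corner in the
    full image. It is NOT read from the JSON; it has to be supplied.
    """
    dx, dy = table_origin
    shifted = dict(table)
    for key in ("cells", "rows", "cols"):
        shifted[key] = [
            {**item, "bbox": shift_bbox(item["bbox"], dx, dy)}
            for item in table.get(key, [])
        ]
    return shifted


# One cell and one row taken from the output in this post.
table = {
    "cells": [{"bbox": [310.0, 42.0, 421.0, 68.0], "text": None}],
    "rows": [{"bbox": [0.0, 136.53515625, 1061.87109375, 160.83984375], "row_id": 2}],
    "cols": [],
}

# Hypothetical origin, for illustration only.
moved = shift_table(table, table_origin=(20.0, 100.0))
print(moved["cells"][0]["bbox"])
```

Whether this offset is actually the cause of the shift is not confirmed here; the sketch only shows the arithmetic once an origin is available.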

[Image: Online_with_bboxes]

I got result json: {"Online Retail Customers Product & Price List April 2020": [{"cells": [{"bbox": [310.0, 42.0, 421.0, 68.0], "text": null}, {"bbox": [752.0, 42.0, 906.0, 69.0], "text": null}, {"bbox": [969.0, 43.0, 1102.0, 68.0], "text": null}, {"bbox": [220.0, 100.0, 503.0, 125.0], "text": null}, {"bbox": [0.0, 137.0, 53.0, 160.0], "text": null}, {"bbox": [828.0, 137.0, 861.0, 161.0], "text": null}, {"bbox": [1007.0, 137.0, 1063.0, 162.0], "text": null}, {"bbox": [0.0, 173.0, 391.0, 199.0], "text": null}, {"bbox": [999.0, 173.0, 1071.0, 198.0], "text": null}, {"bbox": [803.0, 174.0, 887.0, 198.0], "text": null}, {"bbox": [999.0, 209.0, 1071.0, 235.0], "text": null}, {"bbox": [2.0, 210.0, 377.0, 235.0], "text": null}, {"bbox": [803.0, 210.0, 888.0, 235.0], "text": null}, {"bbox": [3.0, 247.0, 239.0, 270.0], "text": null}, {"bbox": [769.0, 247.0, 922.0, 272.0], "text": null}, {"bbox": [999.0, 247.0, 1071.0, 271.0], "text": null}, {"bbox": [2.0, 283.0, 416.0, 308.0], "text": null}, {"bbox": [805.0, 284.0, 884.0, 308.0], "text": null}, {"bbox": [999.0, 284.0, 1071.0, 307.0], "text": null}, {"bbox": [3.0, 319.0, 452.0, 344.0], "text": null}, {"bbox": [999.0, 319.0, 1071.0, 344.0], "text": null}, {"bbox": [804.0, 321.0, 885.0, 346.0], "text": null}, {"bbox": [3.0, 356.0, 392.0, 381.0], "text": null}, {"bbox": [999.0, 357.0, 1071.0, 381.0], "text": null}, {"bbox": [805.0, 358.0, 884.0, 383.0], "text": null}, {"bbox": [2.0, 393.0, 428.0, 418.0], "text": null}, {"bbox": [999.0, 393.0, 1071.0, 418.0], "text": null}, {"bbox": [807.0, 395.0, 884.0, 418.0], "text": null}, {"bbox": [2.0, 430.0, 172.0, 455.0], "text": null}, {"bbox": [813.0, 430.0, 876.0, 454.0], "text": null}, {"bbox": [999.0, 430.0, 1071.0, 454.0], "text": null}, {"bbox": [3.0, 465.0, 206.0, 492.0], "text": null}, {"bbox": [1000.0, 465.0, 1072.0, 491.0], "text": null}, {"bbox": [815.0, 467.0, 877.0, 491.0], "text": null}, {"bbox": [0.0, 503.0, 156.0, 528.0], "text": null}, {"bbox": [815.0, 503.0, 875.0, 
527.0], "text": null}, {"bbox": [999.0, 503.0, 1071.0, 528.0], "text": null}, {"bbox": [0.0, 537.0, 388.0, 566.0], "text": null}, {"bbox": [1000.0, 538.0, 1072.0, 565.0], "text": null}, {"bbox": [812.0, 540.0, 877.0, 563.0], "text": null}, {"bbox": [0.0, 575.0, 288.0, 600.0], "text": null}, {"bbox": [812.0, 575.0, 876.0, 601.0], "text": null}, {"bbox": [999.0, 576.0, 1071.0, 601.0], "text": null}, {"bbox": [0.0, 613.0, 319.0, 638.0], "text": null}, {"bbox": [813.0, 613.0, 876.0, 638.0], "text": null}, {"bbox": [999.0, 613.0, 1071.0, 637.0], "text": null}, {"bbox": [999.0, 648.0, 1071.0, 674.0], "text": null}, {"bbox": [0.0, 649.0, 378.0, 674.0], "text": null}, {"bbox": [813.0, 650.0, 878.0, 674.0], "text": null}, {"bbox": [2.0, 686.0, 282.0, 712.0], "text": null}, {"bbox": [819.0, 686.0, 870.0, 711.0], "text": null}, {"bbox": [999.0, 686.0, 1071.0, 711.0], "text": null}, {"bbox": [4.0, 759.0, 321.0, 784.0], "text": null}, {"bbox": [999.0, 759.0, 1071.0, 782.0], "text": null}, {"bbox": [812.0, 760.0, 877.0, 783.0], "text": null}, {"bbox": [2.0, 795.0, 321.0, 820.0], "text": null}, {"bbox": [812.0, 796.0, 877.0, 820.0], "text": null}, {"bbox": [999.0, 796.0, 1071.0, 819.0], "text": null}, {"bbox": [3.0, 832.0, 318.0, 858.0], "text": null}, {"bbox": [812.0, 832.0, 878.0, 857.0], "text": null}, {"bbox": [999.0, 832.0, 1071.0, 858.0], "text": null}, {"bbox": [2.0, 869.0, 253.0, 895.0], "text": null}, {"bbox": [813.0, 869.0, 877.0, 894.0], "text": null}, {"bbox": [999.0, 869.0, 1071.0, 894.0], "text": null}, {"bbox": [999.0, 905.0, 1071.0, 930.0], "text": null}, {"bbox": [1.0, 906.0, 257.0, 931.0], "text": null}, {"bbox": [812.0, 907.0, 878.0, 930.0], "text": null}, {"bbox": [3.0, 942.0, 305.0, 967.0], "text": null}, {"bbox": [999.0, 942.0, 1071.0, 967.0], "text": null}, {"bbox": [814.0, 943.0, 877.0, 966.0], "text": null}, {"bbox": [2.0, 1016.0, 107.0, 1042.0], "text": null}, {"bbox": [999.0, 1016.0, 1071.0, 1040.0], "text": null}, {"bbox": [820.0, 1017.0, 872.0, 
1041.0], "text": null}, {"bbox": [177.0, 1089.0, 548.0, 1114.0], "text": null}, {"bbox": [820.0, 1124.0, 873.0, 1151.0], "text": null}, {"bbox": [0.0, 1125.0, 411.0, 1151.0], "text": null}, {"bbox": [999.0, 1126.0, 1071.0, 1150.0], "text": null}, {"bbox": [813.0, 1161.0, 878.0, 1185.0], "text": null}, {"bbox": [992.0, 1161.0, 1078.0, 1186.0], "text": null}, {"bbox": [0.0, 1162.0, 181.0, 1187.0], "text": null}, {"bbox": [0.0, 1198.0, 150.0, 1224.0], "text": null}, {"bbox": [813.0, 1198.0, 878.0, 1223.0], "text": null}, {"bbox": [999.0, 1198.0, 1071.0, 1221.0], "text": null}, {"bbox": [0.0, 1233.0, 292.0, 1259.0], "text": null}, {"bbox": [813.0, 1233.0, 878.0, 1259.0], "text": null}, {"bbox": [999.0, 1234.0, 1071.0, 1259.0], "text": null}, {"bbox": [1.0, 1307.0, 267.0, 1333.0], "text": null}, {"bbox": [999.0, 1307.0, 1072.0, 1331.0], "text": null}, {"bbox": [823.0, 1308.0, 869.0, 1334.0], "text": null}, {"bbox": [1.0, 1344.0, 255.0, 1370.0], "text": null}, {"bbox": [999.0, 1344.0, 1072.0, 1370.0], "text": null}, {"bbox": [1.0, 1380.0, 261.0, 1406.0], "text": null}, {"bbox": [992.0, 1381.0, 1079.0, 1406.0], "text": null}, {"bbox": [820.0, 1383.0, 871.0, 1407.0], "text": null}, {"bbox": [992.0, 1416.0, 1079.0, 1441.0], "text": null}, {"bbox": [1.0, 1417.0, 348.0, 1442.0], "text": null}, {"bbox": [820.0, 1417.0, 872.0, 1443.0], "text": null}], "rows": [{"bbox": [305.7919921875, 42.890625, 1096.5947265625, 68.625], "row_id": 0}, {"bbox": [217.302734375, 99.36328125, 499.572265625, 123.66796875], "row_id": 1}, {"bbox": [0.0, 136.53515625, 1061.87109375, 160.83984375], "row_id": 2}, {"bbox": [-0.56005859375, 171.5625, 1071.39208984375, 200.15625], "row_id": 3}, {"bbox": [0.56005859375, 208.01953125, 1070.27197265625, 235.18359375], "row_id": 4}, {"bbox": [2.80029296875, 246.62109375, 1070.27197265625, 270.92578125], "row_id": 5}, {"bbox": [1.1201171875, 282.36328125, 1069.7119140625, 306.66796875], "row_id": 6}, {"bbox": [0.56005859375, 318.10546875, 1070.27197265625, 
345.26953125], "row_id": 7}, {"bbox": [0.56005859375, 355.9921875, 1070.27197265625, 381.7265625], "row_id": 8}, {"bbox": [0.56005859375, 392.44921875, 1070.27197265625, 416.75390625], "row_id": 9}, {"bbox": [1.1201171875, 429.62109375, 1069.7119140625, 453.92578125], "row_id": 10}, {"bbox": [1.68017578125, 466.078125, 1071.39208984375, 491.8125], "row_id": 11}, {"bbox": [0.0, 502.53515625, 1070.83203125, 526.83984375], "row_id": 12}, {"bbox": [-0.56005859375, 536.84765625, 1071.39208984375, 564.01171875], "row_id": 13}, {"bbox": [0.0, 575.44921875, 1070.83203125, 599.75390625], "row_id": 14}, {"bbox": [0.0, 612.62109375, 1070.83203125, 636.92578125], "row_id": 15}, {"bbox": [0.0, 646.93359375, 1070.83203125, 674.09765625], "row_id": 16}, {"bbox": [1.1201171875, 686.25, 1069.7119140625, 711.984375], "row_id": 17}, {"bbox": [2.80029296875, 758.44921875, 1070.27197265625, 782.75390625], "row_id": 18}, {"bbox": [1.1201171875, 796.3359375, 1069.7119140625, 819.2109375], "row_id": 19}, {"bbox": [2.240234375, 832.078125, 1070.83203125, 857.8125], "row_id": 20}, {"bbox": [0.0, 869.25, 1070.83203125, 894.984375], "row_id": 21}, {"bbox": [0.56005859375, 904.27734375, 1070.27197265625, 931.44140625], "row_id": 22}, {"bbox": [1.1201171875, 941.44921875, 1069.7119140625, 965.75390625], "row_id": 23}, {"bbox": [1.1201171875, 1017.22265625, 1069.7119140625, 1041.52734375], "row_id": 24}, {"bbox": [175.29833984375, 1088.70703125, 550.53759765625, 1113.01171875], "row_id": 25}, {"bbox": [-0.56005859375, 1123.734375, 1071.39208984375, 1152.328125], "row_id": 26}, {"bbox": [-1.1201171875, 1158.046875, 1076.4326171875, 1186.640625], "row_id": 27}, {"bbox": [0.0, 1197.36328125, 1070.83203125, 1224.52734375], "row_id": 28}, {"bbox": [0.0, 1233.10546875, 1070.83203125, 1257.41015625], "row_id": 29}, {"bbox": [1.1201171875, 1303.875, 1071.9521484375, 1332.46875], "row_id": 30}, {"bbox": [0.0, 1344.62109375, 1070.83203125, 1368.92578125], "row_id": 31}, {"bbox": [1.1201171875, 
1373.9296875, 1078.6728515625, 1402.5234375], "row_id": 32}, {"bbox": [1.1201171875, 1413.9609375, 1078.6728515625, 1442.5546875], "row_id": 33}], "cols": [{"bbox": [-3.3603515625, 42.890625, 541.0166015625, 1441.125], "col_id": 0}, {"bbox": [802.56396484375, 43.60546875, 886.57275390625, 1440.41015625], "col_id": 1}, {"bbox": [999.14453125, 45.03515625, 1070.83203125, 1438.98046875], "col_id": 2}], "image_bbox": [0.0, 0.0, 1147.0, 1464.0], "page": 1, "table_idx": 0}]}
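As a quick diagnostic, the extent of the cell boxes can be compared with the image_bbox field in the same entry. This is only a sketch using the standard library: the function names `bbox_extent` and `summarize` are hypothetical, and the shortened `sample` string stands in for the full output above.

```python
import json
from typing import Dict, List, Optional


def bbox_extent(items: List[Dict]) -> Optional[List[float]]:
    """Smallest box enclosing every item's bbox, or None if there are none."""
    boxes = [item["bbox"] for item in items]
    if not boxes:
        return None
    return [
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    ]


def summarize(results_json: str) -> List[Dict]:
    """List image_bbox and the cell extent for every table in the output."""
    results = json.loads(results_json)
    summary = []
    for name, tables in results.items():
        for table in tables:
            summary.append({
                "name": name,
                "table_idx": table.get("table_idx"),
                "image_bbox": table.get("image_bbox"),
                "cell_extent": bbox_extent(table.get("cells", [])),
            })
    return summary


# Shortened stand-in for the output above: first and last cell only.
sample = (
    '{"doc": [{"cells": ['
    '{"bbox": [310.0, 42.0, 421.0, 68.0], "text": null}, '
    '{"bbox": [820.0, 1417.0, 872.0, 1443.0], "text": null}], '
    '"image_bbox": [0.0, 0.0, 1147.0, 1464.0], "page": 1, "table_idx": 0}]}'
)

for entry in summarize(sample):
    print(entry)
```

If the cell extent fills most of image_bbox while the table covers only part of the original picture, that would suggest the coordinates are relative to a smaller region than the full image. This is an inference to check, not a documented behavior.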