google-research-datasets / vrdu

We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datasets that represent several challenges: rich schema including diverse data types, complex templates, and diversity of layouts within a single document type.

explain the format of the annotations #2

Open XingWang1234 opened 1 year ago

XingWang1234 commented 1 year ago

"annotations": [["registration_num", [["3712\n", [0, 0.46376812, 0.32893434, 0.5, 0.3447707], [[2380, 2385]]]]] can you clarify the format of the above annotations? I can confirm that: "3712\n" is the value of registration_num. [0.46376812, 0.32893434, 0.5, 0.3447707] corresponds to [x_min, y_min, x_max, y_max] what do the numbers in bold mean? 0 before 0.46376812? and [2380, 2385]? Thank you so much.