google-research-datasets / hiertext

The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
Creative Commons Attribution Share Alike 4.0 International

How to evaluate the line-based unified detector? #13

Closed sword-shadow closed 1 year ago

sword-shadow commented 1 year ago

Hi, thanks for your great work. The related paper, Towards End-to-End Unified Scene Text Detection and Layout Analysis, reports the evaluation of a line-based detector in Table 3. However, the code here can only handle detected word vertices at the word, line, and paragraph levels.
To evaluate a line-based detector, should we just set the 'vertices' of each 'word' to the detected line vertices in the output JSON file?

Jyouhou commented 1 year ago

As explained in the unified detector post, "line-based" means that the detector generates a set of masks, each corresponding to one text line. Taking the connected components of these masks, we get pseudo-words that are stored as polygons.
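
To make the connected-component step concrete, here is a minimal sketch in plain Python. It is not the unified detector's actual code. It assumes that each line mask is a small binary 2D list, uses 4-connectivity, and approximates each component by its axis-aligned bounding box rather than a traced contour. The helper names (`connected_components`, `component_to_polygon`, `line_mask_to_entry`) are hypothetical, and the `"text"` / `"words"` / `"vertices"` layout of the returned entry is an assumption about the prediction JSON, so check it against the repository's documented format.

```python
import json
from collections import deque


def connected_components(mask):
    """Return connected components of a binary mask (4-connectivity, BFS).

    `mask` is a 2D list of 0/1 values; each component is a list of (row, col).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def component_to_polygon(pixels):
    """Simplification: represent a component by its bounding box as [x, y] vertices."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


def line_mask_to_entry(mask):
    """Turn one line mask into a line entry holding pseudo-words (assumed layout)."""
    words = [
        {"vertices": component_to_polygon(pixels), "text": ""}
        for pixels in connected_components(mask)
    ]
    return {"text": "", "words": words}


# Toy line mask with two disconnected blobs, giving two pseudo-words in one line.
line_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
line_entry = line_mask_to_entry(line_mask)
print(json.dumps(line_entry))
```

In this toy case the single line mask yields two pseudo-words, one per blob. A real pipeline would more likely run a library connected-components or contour routine on full-resolution masks and keep the traced polygon instead of a bounding box.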