cs-chan / Total-Text-Dataset

Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.
BSD 3-Clause "New" or "Revised" License

How to parse the annotation file? #7

Closed: vinayakarannil closed this issue 5 years ago

vinayakarannil commented 5 years ago

Do you have a script for parsing the polygon ground truth files?

ckchng commented 5 years ago

No, we don't have a dedicated parsing script, but both formats (.txt and .mat) are documented under the Groundtruth folder. They should be fairly easy to parse into your desired format. Do let us know if you need further assistance.

vinayakarannil commented 5 years ago

Thank you for replying. I am struggling to parse the files because they are not in a standard format like JSON or XML. I think I will have to use some regex to parse them.

ckchng commented 5 years ago

If you are using Python or Matlab for your application, you can refer to the scripts below to see how we parse our annotations. Hope it helps!

Python - https://github.com/cs-chan/Total-Text-Dataset/tree/master/Evaluation_Protocol/Python_scripts
Matlab - https://github.com/cs-chan/Total-Text-Dataset/tree/master/Evaluation_Protocol
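
For the .txt files, the regex approach mentioned earlier in the thread could look roughly like the sketch below. It is not one of the repository's scripts. It assumes each line has the layout `x: [[...]], y: [[...]], ornt: [u'c'], transcriptions: [u'TEXT']`, which should be checked against the format description in the Groundtruth folder. The names `parse_gt_line` and `parse_gt_file` are hypothetical helpers introduced only for illustration.

```python
import re

# ASSUMED line layout (verify against the format notes in the Groundtruth folder):
#   x: [[115 503 494 115]], y: [[322 346 426 404]], ornt: [u'm'], transcriptions: [u'nauGHTY']
LINE_RE = re.compile(
    r"x:\s*\[\[(?P<x>[^\]]*)\]\],\s*"
    r"y:\s*\[\[(?P<y>[^\]]*)\]\],\s*"
    r"ornt:\s*\[(?P<ornt>[^\]]*)\],\s*"
    r"transcriptions:\s*\[(?P<text>.*)\]\s*$"
)


def _unquote(token):
    """Strip an optional Python 2 style u-prefix and one pair of surrounding quotes."""
    token = token.strip()
    if token[:2] in ("u'", 'u"'):
        token = token[1:]
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        token = token[1:-1]
    return token


def parse_gt_line(line):
    """Return a dict for one annotation line, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    xs = [int(v) for v in re.findall(r"-?\d+", match.group("x"))]
    ys = [int(v) for v in re.findall(r"-?\d+", match.group("y"))]
    if len(xs) != len(ys):
        raise ValueError("x and y coordinate counts differ: %r" % line)
    return {
        "points": list(zip(xs, ys)),  # polygon vertices as (x, y) pairs
        "orientation": _unquote(match.group("ornt")),
        "text": _unquote(match.group("text")),
    }


def parse_gt_file(path):
    """Parse every matching line of one ground truth .txt file."""
    with open(path, encoding="utf-8") as handle:
        records = (parse_gt_line(line) for line in handle)
        return [rec for rec in records if rec is not None]
```

Calling `parse_gt_file` on a ground truth .txt file would then give one dict per text instance, which can be dumped to JSON or converted to whatever structure your pipeline expects. Lines that do not match the assumed layout are skipped rather than guessed at.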

smartcatdog commented 4 years ago

Thanks for sharing. I would like to ask what the numbers in the .mat format file mean, for example in https://github.com/cs-chan/Total-Text-Dataset/blob/master/Evaluation_Protocol/Examples/Groundtruth/poly_gt_img1.mat