Yuliang-Liu / Curve-Text-Detector

This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.

Annotating labels! Word-2-Word only? #52

Closed: innat closed this issue 4 years ago

innat commented 4 years ago

@Yuliang-Liu I'm doing annotation using these tools. I'm sorry, I didn't check the manual until you mentioned it. It's great. Thank you.

However, I have a very basic question about annotation. While annotating, should we annotate word by word, should we annotate phrase-wise, or is it something that depends on our end goal? To make my question clearer, here is an example:

Let's say there is a sentence: Made In Australia. While annotating, we can do either of the following:

  1. [Made] [In] [Australia] (three separate labels)
  2. [Made In Australia] (one label)

In case 1 we label word-wise, but in case 2 we label context-wise, right? So which one is more appropriate? My understanding is that the model's predictions will follow the way we label: if I choose case 1, the predictions will be word-level, and if I choose case 2, the predictions will be phrase-level.

Or should we go with the case 1 approach, with case 2 being simply a post-processing step? Is there such a post-processing method?

Or maybe I'm just overthinking, and it mainly depends on our end goal. If we want to recognize word-wise, we should go with case 1, but if our end goal is something like case 2, then we should label the text context-wise.

I hope you get my point. Thank you. :)

Yuliang-Liu commented 4 years ago

@innat Good to see you again,

It depends on your end goal. Word-level annotation may cause less confusion, while text-line-level annotation may be more intuitive and easier to annotate.

Specifically, to the best of our knowledge, there is no objective conclusion showing whether word-level or text-line-level annotation is better for a detector. Although word-level annotation can reduce the degree of curvature, it requires many precise detections to separate the words. For example, as shown in the figure below, word-level annotation requires smaller and more accurate detections to localize the words, even when the gaps between them are not obvious, whereas line-level annotation requires only a single detection, which is much easier in practice.

[image]
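To make the two granularities concrete, here is a minimal sketch of how word-level labels could be merged into one line-level label as a post-processing step. It assumes a hypothetical annotation format of `(transcription, polygon)` pairs and a hypothetical helper named `merge_words_to_line`; neither is the format or API of this repository's annotation tool. The merged region is just an axis-aligned bounding box, which is a rough stand-in for a curved text-line polygon.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical label format: (transcription, polygon as a list of (x, y) points).
WordLabel = Tuple[str, List[Point]]


def merge_words_to_line(words: List[WordLabel]) -> WordLabel:
    """Merge word-level labels, given in reading order, into one line-level label.

    The merged region is the axis-aligned bounding box of all word polygons,
    so it does not follow the curvature of the text.
    """
    if not words:
        raise ValueError("need at least one word label")
    # Join the transcriptions with single spaces.
    text = " ".join(t for t, _ in words)
    # Collect every x and y coordinate from every word polygon.
    xs = [x for _, poly in words for x, _ in poly]
    ys = [y for _, poly in words for _, y in poly]
    # Corners in clockwise order, starting from the top-left.
    box = [
        (min(xs), min(ys)),
        (max(xs), min(ys)),
        (max(xs), max(ys)),
        (min(xs), max(ys)),
    ]
    return text, box


# Case 1 from the question: three separate word-level labels.
word_level = [
    ("Made", [(10, 10), (60, 10), (60, 30), (10, 30)]),
    ("In", [(70, 10), (90, 10), (90, 30), (70, 30)]),
    ("Australia", [(100, 10), (200, 10), (200, 30), (100, 30)]),
]

# Case 2 from the question: a single line-level label.
line_level = merge_words_to_line(word_level)
```

Going from word-level to line-level this way is straightforward, while the reverse (splitting a line label into words) would need the word boundaries, which a line-level annotation does not record. That asymmetry is one practical reason the choice depends on the end goal.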

I hope this addresses your concerns.

innat commented 4 years ago

@Yuliang-Liu san, thank you. That clears up my concern. 💯