clovaai / CRAFT-pytorch

Official implementation of Character Region Awareness for Text Detection (CRAFT)
MIT License

about watershed labeling. #27

Open wulitaotao1 opened 4 years ago

wulitaotao1 commented 4 years ago
Hi, I found that it is not so easy to split the Gaussian distribution map.

Can you provide details of the watershed algorithm?
For example, which binarization method is used here?
How are the initial markers for the cv2.watershed() function in OpenCV created?
Or do you use a different function?
YoungminBaek commented 4 years ago

I just followed the instructions in the OpenCV documentation (https://docs.opencv.org/3.3.1/d3/db4/tutorial_py_watershed.html).

We used thresholding to obtain the binary maps for the three areas in that example: sure_fg, sure_bg, and unknown. Two thresholds separate those areas, and their values are 0.6 and 0.2, respectively. The result is not sensitive to these thresholds, since they only provide the initial guess for the watershed labeling. The initial markers are created by labeling the regions inside the sure foreground area.

In addition, we used the OpenCV watershed labeling function.

wulitaotao1 commented 4 years ago

Thanks for your reply.

I tried the method you described, but I found that the resulting character border does not cover the entire character area very well. It only detects the vicinity of the center point of the character. Does this have a great influence on model training?

In addition, in the ICDAR2015 dataset, the image size is generally 720 × 1280 × 3. In the image preprocessing, I first resize the image to 768 × 768 × 3 and then randomly crop it to 640 × 640 × 3. However, I found that some characters that were already small in the original image became even smaller after the scale change. The size of the gt is also 0.5 times the size of the training image, which makes the Gaussian distributions of many characters very crowded and strange. How do you pre-process the training images?

Some words in the ICDAR data set are marked as "###", so the character length cannot be obtained. According to the paper, the confidence level of this part should be 1. Does such a word have an adverse effect on the training of the model? How do you deal with such words during the training of CRAFT?

Thank you and look forward to your reply.

Godricly commented 4 years ago

@wulitaotao1 IMHO, '###' chars are ignored, so the score should be 0.
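
One way to read this suggestion, as a sketch only: build a pixel-wise confidence map that is 1 everywhere except inside "###" words, where it is 0, and weight the loss by it so those pixels contribute nothing. The annotation format (a transcription plus an axis-aligned box in map coordinates) and the helper names `confidence_mask` and `masked_mse` are assumptions made for this example, not something taken from the CRAFT code; real ICDAR2015 annotations are quadrilaterals.

```python
import numpy as np


def confidence_mask(shape, words):
    """Return a float32 map that is 0 inside '###' words and 1 elsewhere.

    words: list of (transcription, (x0, y0, x1, y1)) in map coordinates
    (hypothetical format, boxes are half-open: x0 <= x < x1).
    """
    mask = np.ones(shape, dtype=np.float32)
    for text, (x0, y0, x1, y1) in words:
        if text == "###":
            mask[y0:y1, x0:x1] = 0.0
    return mask


def masked_mse(pred, target, mask):
    """Mean squared error over the pixels that are not ignored."""
    weight = max(float(mask.sum()), 1.0)
    return float(np.sum(mask * (pred - target) ** 2) / weight)


words = [("HELLO", (2, 2, 10, 6)), ("###", (12, 2, 18, 6))]
mask = confidence_mask((8, 20), words)

# An error that lies only inside the ignored word does not affect the loss.
pred = np.zeros((8, 20), dtype=np.float32)
target = np.zeros((8, 20), dtype=np.float32)
target[2:6, 12:18] = 1.0
loss = masked_mse(pred, target, mask)
print("loss:", loss)
```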

wulitaotao1 commented 4 years ago

@Godricly Thank you so much. And how do you deal with small characters caused by scaling the original image? The Gaussian map looks very crowded, or maybe it won't have much impact on the task of detecting word-level text. I'm going to do some experiments.

wulitaotao1 commented 4 years ago

@YoungminBaek What is the input image of the watershed algorithm? The region score map, or a transformed image such as a gradient map?