TongkunGuan / SIGA

[CVPR2023] Self-supervised Implicit Glyph Attention for Text Recognition
https://openaccess.thecvf.com/content/CVPR2023/papers/Guan_Self-Supervised_Implicit_Glyph_Attention_for_Text_Recognition_CVPR_2023_paper.pdf

Regarding Text Mask Generation #1

Closed · strmojo closed this issue 1 month ago

strmojo commented 1 year ago

Hello, thanks for your work. I thoroughly enjoyed reading the paper. I have a couple of questions regarding text mask generation.

  1. When training the segmentation network on the labels generated with k-means, did you apply image augmentations such as random transformations and color jittering? I have faced challenges running k-means on images with color jittering.
  2. I have also observed that, after k-means, the text pixels of some images are assigned to cluster 0 while those of other images are assigned to cluster 1, depending on the color of the text. Could this cause problems when training the segmentation model?
TongkunGuan commented 1 year ago

  1. Yes, we use such transformations.
  2. We noticed this situation, so we use the prior knowledge that the text lies in the center of the image to distinguish foreground from background. We will release the code soon.
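The center prior described above could be sketched as follows. This is a hypothetical illustration, not the authors' released code: a minimal NumPy-only 2-means over grayscale intensities, where the cluster that dominates the central region of the crop (names like `kmeans_text_mask` and the quarter-margin center window are assumptions) is taken as the text cluster, making the mask invariant to whether text is darker or lighter than the background.

```python
import numpy as np

def kmeans_text_mask(gray, iters=10):
    """Binarize a grayscale text crop with 2-means, then resolve which
    cluster is text via a center prior (text assumed near the crop center)."""
    pix = gray.astype(np.float64).ravel()
    # Initialize the two cluster centers at the intensity extremes.
    centers = np.array([pix.min(), pix.max()])
    for _ in range(iters):
        # Assign each pixel to the nearest center, then update centers.
        labels = np.abs(pix[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pix[labels == k].mean()
    labels = labels.reshape(gray.shape)
    # Center prior: the majority cluster inside the central window is text.
    h, w = gray.shape
    center = labels[h // 4 : h - h // 4, w // 4 : w - w // 4]
    text_cluster = 1 if center.mean() > 0.5 else 0
    return (labels == text_cluster).astype(np.uint8)
```

On a crop with dark text on a light background and on its inverted counterpart, the prior selects opposite clusters, so both produce the same foreground mask — which addresses the cluster-index ambiguity raised in the question.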