TongkunGuan / SIGA

[CVPR2023] Self-supervised Implicit Glyph Attention for Text Recognition
https://openaccess.thecvf.com/content/CVPR2023/papers/Guan_Self-Supervised_Implicit_Glyph_Attention_for_Text_Recognition_CVPR_2023_paper.pdf

Regarding Text Mask Generation #1

Closed · strmojo closed this issue 1 month ago

strmojo commented 1 year ago

Hello, thanks for your work. I thoroughly enjoyed reading the paper. I have a couple of questions regarding text mask generation.

  1. When training the segmentation network on the labels generated with k-means, did you apply image augmentations such as random transformations and color jittering? I have faced challenges running k-means on images with color jittering.
  2. I have also observed that, after k-means, the text pixels of some images are assigned to cluster 0 while those of other images are assigned to cluster 1, depending on the color of the text. Could this cause problems when training the segmentation model?
TongkunGuan commented 1 year ago

  1. Yes, we use such transformations.
  2. We noticed this situation, so we use the prior knowledge that the text lies in the center of the image to distinguish foreground from background. We will release the code soon.
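The center prior described above could be sketched as follows. This is a hypothetical illustration, not the authors' released code: a minimal NumPy-only 2-means over grayscale intensities, where the cluster that dominates the central region of the crop (names like `kmeans_text_mask` and the quarter-margin center window are assumptions) is taken as the text cluster, making the mask invariant to whether text is darker or lighter than the background.

```python
import numpy as np

def kmeans_text_mask(gray, iters=10):
    """Binarize a grayscale text crop with 2-means, then resolve which
    cluster is text via a center prior (text assumed near the crop center)."""
    pix = gray.astype(np.float64).ravel()
    # Initialize the two cluster centers at the intensity extremes.
    centers = np.array([pix.min(), pix.max()])
    for _ in range(iters):
        # Assign each pixel to the nearest center, then update centers.
        labels = np.abs(pix[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pix[labels == k].mean()
    labels = labels.reshape(gray.shape)
    # Center prior: the majority cluster inside the central window is text.
    h, w = gray.shape
    center = labels[h // 4 : h - h // 4, w // 4 : w - w // 4]
    text_cluster = 1 if center.mean() > 0.5 else 0
    return (labels == text_cluster).astype(np.uint8)
```

On a crop with dark text on a light background and on its inverted counterpart, the prior selects opposite clusters, so both produce the same foreground mask — which addresses the cluster-index ambiguity raised in the question.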