MhLiao / MaskTextSpotter

A PyTorch implementation of Mask TextSpotter
https://github.com/MhLiao/MaskTextSpotter

How to expand to Chinese? #3

Closed foocker closed 5 years ago

foocker commented 5 years ago

Is it a problem to expand the model to Chinese? (Would the mask head become too heavy to converge?)

MhLiao commented 5 years ago

We have conducted experiments on the MLT dataset (Section 4.6), which includes more than 7,000 character categories. You can reduce MASK_BATCH_SIZE_PER_IM if you run out of memory.
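
A minimal sketch of how such an override might look, assuming a maskrcnn-benchmark-style yacs config; the import path, config file name, and key path below are guesses, so check the repository's configs/ directory for the real ones:

```python
# Hedged sketch: lowering MASK_BATCH_SIZE_PER_IM to reduce GPU memory usage.
# The import path, config file name, and key path are assumptions based on a
# maskrcnn-benchmark-style setup, not taken from the repository.
from maskrcnn_benchmark.config import cfg  # assumed import path

cfg.merge_from_file("configs/finetune.yaml")  # hypothetical config file
# Halve the number of mask/recognition samples per image (the default may differ).
cfg.merge_from_list(["MODEL.ROI_MASK_HEAD.MASK_BATCH_SIZE_PER_IM", 32])
```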

foocker commented 5 years ago

> We have conducted experiments on the MLT dataset (Section 4.6), which includes more than 7,000 character categories. You can reduce MASK_BATCH_SIZE_PER_IM if you run out of memory.

I'm sorry, I have a few more questions:

  1. In engine/trainer.py, from torch.distributed import deprecated as dist implies PyTorch <= 1.1, but you suggest 1.2. Is something wrong?
  2. The MLT data has no character mask labels (or character boxes). The recognition branch in your model is the mask branch, and character segmentation needs character mask labels, so I am curious how you do sequence recognition for Chinese (English may come from fine-tuning).
  3. Will you release the standalone recognition model? Thanks for your reply.

foocker commented 5 years ago

Please ignore question 1.

MhLiao commented 5 years ago

@foocker 1) I have fixed the deprecated modules today. Please try the newest code. 2) It doesn't matter if there are no character-level annotations (you can just disable the character segmentation branch). Actually, we did not use character-level annotations for the MLT dataset, for a fair comparison with previous work. 3) I plan to release the standalone recognition model after the CVPR 2020 deadline.
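
For question 1, the change amounts to dropping the removed torch.distributed.deprecated module in favor of the current torch.distributed API. The snippet below only illustrates that kind of change; it is not the actual commit, and the helper name is made up:

```python
# Old import, which only works on older PyTorch versions:
# from torch.distributed import deprecated as dist

# Current API:
import torch.distributed as dist

def reduce_scalar_example(value):
    """Illustrative helper (not from the repo): average a tensor across processes."""
    if dist.is_available() and dist.is_initialized():
        dist.all_reduce(value, op=dist.ReduceOp.SUM)
        value = value / dist.get_world_size()
    return value
```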

WeihongM commented 5 years ago

@MhLiao Hello, thanks for your open-source code. You mentioned that you don't use character-level annotations for the MLT dataset; how do you train the character classification?

MhLiao commented 5 years ago

@WeihongM You have two options: 1) You can simply remove/disable the character classification if none of your training data has character-level annotations. (We chose this option for the experiments on the MLT dataset in the paper.) 2) You do not need to remove/disable the character classification if you have some synthetic data that contain character-level annotations. You can achieve this by mixing all the data at a specific ratio during training. For more details, you can refer to our TPAMI paper.
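
A rough sketch of what "mixing all the data at a specific ratio" could look like; this is not the repository's actual sampler, and all names below are illustrative:

```python
# Hedged sketch: sample from a character-annotated dataset (e.g. synthetic data)
# with probability `ratio`, otherwise from a word-level-only dataset (e.g. MLT).
import random
from torch.utils.data import Dataset

class MixedRatioDataset(Dataset):
    def __init__(self, with_chars, without_chars, ratio=0.5, length=10000):
        self.with_chars = with_chars        # dataset with character-level GT
        self.without_chars = without_chars  # dataset with word-level GT only
        self.ratio = ratio                  # fraction of samples drawn from with_chars
        self.length = length                # nominal epoch length

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        src = self.with_chars if random.random() < self.ratio else self.without_chars
        return src[random.randrange(len(src))]
```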

WeihongM commented 5 years ago

@MhLiao Hello, thanks for your reply. One small question: does each batch used to train the network come from the same dataset or not? I think that if data with and without character-level labels appear in the same batch, the gradient backward process may be more complicated.

MhLiao commented 5 years ago

@WeihongM We can use images from different datasets in a batch. The backward pass can be controlled through the loss function: we simply set the character segmentation loss to zero for instances that have no character-level annotations. The current code achieves this by using different labels or a flag to indicate which instances should be ignored.
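
A minimal sketch of the flag-based masking described above, assuming a per-instance boolean flag; this is not the repository's exact loss code:

```python
# Hedged sketch: zero out the character segmentation loss for instances
# that carry no character-level annotations.
import torch
import torch.nn.functional as F

def char_seg_loss(char_logits, char_targets, has_char_anno):
    """
    char_logits:   (N, C, H, W) predicted per-character score maps
    char_targets:  (N, H, W) integer character labels
    has_char_anno: (N,) bool, False for instances without character-level GT
    """
    per_instance = F.cross_entropy(
        char_logits, char_targets, reduction="none"
    ).mean(dim=(1, 2))                                   # (N,) loss per instance
    per_instance = per_instance * has_char_anno.float()  # ignore unlabeled instances
    denom = has_char_anno.float().sum().clamp(min=1.0)   # avoid division by zero
    return per_instance.sum() / denom
```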

WeihongM commented 5 years ago

@MhLiao Hello, I downloaded the images and GTs from the official ICDAR 2013 website. It seems the character GTs are provided under the segmentation challenge. However, after downloading the images from that challenge, the image indexes in the segmentation challenge and those in the end-to-end challenge do not match. Can you share how you matched these file indexes, or send me the mapping between the image names? Thanks for your help.

MhLiao commented 5 years ago

@WeihongM I have checked it and found that the image indexes of the two tasks do match. Here are the converted GTs.

rkshuai commented 5 years ago

> @foocker 1) I have fixed the deprecated modules today. Please try the newest code. 2) It doesn't matter if there are no character-level annotations (you can just disable the character segmentation branch). Actually, we did not use character-level annotations for the MLT dataset, for a fair comparison with previous work. 3) I plan to release the standalone recognition model after the CVPR 2020 deadline.

For a dataset like MLT without character-level annotations, won't inaccurate detection in the early stage of training cause recognition problems, as in the FOTS algorithm? Is the MLT training fine-tuned from a pretrained model?