MhLiao / TextBoxes

TextBoxes: A Fast Text Detector with a Single Deep Neural Network
https://github.com/MhLiao/TextBoxes

Some questions about learning process #84

Open seovchinnikov opened 6 years ago

seovchinnikov commented 6 years ago

Hey! Thank you for your paper. I would like to ask a couple of questions:

  1. What is the magnitude of the train/val loss at the end of training on ICDAR13, when an F-score of ~0.8 is reached?
  2. Does it matter how the text class is named if it's the only one in the XMLs?
  3. How can I debug Caffe's image generator? I mean, to look at the warped images, for example.

Thank you in advance

MhLiao commented 6 years ago

  1. You can watch detection_eval to pick a good model.
  2. It should match the class in your "labelmap" file.
  3. I am sorry, I have no idea.
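
As a rough illustration of point 2, the check below compares the class names used in an annotation against the names declared in a labelmap. It is only a sketch under assumptions: it assumes VOC-style XML with `<object><name>` entries and a prototxt-style labelmap with `name: "..."` fields. The helper names and the sample data are hypothetical and are not part of TextBoxes.

```python
import re
import xml.etree.ElementTree as ET


def labelmap_names(labelmap_text):
    """Return the set of `name` values found in a labelmap prototxt string."""
    # Anchoring on line start skips `display_name: "..."` entries.
    return set(re.findall(r'^\s*name:\s*"([^"]*)"', labelmap_text, flags=re.MULTILINE))


def xml_class_names(xml_text):
    """Return the set of <object><name> values in one VOC-style annotation."""
    root = ET.fromstring(xml_text)
    return {obj.findtext("name", default="").strip() for obj in root.iter("object")}


def unknown_classes(labelmap_text, xml_text):
    """Class names used in the annotation but missing from the labelmap."""
    return xml_class_names(xml_text) - labelmap_names(labelmap_text)


# Hypothetical sample data, for illustration only.
LABELMAP = '''
item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "text"
  label: 1
  display_name: "text"
}
'''

ANNOTATION = '''
<annotation>
  <object><name>text</name></object>
  <object><name>Text</name></object>
</annotation>
'''

print(unknown_classes(LABELMAP, ANNOTATION))  # {'Text'}
```

The comparison is case-sensitive on purpose, so a mismatch such as `Text` vs `text` is reported rather than silently accepted.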

seovchinnikov commented 6 years ago

Thank you!

  1. I'm trying to fine-tune on my small dataset (~1000 pictures), but I got detection_eval 0.11 after 500 iterations (iter_size=8, batch=4), and it does not improve after 1000. It seems something is wrong with my data or with the augmentation for this data. I have vertical images about 128 px wide and 350 ± 50 px high, with text lines aligned in the center of the image. It seems I need to play around with batch_sampler. I think this one is more appropriate for my case:

    'sampler': {
        'min_scale': 0.6,
        'max_scale': 1.0,
        'min_aspect_ratio': 0.2,
        'max_aspect_ratio': 0.7,
    },
  2. By the way, does it work with negative images that contain no text? I put some in my dataset as well. Is that OK?
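
To reason about the sampler values in point 1, here is a minimal Python sketch of how such a crop might be drawn. It rests on assumptions about the SSD-style sampler that are not confirmed in this thread: that the crop is expressed in normalized image coordinates, that width = scale * sqrt(aspect_ratio) and height = scale / sqrt(aspect_ratio), and that the aspect ratio is clamped so the crop fits inside the image. The function names are hypothetical.

```python
import math
import random


def sample_crop(min_scale, max_scale, min_ar, max_ar, rng=random):
    """Draw one normalized crop size (width, height).

    Assumption: the aspect ratio is sampled first and then clamped to
    [scale**2, 1 / scale**2] so that neither side exceeds the image.
    """
    scale = rng.uniform(min_scale, max_scale)
    ar = rng.uniform(min_ar, max_ar)
    ar = max(ar, scale ** 2)
    ar = min(ar, 1.0 / scale ** 2)
    return scale * math.sqrt(ar), scale / math.sqrt(ar)


def pixel_aspect(norm_w, norm_h, img_w, img_h):
    """Width/height ratio of the crop in pixels rather than normalized units."""
    return (norm_w * img_w) / (norm_h * img_h)


# Values from the post: sampler ranges and a roughly 128 x 350 px image.
rng = random.Random(0)
for _ in range(5):
    w, h = sample_crop(0.6, 1.0, 0.2, 0.7, rng)
    print(f"normalized {w:.2f} x {h:.2f}, "
          f"pixel aspect {pixel_aspect(w, h, 128, 350):.2f}")
```

If these assumptions hold, a normalized aspect ratio below 1 on an image that is already tall and narrow gives crops that are even narrower in pixels, which may be worth checking against where the text lines sit.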