HsiehYiChia / Scene-text-recognition

Scene text detection and recognition based on Extremal Region(ER)
MIT License
163 stars 55 forks source link

The steps of model training #6

Closed 09876qwrte closed 4 years ago

09876qwrte commented 6 years ago

Hi, Thank you for your work , Can you tell me The steps of model training more in detail , which function will be used, Thank you

09876qwrte commented 6 years ago

I only want to train the part of detection , without the OCR recognition

HsiehYiChia commented 6 years ago

You need to transform your data to LBP by get_lbp_data(). After that, feed the LBP data to train_cascade(). I'll fix this to make it easier in someday :P

09876qwrte commented 6 years ago

So,there only use get_lbp_data() and train_cascade() in the training? I have train the model by your data yesterday for 16 hours, but there have nothing result and log, Is this normal?

HsiehYiChia commented 6 years ago

Yes, for the training of detection classifier, it only involves

I can't recall the detail of them right now, but both functions should finish within few minutes. The code isn't too long to trace, so you can trace it and see how it works

09876qwrte commented 6 years ago

I found opencv_train() in your code , could this function take place of train_cascade(). and have the same result?

HsiehYiChia commented 6 years ago

This function will train Adaboost classifier for OpenCV's build-in Machine learning module, and that won't be compatible with my Adaboost. I add this function just for comparison between my Adaboost and OpenCV's Adaboost.

09876qwrte commented 6 years ago

Hi , I want to get the Binary MASK for every ER*, Can this be implemented in your code?

HsiehYiChia commented 6 years ago

That is feasible, I had already try that. However, both time and space complexity will increase, especially space complexity because there are enormous amount of ER in an image.

09876qwrte commented 6 years ago

I want to put the Binary MASK to CNN, so could you tell where you have realize it in your code,Thank you

HsiehYiChia commented 6 years ago

There are 2 way you can use:

  1. Use a link list of pixels for each ER. Whenever a pixel is accumulate to this ER, append the pixel to the link list. It is implemented in er_accumulate or er_tree_extract. You can utilize int pixel and struct plist of struct ER. A brief intro. of int pixel is it stands for the position of a pixel:
    _pixel%image_width = x ; pixel/image_width = y_
  2. Another work around is to binarize the ER since we keep tracking the level of every ER. You can utilize int level of struct ER. Keep in mind that this method may not give you exact pixel mask because the bounding box of ER could contain something which is not a text.
09876qwrte commented 6 years ago

Thank you very much, And I want to continue training on the basis of the last training result, so I would like to ask if I can use the last training result to initialize the current training model, If this way is ok, what should to change

HsiehYiChia commented 6 years ago

I am sorry, I don't get it. What do you mean "training on the basis of the last training result"?

09876qwrte commented 6 years ago

the last training result is the "strong.classifier" and "weak.classifier", The "training on the basis of the last training result" is like the finetune in the deep learning