Closed chunhui999 closed 6 years ago
@chunhui999 thanks for your interest. To write the train.pt you need to prepare the following data: (1) a text/non-text region map, and (2) for every point in the text region, the distances from that point to the four edges of the text box, plus an extra inclination angle. So there are 6 maps in total. Both (1) and (2) are data for text detection. You also need the corresponding ground truth for recognition. After preparing the data, you will need to use the bilinear point layer to get the features at the sampling points. The reason we wrote two files is the LSTM implementation in Caffe, so you can use the traditional LSTM instead.
As for the second question: to fix the detection part, you can set the detection loss to zero.
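For readers following along, here is a minimal sketch of how the 6 detection maps described above could be built with NumPy. It is not the author's code. It assumes each text box is given as a rotated rectangle `(cx, cy, w, h, theta)`, and the function name `make_detection_maps`, the channel order, and the use of radians are all illustrative assumptions.

```python
import numpy as np


def make_detection_maps(height, width, boxes):
    """Build 6 per-pixel maps for text detection (hypothetical helper).

    boxes: iterable of (cx, cy, w, h, theta) rotated rectangles in pixel
    units, theta in radians. Channel order (an assumption):
    0 = text/non-text, 1 = top, 2 = right, 3 = bottom, 4 = left, 5 = angle.
    """
    maps = np.zeros((6, height, width), dtype=np.float32)
    # Pixel coordinate grids: ys[i, j] = i, xs[i, j] = j.
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float32)

    for cx, cy, w, h, theta in boxes:
        dx, dy = xs - cx, ys - cy
        cos_t, sin_t = np.cos(theta), np.sin(theta)
        # Project each pixel onto the box's own width (u) and height (v) axes.
        u = dx * cos_t + dy * sin_t
        v = -dx * sin_t + dy * cos_t
        inside = (np.abs(u) <= w / 2) & (np.abs(v) <= h / 2)

        maps[0][inside] = 1.0                   # text / non-text
        maps[1][inside] = (v + h / 2)[inside]   # distance to top edge
        maps[2][inside] = (w / 2 - u)[inside]   # distance to right edge
        maps[3][inside] = (h / 2 - v)[inside]   # distance to bottom edge
        maps[4][inside] = (u + w / 2)[inside]   # distance to left edge
        maps[5][inside] = theta                 # inclination angle
    return maps
```

For example, `make_detection_maps(10, 20, [(10, 5, 8, 4, 0.0)])` returns an array of shape `(6, 10, 20)`. Overlapping boxes simply overwrite each other here; a real pipeline would need a rule for that case.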
@tonghe90 Thanks for your reply. What I want to do now is retrain the network with the VGG800K dataset, which means repeating your previous work as a learning process. Once I understand the original training process, I might use it to train on my own data. So would you be willing to share the training code, such as train_val.pt, solver.pt, etc., or give a more detailed tutorial for people interested in learning "scene text spotting"?
@chunhui999 I will release the training.pt, but I am busy with other stuff right now, so it may take a few days.
@tonghe90 Ok, thank you very much.
@tonghe90 I would like to ask you some questions about training: 1) How do I build the train_val.prototxt file for training from the two test prototxt files you have provided, test_iou.pt and test_lstm.pt? I am sorry, but I have not used this kind of branched network before. 2) In the paper, you mention three steps of training. I want to know how to control whether the detection branch is fixed or trainable. Because I am a newbie, I hope you can give me some guidance. Of course, the more detailed the better. Thank you very much.