Closed Aakashagr closed 6 years ago
Hi @aakashiisc, it has been a while but I seem to recall that binary cross-entropy was thought to be more numerically stable than softmax in this case.
There was no specific reason to change it, but I found that binary CE seemed to work better on the examples.
As for the replication of matconvnet, I do not know if any such code exists. In any case, the best path forward would be to train in keras. This has always been on my list of things to do, but I have never gotten back to it.
If you do end up training in keras, could you push a PR to this so I could update the repo?
Hi, thank you for the prompt reply. I will try binary CE as well and check my results. I am a novice with deep networks and am trying to learn by replicating a few results. I trained the network as described in the paper, but the performance was poor (I am not sure whether I made a mistake).
Since I have been working only in MATLAB (for my other projects), I am not very familiar with python/keras. But I will definitely let you know if I am able to successfully train this network in keras.
I just did a quick search out of curiosity, @aakashiisc, and I think all of the Matlab code you need is located at: http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models
There are two papers mentioned there, but the NIPS DLW 2014 models may be a very good place to start.
Good luck!
Thanks, I checked that page earlier. The NIPS DLW 2014 models contain only a pre-trained model (I was able to replicate their accuracy results using it), but they do not include the code to train the network from scratch.
Even the code section under publication "Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition" reloads the same page.
Hi,
In your code, binary cross-entropy is used as the loss function to train the charnet model. However, in the original paper, the authors trained a separate fully connected layer for each position using softmax regression (multinomial logistic regression after a softmax activation function) as the loss function. Is there any advantage to the current implementation?
Also, I am trying to replicate their results using matconvnet but could not find the original code to train the network from scratch. Please let me know if you had come across any.
Thanks.
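For anyone comparing the two losses discussed in this thread, here is a minimal NumPy sketch of the difference. It is not the code from this repo or from the paper: the function names and the tiny shapes (3 positions, 5 classes) are hypothetical, chosen only for illustration. It assumes the network emits one logit per (position, class) pair and that each position has exactly one correct class.

```python
import numpy as np

def softmax_ce_per_position(logits, targets):
    """Softmax cross-entropy, one softmax per position.

    logits:  (positions, classes) raw scores (hypothetical layout)
    targets: (positions,) integer class id for each position
    """
    # Subtract the row max before exponentiating to avoid overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class at each position.
    return -log_probs[np.arange(len(targets)), targets].mean()

def binary_ce(logits, targets):
    """Binary cross-entropy: every (position, class) output is treated
    as an independent sigmoid unit against a one-hot target."""
    one_hot = np.eye(logits.shape[1])[targets]
    # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(x):
    #   max(x, 0) - x*y + log(1 + exp(-|x|))
    per_unit = (np.maximum(logits, 0) - logits * one_hot
                + np.log1p(np.exp(-np.abs(logits))))
    return per_unit.mean()

# Toy example: 3 positions, 5 classes, all-zero logits.
logits = np.zeros((3, 5))
targets = np.array([0, 1, 2])
print(softmax_ce_per_position(logits, targets))  # log(5) for uniform predictions
print(binary_ce(logits, targets))                # log(2) for sigmoid(0) = 0.5
```

The practical difference is that the softmax version makes the classes at a given position compete (the probabilities sum to 1), while the binary version scores each class independently. Which one trains better on this model is an empirical question that this sketch does not answer.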