mathDR / reading-text-in-the-wild

A Keras/Theano implementation of "Reading Text in the Wild with Convolutional Neural Networks" by M. Jaderberg et al.
GNU General Public License v3.0

poor performance and code error #7

Open yang53 opened 7 years ago

yang53 commented 7 years ago

In use_dictnet.py:
line 21: first error. I think it should be open('dict2_architecture.json')
line 26: second error. I think it should be self.model.load_weights('dict2_weights.h5')
line 51: third error. I think it should be z = self.model.predict_classes

In use_charnet.py:
line 59: first error. I think it should be filename = '../IMAGES/Chevron.jpg'

When I run use_dictnet.py and use_charnet.py, both give the right result for Chevron.jpg, but both are wrong for CondoleezzaRice.jpg and CMA_CGM.jpg. Why is the performance so poor? Have I done something wrong?

mathDR commented 7 years ago

Thanks @yang53, I merged the bug fixes you describe. As for the poor performance, that is an issue I am still working on. It basically comes down to how the image is preprocessed.

Note that the original Jaderberg network was trained on images preprocessed in MATLAB. I tried to replicate this preprocessing in numpy/scipy/scikit-image and had it working on Ubuntu Linux, but I am struggling to replicate it on Mac OS X.

Question: what operating system are you using?

yang53 commented 7 years ago

Thanks Dan, I use Ubuntu Linux too. I ran the Jaderberg network in MATLAB and got the following results (your code's result is in parentheses):

image 1: ./Image/CMA_CGM.jpg
Detection with CHAR method 0.06s
Predicted text: cmacgm (coacgoe)
Detection with DICT method 0.12s
Predicted text: canoeing (cambering)
Detection with ngram method 0.05s

image 2: ./Image/Chevron.jpg
Detection with CHAR method 0.04s
Predicted text: chevron (chevron)
Detection with DICT method 0.10s
Predicted text: chevron (chevron)
Detection with ngram method 0.05s

image 3: ./Image/CondoleezzaRice.jpg
Detection with CHAR method 0.04s
Predicted text: condlleeraacee (condeeeeaaaie)
Detection with DICT method 0.09s
Predicted text: nonobservance (nonobservance)
Detection with ngram method 0.06s

image 4: ./Image/intro.jpg
Detection with CHAR method 0.06s
Predicted text: introducing (introducing)
Detection with DICT method 0.10s
Predicted text: introducing (introducing)
Detection with ngram method 0.05s

So I think the Jaderberg MATLAB network is only a little better than yours. Maybe the poor results come from the model itself.

mathDR commented 7 years ago

If you have MATLAB available, check the equivalence of the preprocessed images (prior to inserting into the convnet). My suspicion is that when the two models differ in output, these preprocessed images will be different.
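
One way to run that check, as a rough sketch only: dump the preprocessed array from MATLAB, load it in Python, and compare it numerically with the array the Python pipeline produces. The helper name compare_preprocessed, the tolerance, and the stand-in array shape are assumptions made for illustration and are not part of this repo. Loading the MATLAB dump (for example with scipy.io.loadmat) is left as a comment so the snippet needs only numpy.

```python
import numpy as np


def compare_preprocessed(matlab_img, python_img, atol=1e-4):
    """Report how closely two preprocessed images agree.

    Hypothetical helper: both inputs are array-likes holding the image
    exactly as it would be fed to the network.
    """
    a = np.asarray(matlab_img, dtype=np.float64)
    b = np.asarray(python_img, dtype=np.float64)
    if a.shape != b.shape:
        # A shape mismatch often points to a transpose or a different resize.
        return {"same_shape": False, "shapes": (a.shape, b.shape)}
    diff = np.abs(a - b)
    return {
        "same_shape": True,
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "allclose": bool(np.allclose(a, b, atol=atol)),
    }


if __name__ == "__main__":
    # Stand-in data with an arbitrary shape. In practice the first array
    # would come from a MATLAB dump (e.g. read with scipy.io.loadmat from a
    # file you saved yourself) and the second from the Python preprocessing.
    rng = np.random.default_rng(0)
    reference = rng.standard_normal((32, 100))
    print(compare_preprocessed(reference, reference + 1e-6))
    print(compare_preprocessed(reference, reference.T))
```

If max_abs_diff is large only for the images where the two models disagree, that would support the preprocessing explanation. If the arrays match and the outputs still differ, the weights or the model conversion would be the next thing to look at.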