GlassyWing / text-detection-ocr

Chinese text detection and recognition based on CTPN + DenseNet, using Keras and TensorFlow
Apache License 2.0
285 stars, 116 forks

training error #1

Closed (white2018 closed 5 years ago)

white2018 commented 5 years ago

Hi @GlassyWing

Thanks for your excellent work!

I tried to train the model and got the following error:

ValueError: Error when checking target: expected rpn_class to have shape (None, None, 2) but got array with shape (1, 1, 7130)

What can I do to proceed with training?

Thanks

GlassyWing commented 5 years ago

I'm not sure; it works fine on my machine. Most likely you are using a different version of TensorFlow. The version on my machine is 1.9.0.

white2018 commented 5 years ago

> I'm not sure; it works fine on my machine. Most likely you are using a different version of TensorFlow. The version on my machine is 1.9.0.

The version of TensorFlow on my machine is '1.0.1'. However, I replaced VOC2007 with VOC2012, since there is no VOC2007 in my environment. When I debug the code, it goes into the function 'cal_rpn' in utils.py. The shapes of the labels returned by cal_rpn look something like this: Epoch 1/1 (6510,) (6200,). Is that correct?

GlassyWing commented 5 years ago

Yes, it is, since the number of gt boxes differs from image to image because the picture sizes are not the same. Note also that the VOC2007 dataset used here contains only text and its gt boxes; it is different from the VOC dataset published at http://host.robots.ox.ac.uk/pascal/VOC. You can download the dataset from https://drive.google.com/drive/folders/0B_WmJoEtfQhDRl82b1dJTjB2ZGc

white2018 commented 5 years ago

Thanks for your prompt reply and the VOC2007 link.

I will download it and try again.

white2018 commented 5 years ago

> Yes, it is, since the number of gt boxes differs from image to image because the picture sizes are not the same. Note also that the VOC2007 dataset used here contains only text and its gt boxes; it is different from the VOC dataset published at http://host.robots.ox.ac.uk/pascal/VOC. You can download the dataset from https://drive.google.com/drive/folders/0B_WmJoEtfQhDRl82b1dJTjB2ZGc

In core.py, for example, the shape of cls is supposed to be (None, 2), with a last dimension of 2:

    cls = Lambda(_reshape3, output_shape=(None, 2), name='rpn_class')(cls)
    cls_prod = Activation('softmax', name='rpn_cls_softmax')(cls)

    regr = Lambda(_reshape3, output_shape=(None, 2), name='rpn_regress')(regr)

    predict_model = Model(input, [cls, regr, cls_prod])

    train_model = Model(input, [cls, regr])

However, in data_loader.py, the last dimension of cls will generally not match what the model expects:

        [cls, regr], _ = cal_rpn((h, w), (int(h / 16), int(w / 16)), 16, gtbox)
        # zero-center by mean pixel
        m_img = img - IMAGE_MEAN
        m_img = np.expand_dims(m_img, axis=0)

        regr = np.hstack([cls.reshape(cls.shape[0], 1), regr])

        #
        cls = np.expand_dims(cls, axis=0)
        cls = np.expand_dims(cls, axis=1)

Shouldn't these two shapes match?
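The mismatch described here can be reproduced with NumPy alone. The following is a minimal sketch, not code from the repository: `build_cls_target` is a hypothetical helper and the labels are dummy zeros, but the two `np.expand_dims` calls are the ones shown in the data_loader.py excerpt.

```python
import numpy as np


def build_cls_target(num_anchors):
    """Hypothetical helper: mimic the reshaping applied to cls in data_loader.py."""
    # cal_rpn is reported in this thread to return a flat label vector, e.g. (6200,).
    cls = np.zeros((num_anchors,), dtype="float32")  # dummy labels
    cls = np.expand_dims(cls, axis=0)  # (num_anchors,) -> (1, num_anchors)
    cls = np.expand_dims(cls, axis=1)  # (1, num_anchors) -> (1, 1, num_anchors)
    return cls


target = build_cls_target(6200)
print(target.shape)  # (1, 1, 6200)
```

The resulting target shape, (1, 1, 6200), is the one named in the ValueError, while the rpn_class output is declared with a last dimension of 2.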

GlassyWing commented 5 years ago

This is normal. You will understand it if you look at the loss function "_rpn_loss_cls" in core.py. The model generates h x w x 10 prediction boxes, each of which has two probabilities indicating the presence or absence of text.
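The "h x w x 10" count explains why the label length changes from image to image. The sketch below assumes a stride of 16 (taken from the `int(h / 16)` call in data_loader.py) and 10 anchors per feature-map cell (taken from the comment above); `count_anchors` and the image sizes are hypothetical and chosen only for illustration.

```python
def count_anchors(img_h, img_w, stride=16, anchors_per_cell=10):
    """Hypothetical helper: number of prediction boxes for one image."""
    feat_h = int(img_h / stride)  # feature-map height, as in data_loader.py
    feat_w = int(img_w / stride)  # feature-map width
    return feat_h * feat_w * anchors_per_cell


# Illustrative image sizes (not taken from the thread).
print(count_anchors(333, 500))  # 20 * 31 * 10 = 6200
print(count_anchors(375, 500))  # 23 * 31 * 10 = 7130
```

Under these assumptions, a 333 x 500 image yields a label vector of length 6200 and a 375 x 500 image yields 7130, the same lengths that appear in the shapes reported earlier in the thread.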

white2018 commented 5 years ago

In my case the error happens while compiling the model, not while computing the loss function. Here is the traceback:

    File "/usr/local/lib/python2.7/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
      return func(*args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 2096, in fit_generator
      class_weight=class_weight)
    File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1808, in train_on_batch
      check_batch_axis=True)
    ValueError: Error when checking target: expected rpn_class to have shape (None, None, 2) but got array with shape (1, 1, 6200)

Maybe a higher version of Keras or TensorFlow supports such a statement?

My machine reports the following versions:

    keras.__version__ '2.1.0'
    tensorflow.__version__ '1.0.1'

GlassyWing commented 5 years ago

It's not Keras or TensorFlow, but Python. This is my environment: Python 3.6, TensorFlow 1.9.0, Keras 2.2.4.

I also recommend using Anaconda, a Python environment management tool, to create a virtual environment. https://www.anaconda.com/
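To compare two environments as in this exchange, the three versions can be collected in one place. This is a small sketch; `report_versions` is a hypothetical helper, not part of the repository, and it only relies on the standard `__version__` attribute of each package.

```python
import importlib
import sys


def report_versions(module_names=("tensorflow", "keras")):
    """Hypothetical helper: collect the Python version and package versions."""
    versions = {"python": "%d.%d" % sys.version_info[:2]}
    for name in module_names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # A missing or broken installation is reported instead of raising.
            versions[name] = "not installed"
    return versions


for name, version in report_versions().items():
    print("%s: %s" % (name, version))
```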

white2018 commented 5 years ago

Thanks a lot!

It is time-consuming to set up TensorFlow on my machine (PowerPC).

My solution is as follows:

1. Concatenate cls with regr as the output in the training-model definition in core.py:

    output = concatenate([cls, regr], axis=-1)

2. Use np.concatenate in data_loader.py to yield the corresponding training data:

    y_batch = np.concatenate((cls, cls, regr), axis=-1)

3. Wrap the two loss functions in a single one:

    def compute_loss(self, y_true, y_pred):
        cls_true, regr_true = y_true[:, :, :1], y_true[:, :, 1:]
        cls_pred, regr_pred = y_pred[:, :, :2], y_pred[:, :, 2:]
        return _rpn_loss_cls(cls_true, cls_pred) + _rpn_loss_regr(regr_true, regr_pred)

With these changes, the problem above appears to be avoided.
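The packing and slicing in this workaround can be checked with NumPy. The sketch below is an illustration under assumptions the thread does not state: cls is taken to have shape (1, N, 1) and regr shape (1, N, 2) at the point of concatenation, N is a small made-up anchor count, and `pack_targets` and `split_targets` are hypothetical names.

```python
import numpy as np

N = 6  # hypothetical number of anchors, kept small for readability


def pack_targets(cls, regr):
    """Pack both targets into one array, as in step 2 of the workaround."""
    return np.concatenate((cls, cls, regr), axis=-1)


def split_targets(y_true):
    """Split the packed array with the same slices used in compute_loss."""
    return y_true[:, :, :1], y_true[:, :, 1:]


# Assumed shapes: cls is (1, N, 1), regr is (1, N, 2).
cls = np.random.randint(0, 2, size=(1, N, 1)).astype("float32")
regr = np.random.rand(1, N, 2).astype("float32")

y_batch = pack_targets(cls, regr)
cls_true, regr_true = split_targets(y_batch)

print(y_batch.shape)    # (1, 6, 4)
print(cls_true.shape)   # (1, 6, 1)
print(regr_true.shape)  # (1, 6, 3)
```

Under these assumptions the packed target has a last dimension of 4, the same as a prediction built from two class scores and two regression values. The second copy of cls becomes the first column of regr_true, which corresponds to the `np.hstack([cls.reshape(cls.shape[0], 1), regr])` line in the original data_loader.py.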