Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0

Type check error with softmax cross entropy #92

Closed codeaway23 closed 4 years ago

codeaway23 commented 4 years ago

Hi,

Thanks for the awesome repository.

I am trying to train on a custom dataset. I have my char map, my specification JSON, and my dataset CSV files ready. I run the following command:

python3 train_text_recognition.py /app/files/specs.json /app/logs --char-map=/app/files/char_map.json --batch-size=32 --gpu 0

and get the following error:

Exception in main training loop: 
Invalid operation is performed in: SoftmaxCrossEntropy (Forward)

Expect: in_types[0].shape[0] == in_types[1].shape[0]
Actual: 32 != 192
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py", line 206, in update_core
    loss = _calc_loss(self._master, batch)
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py", line 235, in _calc_loss
    return model(*in_arrays)
  File "/app/chainer/utils/multi_accuracy_classifier.py", line 45, in __call__
    self.loss = self.lossfun(self.y, t)
  File "/app/chainer/metrics/textrec_metrics.py", line 14, in calc_loss
    loss = self.calc_actual_loss(batch_predictions, None, t)
  File "/app/chainer/metrics/textrec_metrics.py", line 96, in calc_actual_loss
    return F.softmax_cross_entropy(predictions, labels)
  File "/usr/local/lib/python3.6/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 380, in softmax_cross_entropy
    normalize, cache_score, class_weight, ignore_label, reduce)(x, t)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function.py", line 235, in __call__
    ret = node.apply(inputs)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function_node.py", line 230, in apply
    self._check_data_type_forward(in_data)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function_node.py", line 298, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function.py", line 130, in check_type_forward
    self._function.check_type_forward(in_types)
  File "/usr/local/lib/python3.6/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 77, in check_type_forward
    x_type.shape[2:] == t_type.shape[1:],
  File "/usr/local/lib/python3.6/dist-packages/chainer/utils/type_check.py", line 524, in expect
    expr.expect()
  File "/usr/local/lib/python3.6/dist-packages/chainer/utils/type_check.py", line 482, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
Will finalize trainer extensions and updater before reraising the exception.
Traceback (most recent call last):
  File "train_text_recognition.py", line 299, in <module>
    trainer.run()
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/trainer.py", line 313, in run
    six.reraise(*sys.exc_info())
  File "/usr/lib/python3/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py", line 206, in update_core
    loss = _calc_loss(self._master, batch)
  File "/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py", line 235, in _calc_loss
    return model(*in_arrays)
  File "/app/chainer/utils/multi_accuracy_classifier.py", line 45, in __call__
    self.loss = self.lossfun(self.y, t)
  File "/app/chainer/metrics/textrec_metrics.py", line 14, in calc_loss
    loss = self.calc_actual_loss(batch_predictions, None, t)
  File "/app/chainer/metrics/textrec_metrics.py", line 96, in calc_actual_loss
    return F.softmax_cross_entropy(predictions, labels)
  File "/usr/local/lib/python3.6/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 380, in softmax_cross_entropy
    normalize, cache_score, class_weight, ignore_label, reduce)(x, t)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function.py", line 235, in __call__
    ret = node.apply(inputs)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function_node.py", line 230, in apply
    self._check_data_type_forward(in_data)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function_node.py", line 298, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/usr/local/lib/python3.6/dist-packages/chainer/function.py", line 130, in check_type_forward
    self._function.check_type_forward(in_types)
  File "/usr/local/lib/python3.6/dist-packages/chainer/functions/loss/softmax_cross_entropy.py", line 77, in check_type_forward
    x_type.shape[2:] == t_type.shape[1:],
  File "/usr/local/lib/python3.6/dist-packages/chainer/utils/type_check.py", line 524, in expect
    expr.expect()
  File "/usr/local/lib/python3.6/dist-packages/chainer/utils/type_check.py", line 482, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
chainer.utils.type_check.InvalidType: 
Invalid operation is performed in: SoftmaxCrossEntropy (Forward)

Expect: in_types[0].shape[0] == in_types[1].shape[0]
Actual: 32 != 192
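
For reference, the failing check in F.softmax_cross_entropy requires the predictions and the labels to share the same leading (batch) dimension. A minimal sketch that reproduces the failure, with illustrative shapes that are not taken from the SEE code:

import numpy as np
import chainer.functions as F

batch_size, chars_per_word, num_classes = 32, 6, 53

# predictions with one entry per image: shape (32, 53)
x = np.random.randn(batch_size, num_classes).astype(np.float32)
# labels flattened over every character of every image: shape (192,)
t = np.random.randint(0, num_classes, size=batch_size * chars_per_word).astype(np.int32)

F.softmax_cross_entropy(x, t)  # raises InvalidType: 32 != 192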

How do I fix this?

codeaway23 commented 4 years ago

Solved. My dataset metadata line was "1 6" for 1 word with 6 characters, but I had to change it to "6 1", and then training started.
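
To make the fix concrete: with a "6 1" metadata line, each character is treated as its own time step with a single label, so the labels flatten to the same leading dimension as the per-character predictions. A hedged sketch of the now-matching shapes (again illustrative, not lifted from the SEE code):

import numpy as np
import chainer.functions as F

batch_size, num_timesteps, num_classes = 32, 6, 53

# one class distribution per character and image, concatenated
# along the batch axis: shape (192, 53)
predictions = np.random.randn(batch_size * num_timesteps, num_classes).astype(np.float32)
# one label per character and image: shape (192,)
labels = np.random.randint(0, num_classes, size=batch_size * num_timesteps).astype(np.int32)

loss = F.softmax_cross_entropy(predictions, labels)  # 192 == 192, check passes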

I am facing another issue, though. My validation accuracy doesn't move from 0%, even after 1000 epochs. The training loss decreases and the training accuracy slowly climbs to 98%, but there is absolutely no change in validation accuracy.

In fact, my validation loss increases with the epochs. I have tried different batch sizes and cross-checked my training files. I am using a learning rate of 0.0001, training on 1000 images, and validating on 390.

Any idea why this might be happening, and how I can fix it? Any help is appreciated.

Bartzi commented 4 years ago

Hmm, you created your own char_map? Could you show me an excerpt? The code produces some debugging output during training: if you look into the log folder, there is a sub-directory named bboxes, where the code saves the predictions of the model on one image every 10 iterations. You can have a look at these images and see what happened over time. If the predicted bounding boxes are no longer visible after some time, your learning rate might be too high. If you want to see what it might look like, you can download a video of this here (it is the video Text Recognition.mp4).
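
If clicking through the images one by one is tedious, something like the following sketch can stitch them into an animation. It assumes the debug images are PNGs inside the bboxes sub-directory of the log folder used in the command above; adjust the path and extension to your setup:

from pathlib import Path
from PIL import Image

# collect the periodically saved prediction images in name order
frames = [Image.open(p) for p in sorted(Path('/app/logs/bboxes').glob('*.png'))]

# write them out as a single animated GIF, 100 ms per frame
frames[0].save('training_progress.gif', save_all=True,
               append_images=frames[1:], duration=100, loop=0)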

Another thing that might be a problem is that you are only training on 1000 images. That is by far not enough. The only thing you could try is to fine-tune from one of our models.
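
A hedged sketch of what such warm-starting could look like in plain Chainer; build_model and the snapshot filename are placeholders, not the actual names used by the training script:

import chainer

# placeholder: build the recognition network exactly as the training
# script does for your spec and char map (build_model is hypothetical)
model = build_model()

# copy the pretrained parameters into the freshly built network
chainer.serializers.load_npz('model_35000.npz', model)

One caveat: if your char map defines a different number of classes than the pretrained model, the final classification layer will have a different shape and cannot simply be copied over.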

codeaway23 commented 4 years ago

Thanks for the very constructive input. I looked through the video and realized I should take a closer look at how I pre-process my images.

The bounding boxes are around the entire word and stop changing much after a point.

I was experimenting with a few different approaches. I was facing this issue of stagnant validation accuracy with Attention OCR as well, but was able to fix it by resizing the images and transferring Inception weights.

I'm guessing that transferring your text recognition weights should help.

Again, thanks for the great repository.