Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0
575 stars 147 forks source link

A problem when running train_svhn.py #106

Closed Skylarky closed 3 years ago

Skylarky commented 3 years ago

Hi @Bartzi, I have recently been learning about end-to-end scene text recognition. I am trying to reproduce your excellent work to learn and understand this topic, but I ran into a problem when running your train_svhn.py file:

TypeError: reshape() got an unexpected keyword argument 'order'

I want to ask how can I correct it? I would appreciate if you could give some explanation and help. Here's the code I used and the traceback:

```
$ python train_svhn.py ../datasets/svhn/jsonfile/svhn_curriculum_specification.json ../datasets/svhn/runningLog/ -g 1 --char-map ../datasets/svhn/svhn_char_map.json --blank-label 10 -b 8

/anaconda3/envs/ssee/lib/python3.8/site-packages/chainer/training/updaters/multiprocess_parallel_updater.py:153: UserWarning: optimizer.eps is changed to 1e-08 by MultiprocessParallelUpdater for new batch size.
  warnings.warn('optimizer.eps is changed to {} '
epoch  iteration  main/loss  main/accuracy  lr  fast_validation/main/loss  fast_validation/main/accuracy  validation/main/loss  validation/main/accuracy
Exception in main training loop: reshape() got an unexpected keyword argument 'order'
Traceback (most recent call last):
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/chainer/training/trainer.py", line 346, in run
    entry.extension(self)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 128, in __call__
    self.render_rois(predictions, rois, bboxes, iteration, self.image.copy(), backprop_vis=backprop_visualizations)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 143, in render_rois
    self.render_extracted_regions(dest_image, image, rois, num_timesteps)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 201, in render_extracted_regions
    rois = self.xp.reshape(rois, (num_timesteps, -1, num_channels, height, width))
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/cupy/manipulation/shape.py", line 33, in reshape
    return a.reshape(newshape, order=order)
Will finalize trainer extensions and updater before reraising the exception.
Traceback (most recent call last):
  File "train_svhn.py", line 258, in <module>
    trainer.run()
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/chainer/training/trainer.py", line 376, in run
    six.reraise(*exc_info)
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/chainer/training/trainer.py", line 346, in run
    entry.extension(self)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 128, in __call__
    self.render_rois(predictions, rois, bboxes, iteration, self.image.copy(), backprop_vis=backprop_visualizations)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 143, in render_rois
    self.render_extracted_regions(dest_image, image, rois, num_timesteps)
  File "/home/d/SEE/see-master/chainer/insights/bbox_plotter.py", line 201, in render_extracted_regions
    rois = self.xp.reshape(rois, (num_timesteps, -1, num_channels, height, width))
  File "/home/d/anaconda3/envs/ssee/lib/python3.8/site-packages/cupy/manipulation/shape.py", line 33, in reshape
    return a.reshape(newshape, order=order)
TypeError: reshape() got an unexpected keyword argument 'order'
```
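The traceback suggests that the module-level `cupy.reshape` in this cupy version forwards an `order` keyword to `ndarray.reshape`, which does not accept it. A possible workaround (a sketch, not the repository's official fix) is to call the array's own `.reshape` method in `render_extracted_regions` instead of going through `self.xp.reshape`. The shapes below are made up for illustration; only the reshape call mirrors the line from bbox_plotter.py:

```python
import numpy as np

# Illustrative shapes only: in bbox_plotter.py, `rois` arrives flattened as
# (num_timesteps * batch_size, channels, height, width).
num_timesteps, batch_size = 4, 8
num_channels, height, width = 3, 50, 50
rois = np.zeros((num_timesteps * batch_size, num_channels, height, width),
                dtype=np.float32)

# Workaround for "reshape() got an unexpected keyword argument 'order'":
# use the array's own reshape method, so no `order` keyword is forwarded.
# The same call works on a cupy array, since cupy.ndarray.reshape takes a
# plain shape argument.
rois = rois.reshape((num_timesteps, -1, num_channels, height, width))

print(rois.shape)  # (4, 8, 3, 50, 50)
```

The alternative is simply upgrading cupy to a version whose `ndarray.reshape` accepts the `order` keyword.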

Bartzi commented 3 years ago

Hmm, I think this is an issue with the chainer/cupy version. What version of chainer/cupy are you using?

Skylarky commented 3 years ago

I'm sorry I didn't reply to you in time. My chainer version is 7.8.0, and the cupy-cuda101 version I use is also 7.8.0. These versions work well with your fsns_demo.py; I did not encounter any problems there.

Skylarky commented 3 years ago

Hi @Bartzi. I realized that I had missed a similar example in the issues. I tried the fix from another issue with the same problem:

#85

After I changed part of the code, I was able to start training on SVHN, but I ran into a new problem: when the number of iterations reaches 5000, the training suddenly stops. I would like to ask what causes this, or how I should solve it? I would appreciate your help very much.

Bartzi commented 3 years ago

Your training stopped because the curriculum decided to go to the next step and enlarge the dataset. If you do not want to use curriculum learning, you'll need to disable it. A possible fix might be to change this line to trigger=(args.epochs, 'epoch'),
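To illustrate what that change does: in Chainer, a trigger tuple (period, 'epoch') fires an extension every `period` epochs (the real class is chainer.training.triggers.IntervalTrigger), so setting the period to the total number of epochs means the curriculum's enlargement step can never fire mid-training. The toy re-implementation below is only a sketch of that semantics, not Chainer code:

```python
# Toy version of Chainer's interval-trigger semantics, for illustration only.
class IntervalTrigger:
    def __init__(self, period, unit):
        assert unit in ('epoch', 'iteration')
        self.period, self.unit = period, unit

    def fires_at(self, epoch, iteration):
        # Fires whenever the chosen counter hits a multiple of `period`.
        value = epoch if self.unit == 'epoch' else iteration
        return value > 0 and value % self.period == 0

# With trigger=(args.epochs, 'epoch') and, say, 10 total epochs, the
# curriculum step would only fire once, at the very end of training.
total_epochs = 10
curriculum_trigger = IntervalTrigger(total_epochs, 'epoch')
print([e for e in range(1, total_epochs + 1)
       if curriculum_trigger.fires_at(e, 0)])  # [10]
```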

Skylarky commented 3 years ago

Thank you! You are right! After I changed that, I was able to train normally. In addition, when I trained a model on my SVHN data with all default parameters and only a 1e-4 learning rate, I found that the accuracy was not high, only 75% (see the attached runningLog). Is there any possible way to improve the accuracy?

Bartzi commented 3 years ago

I'm pretty sure there are ways to improve the accuracy. However, you'll need to figure out what might be the problem. A good place to start is to investigate the output of the training in the boxes directory of the log directory. There you can find predictions of the model on one example. It might give you a good hint what works well and what doesn't.

Maybe you can post one of those images here?

Skylarky commented 3 years ago

Oh, sure! Here are 3 of the images from the boxes directory (iterations 104372, 104373, and 104374).

Bartzi commented 3 years ago

Thanks for the images. Are these results from the last steps of the training? Should I explain to you what you are seeing?

Skylarky commented 3 years ago

Thanks. These results are from the final step of the training. As I am new to computer vision, I am not quite able to understand the meaning of these results, and I sincerely hope you can clear up my doubts.

Bartzi commented 3 years ago

Alright, on the left you can see the input image (one of the images taken from our simple SVHN dataset), which consists of 4 SVHN digits placed on a grid. The colorful boxes represent the boxes predicted by the model. The views to the right of the top-left view show the content of each of the predicted boxes. The bottom row shows a visualization of the parts that might be interesting for the model, i.e. a visualization of the activated features, done with VisualBackProp.

In the top-right corner, you can see the predictions of the model for each digit in the image, starting on the top-left, then top-right, bottom-right, ...

The results look quite odd. There is nothing correct in the image. However, you are reporting an accuracy of more than 70%. I don't know, but something seems wrong here. It also seems that the model did not really learn to localize individual house numbers. I'm actually not 100% sure what the problem is :sweat_smile: the last time I worked with this code was nearly 4 years ago... Maybe you can get predictions on some more images, just to see how the model behaves on different inputs? Maybe you should also check the code. It could be that the SVHN code in this repo is optimized for training text recognition on a single house number and not for real end-to-end recognition.

Skylarky commented 3 years ago

Wow, thanks again for your detailed response and explanation!! :)

Hahaha, maybe it's the lack of correct content in the image that makes it so difficult for me to understand. XD

Anyway, thanks again for your help; I will continue to check and optimize the code.

Other than that, I am very interested in your work and would like to continue trying to reproduce your FSNS training, and I hope to get your guidance. :)