Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0

fsns_demo.py : ValueError: shape-mismatch for sum #58

Closed by vcvycy 5 years ago

vcvycy commented 5 years ago

Hi, I tried to recognize word_1.png (600px x 150px) on Windows 10 with the following command:

E:\data\Document\OCR\see\chainer>python fsns_demo.py ..\download\fsns_model\model model_35000.npz word_1.png ..\datasets\fsns\fsns_char_map.json

But it raised an exception: shape-mismatch for sum. Is something wrong?

Here is the full output:

E:\data\Document\OCR\see\chainer>python fsns_demo.py ..\download\fsns_model\model model_35000.npz word_1.png ..\datasets\fsns\fsns_char_map.json
(4, 4, 150, 150)
Traceback (most recent call last):
  File "fsns_demo.py", line 153, in <module>
    predictions, crops, grids = network(image[xp.newaxis, ...])
  File "E:\data\Document\OCR\see\download\fsns_model\model\fsns.py", line 521, in __call__
    h = self.localization_net(images)
  File "E:\data\Document\OCR\see\download\fsns_model\model\fsns.py", line 184, in __call__
    h = self.bn0(self.conv0(images))
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\links\connection\convolution_2d.py", line 175, in __call__
    groups=self.groups)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\functions\connection\convolution_2d.py", line 582, in convolution_2d
    y, = fnode.apply(args)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\function_node.py", line 258, in apply
    outputs = self.forward(in_data)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\function_node.py", line 368, in forward
    return self.forward_cpu(inputs)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\functions\connection\convolution_2d.py", line 99, in forward_cpu
    return self._forward_cpu_core(x, W, b)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\chainer\functions\connection\convolution_2d.py", line 110, in _forward_cpu_core
    col, W, ((1, 2, 3), (1, 2, 3))).astype(x.dtype, copy=False)
  File "C:\Users\vcvyc\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numeric.py", line 1283, in tensordot
    raise ValueError("shape-mismatch for sum")
ValueError: shape-mismatch for sum

Bartzi commented 5 years ago

Hmm, I think the problem is that you are using a PNG with four channels, but the network was trained on input images with only three channels. You could convert the image to RGB in the dataloader; this should solve your problem.
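For illustration, the conversion could look roughly like the sketch below. It assumes the image is opened with Pillow; `ensure_rgb` is a hypothetical helper name, not a function from the SEE code base, and where exactly it should be called in the demo script or dataloader is left open.

```python
from PIL import Image


def ensure_rgb(image: Image.Image) -> Image.Image:
    """Return the image unchanged if it is already RGB, else an RGB copy.

    Hypothetical helper for illustration; not part of the SEE repository.
    """
    # For an RGBA image, convert("RGB") discards the alpha band,
    # leaving the three channels the network expects.
    if image.mode != "RGB":
        return image.convert("RGB")
    return image


if __name__ == "__main__":
    # In-memory stand-in for a four-channel PNG such as word_1.png.
    rgba = Image.new("RGBA", (600, 150), (255, 0, 0, 128))
    rgb = ensure_rgb(rgba)
    print(rgba.getbands(), "->", rgb.getbands())
```

With a real file, the same helper would be applied right after loading, for example `ensure_rgb(Image.open("word_1.png"))`, before the image is turned into an array and passed to the network.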

vcvycy commented 5 years ago

Problem solved. Thank you!