Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0

problem in fsns_demo.py: Invalid operation is performed in: LinearFunction (Forward) #54

Closed: pinakinathc closed this issue 5 years ago

pinakinathc commented 5 years ago

Hi, I am trying to run a sample image through fsns_demo.py using the following command:

python fsns_demo.py models/model model_35000.npz sample_image.jpg ../datasets/fsns/fsns_char_map.json --gpu=0

I am getting the following error:

Input Image Shape:  (1, 3, 1000, 800)
Traceback (most recent call last):
  File "fsns_demo.py", line 153, in <module>
    predictions, crops, grids = network(image[xp.newaxis, ...])
  File "/home/pinaki/research/dl/see/see/chainer/models/model/fsns.py", line 523, in __call__
    h = self.localization_net(images)
  File "/home/pinaki/research/dl/see/see/chainer/models/model/fsns.py", line 206, in __call__
    lstm_prediction = F.relu(self.lstm(in_feature))
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/links/connection/lstm.py", line 309, in __call__
    lstm_in = self.upward(x)
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/links/connection/linear.py", line 129, in __call__
    return linear.linear(x, self.W, self.b)
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/functions/connection/linear.py", line 118, in linear
    y, = LinearFunction().apply(args)
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/function_node.py", line 230, in apply
    self._check_data_type_forward(in_data)
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/function_node.py", line 298, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/functions/connection/linear.py", line 20, in check_type_forward
    x_type.shape[1] == w_type.shape[1],
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/utils/type_check.py", line 524, in expect
    expr.expect()
  File "/home/pinaki/research/dl/see/lib/python3.5/site-packages/chainer/utils/type_check.py", line 482, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
chainer.utils.type_check.InvalidType: 
Invalid operation is performed in: LinearFunction (Forward)

Expect: in_types[0].shape[1] == in_types[1].shape[1]
Actual: 32208 != 3072

Now, I know that #24 mentioned resizing the image. Can someone please help me fix this issue? It seems impossible to trace the source of the error. @Bartzi

Bartzi commented 5 years ago

Yes, your problem is the size of the input image. Please resize it to a height of 150px and a width of 600px; then it should work.
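
For readers hitting the same error, here is a minimal sketch of the resize step plus a shape check that fails early with a readable message instead of the type-check traceback. It is not part of the repository: the helper names `check_input_shape` and `resize_for_fsns` are hypothetical, and using Pillow for the resize is an assumption. The only values taken from this thread are the 150px height and 600px width.

```python
# Hypothetical helpers, not part of the see repository.
EXPECTED_HEIGHT = 150  # height suggested in this thread
EXPECTED_WIDTH = 600   # width suggested in this thread


def check_input_shape(shape):
    """Raise ValueError unless the trailing (height, width) of `shape` is 150x600.

    `shape` is expected in (..., height, width) order, like the
    "Input Image Shape" line printed in the report above.
    """
    height, width = shape[-2], shape[-1]
    if (height, width) != (EXPECTED_HEIGHT, EXPECTED_WIDTH):
        raise ValueError(
            "expected a {}x{} (height x width) image, got {}x{}".format(
                EXPECTED_HEIGHT, EXPECTED_WIDTH, height, width))


def resize_for_fsns(in_path, out_path):
    """Resize the image at `in_path` to 600x150 and save it to `out_path`.

    Assumes Pillow is installed; it is imported lazily so that the shape
    check above can be used without it.
    """
    from PIL import Image

    with Image.open(in_path) as image:
        # Pillow's resize takes (width, height), the reverse of array order.
        resized = image.convert("RGB").resize((EXPECTED_WIDTH, EXPECTED_HEIGHT))
        resized.save(out_path)
```

Calling `check_input_shape((1, 3, 1000, 800))` raises a `ValueError` naming both sizes, while `(1, 3, 150, 600)` passes silently. Resizing only addresses the shape mismatch; it does not by itself make an arbitrary image suitable for the model.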

pinakinathc commented 5 years ago

Okay. I did that after going through the code and the log file: I added a resize function that resizes an image to a height of 150px and a width of 600px. But for the following image (please click this link to view the image), I got no recognition.

The output, an OrderedDict, is:

(see) pinaki@Krishna:~/research/dl/see/see/chainer$ python fsns_demo.py models/model model_35000.npz position.jpg ../datasets/fsns/fsns_char_map.json --gpu=0
OrderedDict([('',
              OrderedDict([('bottom_right',
                            (76.75752258300781, 79.74514770507812)),
                           ('top_left',
                            (56.44001388549805, 68.8976058959961))]))])

Hence, I thought that I had screwed up somewhere. @Bartzi

Bartzi commented 5 years ago

Well, the model won't work with such an image. Did you have a look at the FSNS Dataset? For your image, you should try the file text_recognition_demo.py.

pinakinathc commented 5 years ago

Thanks @Bartzi. I downloaded a sample image from the FSNS dataset, and now it is giving results.

Bartzi commented 5 years ago

You are welcome =)