vaibhav541 closed this issue 6 years ago
Yes, the input size should be 600 x 150. A typical FSNS image includes 4 views of the same street name sign; each view is 150 x 150 pixels.
Actually, I want to use it to detect text on daily-life products like grocery items. So can I use just a single view of the product? Thanks for helping.
I would also like to know whether I am using the right model for my purpose. And if so, how can I see the detected text? At the moment I can only see bounding boxes as a result.
If you want to use only one image, the FSNS model is not the model you are looking for; in fact, there is no pre-trained model that matches your purpose. You'll need to develop your own.
If you want to see the predicted bboxes on the image, you'll need to take the predicted bboxes and render them on the image yourself ;) it should not be too difficult.
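A minimal sketch of rendering bboxes yourself, assuming they come back as `(x0, y0, x1, y1)` pixel coordinates (the network's actual output format may differ, so treat this as an illustration only):

```python
import numpy as np

def draw_bbox(img, box, value=255):
    """Draw a 1px rectangle outline on an HxW grayscale array in place."""
    x0, y0, x1, y1 = box
    img[y0, x0:x1 + 1] = value  # top edge
    img[y1, x0:x1 + 1] = value  # bottom edge
    img[y0:y1 + 1, x0] = value  # left edge
    img[y0:y1 + 1, x1] = value  # right edge
    return img

# Stand-in canvas with the 600x150 FSNS shape and two hypothetical boxes.
canvas = np.zeros((150, 600), dtype=np.uint8)
for box in [(10, 20, 120, 80), (130, 20, 260, 80)]:
    draw_bbox(canvas, box)
```

In practice a library such as Pillow (`ImageDraw.rectangle`) or OpenCV (`cv2.rectangle`) would do the drawing on the color image directly; the idea is the same.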
ohh okay, thanks for your help 👍
Since some datasets only have a single view per image, would concatenating the same image four times horizontally and stretching it to match the 600x150 dimensions of FSNS images still make for reasonable training data for the FSNS model?
Nope, it doesn't make sense. You would extract the same features four times and concatenate them, so you would not gain any improvement.
I was trying to run fsns_demo on a random downloaded image but got this error.
```
Traceback (most recent call last):
  File "fsns_demo.py", line 153, in <module>
    predictions, crops, grids = network(image[xp.newaxis, ...])
  File "/home/nandwani_vaibhav/text-detection-ctpn/see/chainer/datasets/fsns.py", line 521, in __call__
    h = self.localization_net(images)
  File "/home/nandwani_vaibhav/text-detection-ctpn/see/chainer/datasets/fsns.py", line 206, in __call__
    lstm_prediction = F.relu(self.lstm(in_feature))
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/links/connection/lstm.py", line 309, in __call__
    lstm_in = self.upward(x)
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/links/connection/linear.py", line 129, in __call__
    return linear.linear(x, self.W, self.b)
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/functions/connection/linear.py", line 118, in linear
    y, = LinearFunction().apply(args)
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/function_node.py", line 230, in apply
    self._check_data_type_forward(in_data)
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/function_node.py", line 298, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/functions/connection/linear.py", line 20, in check_type_forward
    x_type.shape[1] == w_type.shape[1],
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/utils/type_check.py", line 524, in expect
    expr.expect()
  File "/home/nandwani_vaibhav/anaconda3/envs/fastai/lib/python3.6/site-packages/chainer/utils/type_check.py", line 482, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
chainer.utils.type_check.InvalidType:
Invalid operation is performed in: LinearFunction (Forward)

Expect: in_types[0].shape[1] == in_types[1].shape[1]
Actual: 18144 != 3072
```
Is there a specific input size we should use? Or how can this error be resolved?
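The `18144 != 3072` mismatch happens in the fully connected layer that feeds the LSTM, which suggests the flattened feature map produced from your image is larger than what the pretrained weights expect. That is consistent with the input not being the 600 x 150 (width x height) image the FSNS model was trained on, so resizing before inference should be the first thing to try. A dependency-free nearest-neighbour resize sketch, assuming an HxWxC uint8 array (in practice Pillow or OpenCV would do this):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an HxWxC array to (out_h, out_w)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return img[rows[:, None], cols]

# Stand-in for a random downloaded photo; resize it to the 150x600
# (height x width) shape the FSNS demo expects before feeding the network.
photo = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
resized = resize_nearest(photo, 150, 600)
print(resized.shape)  # (150, 600, 3)
```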