Bartzi / stn-ocr

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition
https://arxiv.org/abs/1707.08831
GNU General Public License v3.0

Shape error eval_svhn_model.py for SVHN demos. #23

Closed: sorelyss closed 5 years ago

sorelyss commented 6 years ago

Hi, I was trying to run your demos, but I could only get the original_svhn model to work. I also tried to train a model myself, but in the end it raises the same size error.

When I do:

python eval_svhn_model.py ../datasets/svhn/models/original_svhn/models/model 40 ../datasets/svhn/evaluation/test.csv ../datasets/svhn/svhn_char_map.json

This works perfectly.

However, when I try:

python eval_svhn_model.py ../datasets/svhn/models/regular_grid/model 19 ../datasets/svhn/evaluation/test.csv ../datasets/svhn/svhn_char_map.json

It raises the following error. I have tried passing different --input-width and --input-height values, but the problem does not seem to be there.

[16:54:45] src/nnvm/legacy_json_util.cc:153: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[16:54:45] /home/sorelyss/Documents/test/incubator-mxnet/dmlc-core/include/dmlc/./logging.h:300: [16:54:45] src/ndarray/ndarray.cc:239: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (48,48,3,3) to.shape=(64,64,3,3)

Stack trace returned 25 entries:
[bt] (0) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fbe48041d6c]
[bt] (1) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x437) [0x7fbe48832997]
[bt] (2) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x9d853a) [0x7fbe487cf53a]
[bt] (3) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1034) [0x7fbe48aca674]
[bt] (4) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/_cy3/ndarray.cpython-35m-x86_64-linux-gnu.so(+0x1312c) [0x7fbe3b8d512c]
[bt] (5) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/_cy3/ndarray.cpython-35m-x86_64-linux-gnu.so(+0x140ed) [0x7fbe3b8d60ed]
[bt] (6) python(PyObject_Call+0x47) [0x5c1797]
[bt] (7) python(PyEval_EvalFrameEx+0x4ec6) [0x53bba6]
[bt] (8) python(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]
[bt] (9) python() [0x5406df]
[bt] (10) python(PyEval_EvalFrameEx+0x54f0) [0x53c1d0]
[bt] (11) python() [0x5406df]
[bt] (12) python(PyEval_EvalFrameEx+0x50b2) [0x53bd92]
[bt] (13) python() [0x540199]
[bt] (14) python(PyEval_EvalFrameEx+0x50b2) [0x53bd92]
[bt] (15) python(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]
[bt] (16) python() [0x540199]
[bt] (17) python(PyEval_EvalCode+0x1f) [0x540e4f]
[bt] (18) python() [0x60c272]
[bt] (19) python(PyRun_FileExFlags+0x9a) [0x60e71a]
[bt] (20) python(PyRun_SimpleFileExFlags+0x1bc) [0x60ef0c]
[bt] (21) python(Py_Main+0x456) [0x63fb26]
[bt] (22) python(main+0xe1) [0x4cfeb1]
[bt] (23) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fbe53795830]
[bt] (24) python(_start+0x29) [0x5d6049]

Traceback (most recent call last):
  File "eval_svhn_model.py", line 109, in <module>
    model = get_model(args, data_shape, output_size)
  File "eval_svhn_model.py", line 58, in get_model
    model.set_params(arg_params, aux_params)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/base_module.py", line 557, in set_params
    allow_missing=allow_missing, force_init=force_init)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/module.py", line 261, in init_params
    _impl(name, arr, arg_params)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/module.py", line 251, in _impl
    cache_arr.copyto(arr)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/ndarray.py", line 556, in copyto
    return _internal._copyto(self, out=other)
  File "mxnet/cython/ndarray.pyx", line 167, in ndarray._make_ndarray_function.generic_ndarray_function
  File "mxnet/cython/./base.pyi", line 36, in ndarray.CALL
mxnet.base.MXNetError: b'[16:54:45] src/ndarray/ndarray.cc:239: Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (48,48,3,3) to.shape=(64,64,3,3)\n\nStack trace returned 25 entries:\n[bt] (0) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fbe48041d6c]\n[bt] (1) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet10CopyFromToERKNS_7NDArrayEPS0_i+0x437) [0x7fbe48832997]\n[bt] (2) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(+0x9d853a) [0x7fbe487cf53a]\n[bt] (3) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/../../lib/libmxnet.so(MXImperativeInvoke+0x1034) [0x7fbe48aca674]\n[bt] (4) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/_cy3/ndarray.cpython-35m-x86_64-linux-gnu.so(+0x1312c) [0x7fbe3b8d512c]\n[bt] (5) /home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/_cy3/ndarray.cpython-35m-x86_64-linux-gnu.so(+0x140ed) [0x7fbe3b8d60ed]\n[bt] (6) python(PyObject_Call+0x47) [0x5c1797]\n[bt] (7) python(PyEval_EvalFrameEx+0x4ec6) [0x53bba6]\n[bt] (8) python(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]\n[bt] (9) python() [0x5406df]\n[bt] (10) python(PyEval_EvalFrameEx+0x54f0) [0x53c1d0]\n[bt] (11) python() [0x5406df]\n[bt] (12) python(PyEval_EvalFrameEx+0x50b2) [0x53bd92]\n[bt] (13) python() [0x540199]\n[bt] (14) python(PyEval_EvalFrameEx+0x50b2) [0x53bd92]\n[bt] (15) python(PyEval_EvalFrameEx+0x4b04) [0x53b7e4]\n[bt] (16) python() [0x540199]\n[bt] (17) python(PyEval_EvalCode+0x1f) [0x540e4f]\n[bt] (18) python() [0x60c272]\n[bt] (19) python(PyRun_FileExFlags+0x9a) [0x60e71a]\n[bt] (20) python(PyRun_SimpleFileExFlags+0x1bc) [0x60ef0c]\n[bt] (21) python(Py_Main+0x456) [0x63fb26]\n[bt] (22) python(main+0xe1) [0x4cfeb1]\n[bt] (23) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fbe53795830]\n[bt] (24) python(_start+0x29) [0x5d6049]\n'
Bartzi commented 6 years ago

Yeah, that is because the script is meant to be used with the model trained on the original SVHN data.

However, you should be able to use the script for the other models as well. You will need to change some parts of the code.

After that, I think you should be good to go... at least I hope so...
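The mismatch `from.shape = (48,48,3,3)` versus `to.shape=(64,64,3,3)` in the log above indicates that a saved convolution weight has a different number of filters than the network the script builds. The following is a minimal sketch of how one might list every such mismatch before calling `set_params`, using plain dictionaries of shapes. The helper `find_shape_mismatches` and the parameter names are hypothetical and not part of stn-ocr or MXNet; only the two shapes come from the error message.

```python
def find_shape_mismatches(saved_shapes, expected_shapes):
    """Return {name: (saved, expected)} for every parameter whose shapes differ."""
    mismatches = {}
    for name, expected in expected_shapes.items():
        saved = saved_shapes.get(name)
        # Parameters missing from the checkpoint are ignored here on purpose.
        if saved is not None and tuple(saved) != tuple(expected):
            mismatches[name] = (tuple(saved), tuple(expected))
    return mismatches


# Hypothetical parameter names; the conv shapes are taken from the error message.
saved = {"conv_weight": (48, 48, 3, 3), "fc_bias": (256,)}
expected = {"conv_weight": (64, 64, 3, 3), "fc_bias": (256,)}

for name, (got, want) in find_shape_mismatches(saved, expected).items():
    print("%s: checkpoint has %s, network expects %s" % (name, got, want))
```

With MXNet, the saved shapes could presumably be collected from the `arg_params` dictionary of the loaded checkpoint and the expected shapes from the symbol's `infer_shape` result, but that wiring is left out here because it depends on how the evaluation script builds the network.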

sorelyss commented 6 years ago

Hi, thank you for the response. It raises this error. What values do I have to set for l1_forward_init_h_state, l0_forward_init_h_state, etc.?

infer_shape error. Arguments:
  data: (1, 1, 185, 185)
  l1_forward_init_h_state: (1, 1, 256)
  l0_forward_init_h_state: (1, 1, 256)
  softmax_label: (1, 3)
  l0_forward_init_c_state_cell: (1, 1, 256)
  l1_forward_init_c_state_cell: (1, 1, 256)
Traceback (most recent call last):
  File "eval_svhn_model.py", line 110, in <module>
    model = get_model(args, data_shape, output_size)
  File "eval_svhn_model.py", line 51, in get_model
    grad_req='null'
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/module.py", line 344, in bind
    grad_req=grad_req, input_types=input_types)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/executor_group.py", line 193, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/executor_group.py", line 232, in bind_exec
    self.execs.append(self._bind_ith_exec(i, data_shapes, label_shapes, shared_group))
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/module/executor_group.py", line 454, in _bind_ith_exec
    arg_shapes, _, aux_shapes = self.symbol.infer_shape(**input_shapes)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/symbol.py", line 535, in infer_shape
    return self._infer_shape_impl(False, *args, **kwargs)
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/symbol.py", line 602, in _infer_shape_impl
    ctypes.byref(complete)))
  File "/home/sorelyss/Documents/test/incubator-mxnet/python/mxnet/base.py", line 75, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator warpctc0: Shape inconsistent, Provided=(3,), inferred shape=(1,)
Bartzi commented 6 years ago

This error is one of the errors I hate the most about MXNet... Short answer: hard to say, but it does not have anything to do with the LSTM init state arrays.

My guess is that the softmax label input shape is wrong and should be something like (1, 1). You could look into this using the SymbolDoc module of MXNet, which helps to debug shape errors.

sorelyss commented 6 years ago

It didn't work. I think I should try training on the original data. However, I think the train_svhn.py file has other parameters. I thought that putting something like:

    image_size = Size(width=64, height=64)
    source_shape = (args.batch_size, 1, image_size.height, image_size.width)
    target_shape = Size(width=40, height=40)

    # adjustable network parameters
    num_timesteps = 11
    labels_per_timestep = 3
    num_rnn_layers = 2
    label_width = num_timesteps // labels_per_timestep
    use_blstm = False

    net, loc, transformed_output, size_params = SVHNMultiLineCTCNetwork.get_network(
        source_shape,
        target_shape,
        num_timesteps,
        num_rnn_layers,
        label_width,
        blstm=use_blstm,
        fix_loc=args.fix_loc
    )

would work (I tried to replicate the parameters in the evaluation file), but this didn't work either. It raises:

Traceback (most recent call last):
  File "train_svhn2.py", line 105, in <module>
    first_batch = next(iter(val_iter))
  File "/home/sorelyss/Documents/test/stn-ocr/mxnet/data_io/lstm_iter.py", line 23, in __next__
    return self.next()
  File "/home/sorelyss/Documents/test/stn-ocr/mxnet/data_io/lstm_iter.py", line 80, in next
    iter_batch = self.iter.next()
  File "/home/sorelyss/Documents/test/stn-ocr/mxnet/data_io/file_iter.py", line 121, in next
    raise StopIteration
StopIteration

Can I have the training file for the original data?

Bartzi commented 6 years ago

Well, this is definitely a problem, I'm sorry about that... Just to be clear: You want to train a model on the regular grid dataset, right?

You will need to set the parameters to the following:

    image_size = Size(width=185, height=185)
    source_shape = (args.batch_size, 1, image_size.height, image_size.width)
    target_shape = Size(width=50, height=50)

    # adjustable network parameters
    num_timesteps = 4
    labels_per_timestep = 4
    num_rnn_layers = 1
    label_width = num_timesteps * labels_per_timestep
    use_blstm = False

These should be the correct settings for training such a network.

Regarding your error: something is going wrong in the data loading code. You can have a look at the function _load_worker and do some debugging there. It could be an image loading exception, or the calculated number of labels does not match the actual number of labels per image.
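Both suspected causes (an image that cannot be loaded, or a row with the wrong number of labels) can be checked offline before starting a training run. The sketch below is one way to do that under a loudly stated assumption: the ground-truth file is tab-separated, with the image path in the first column and one label per remaining column. The actual layout used by stn-ocr may differ, and `check_rows` is a hypothetical helper, not a function from the repository.

```python
import csv
import os
import tempfile


def check_rows(csv_path, expected_labels, delimiter="\t"):
    """Yield (line_number, problem) for rows that could break data loading.

    Assumed layout (not confirmed by the repository): image path first,
    then one column per label.
    """
    base_dir = os.path.dirname(os.path.abspath(csv_path))
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for line_number, row in enumerate(reader, start=1):
            if not row:
                continue  # skip blank lines
            image_path, labels = row[0], row[1:]
            if not os.path.isabs(image_path):
                image_path = os.path.join(base_dir, image_path)
            if not os.path.isfile(image_path):
                yield line_number, "missing image: " + row[0]
            if len(labels) != expected_labels:
                yield line_number, "expected %d labels, found %d" % (
                    expected_labels, len(labels))


# Small self-contained demonstration with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.png"), "wb").close()
    gt_file = os.path.join(tmp, "train.csv")
    with open(gt_file, "w") as handle:
        handle.write("a.png\t1\t2\t3\t4\t5\n")  # valid row
        handle.write("b.png\t1\t2\n")           # missing file, too few labels
    problems = list(check_rows(gt_file, expected_labels=5))

for line_number, problem in problems:
    print("line %d: %s" % (line_number, problem))
```

This only checks that files exist and that label counts match. An image that exists but fails to decode would still need to be caught inside the loader itself.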

sorelyss commented 6 years ago

No, I want to train the one called "original_svhn" because those models work with the evaluation file. I think that is the generated centered dataset, right?

I can train using the train_svhn.py file, but it has another model's parameters.

python train_svhn.py ../datasets/svhn/generated/centered/train2.csv ../datasets/svhn/generated/centered/valid2.csv --log-dir ../logs --save-model-prefix svhn_train_model -b 2 --lr 1e-5 --zoom 0.5 -ci 500 --char-map ../datasets/svhn/svhn_char_map.json

I want to know the parameters for the "original_svhn".

Bartzi commented 6 years ago

If you want to train on the original SVHN data, you will need to:

  1. get the dataset from here
  2. extract the dataset with this script (for usage, read here)
  3. create correct SVHN crops with this script (for usage, read here)
  4. once you are done with this, change the parameters to:

    image_size = Size(width=64, height=64)
    source_shape = (args.batch_size, 1, image_size.height, image_size.width)
    target_shape = Size(width=40, height=40)
    
    # adjustable network parameters
    num_timesteps = 5
    labels_per_timestep = 1
    num_rnn_layers = 1
    label_width = num_timesteps * labels_per_timestep
    use_blstm = False

  5. train the network with the generated csv file and the adjusted parameters

This should work... at least I hope so...
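To make the difference between the settings in this thread concrete, the label width implied by each one can be computed directly. This is only an arithmetic sketch: the `Config` tuple and `label_width` function are hypothetical names, while the numbers are copied from the snippets above.

```python
from collections import namedtuple

# Hypothetical container; the values are the ones quoted in this thread.
Config = namedtuple("Config", ["name", "num_timesteps", "labels_per_timestep"])

configs = [
    Config("regular grid", 4, 4),
    Config("original SVHN", 5, 1),
]


def label_width(config):
    """Labels per image, using the product as in the maintainer's settings."""
    return config.num_timesteps * config.labels_per_timestep


for config in configs:
    print("%s: label_width = %d" % (config.name, label_width(config)))

# The earlier attempt used floor division instead of a product.
attempt_width = 11 // 3
print("earlier attempt: label_width = %d" % attempt_width)
```

The floor-division attempt yields 3, which happens to match the second dimension of `softmax_label: (1, 3)` in the earlier `infer_shape` error, although the thread does not confirm that this is the cause.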

sorelyss commented 5 years ago

Thanks