weinman / cnn_lstm_ctc_ocr

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR
GNU General Public License v3.0

The train error #14

Closed SongyiGao closed 5 years ago

SongyiGao commented 6 years ago

When I train on my own data, I get an error. Can anyone suggest a fix?

2017-12-10 13:28:41.796273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:00:06.0
totalMemory: 11.90GiB freeMemory: 11.76GiB
2017-12-10 13:28:41.796331: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:00:06.0, compute capability: 6.1)
INFO:tensorflow:Starting standard services.
INFO:tensorflow:Saving checkpoint to path ../data/model/model.ckpt
INFO:tensorflow:Starting queue runners.
INFO:tensorflow:global_step/sec: 0
2017-12-10 13:28:47.390354: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Tried to explicitly squeeze dimension 1 but dimension was not 1: 2
  [[Node: convnet/features = Squeeze[T=DT_FLOAT, squeeze_dims=[1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](convnet/pool8/MaxPool)]]
2017-12-10 13:28:47.390538: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Tried to explicitly squeeze dimension 1 but dimension was not 1: 2
  [[Node: convnet/features = Squeeze[T=DT_FLOAT, squeeze_dims=[1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](convnet/pool8/MaxPool)]]
2017-12-10 13:28:47.390862: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Tried to explicitly squeeze dimension 1 but dimension was not 1: 2
  [[Node: convnet/features = Squeeze[T=DT_FLOAT, squeeze_dims=[1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](convnet/pool8/MaxPool)]]

weinman commented 6 years ago

The code assumes the input is 32 pixels high (and grayscale). It can't squeeze the row dim out if your images are of a different height.
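
To make the failure concrete, here is a small pure-Python sketch of the shape rule behind the error. `squeeze_axis` is a hypothetical helper, not part of this repository or of TensorFlow. It only mimics the rule that an axis can be squeezed out when its size is exactly 1. The batch size (4) and width (25) are made-up values.

```python
def squeeze_axis(shape, axis):
    """Return `shape` with `axis` removed, mimicking the shape rule of a squeeze op.

    Hypothetical helper for illustration only; raises if the axis size is not 1.
    """
    if shape[axis] != 1:
        raise ValueError(
            "Tried to explicitly squeeze dimension %d but dimension was not 1: %d"
            % (axis, shape[axis])
        )
    return shape[:axis] + shape[axis + 1:]


# Feature map of height 1 (what the model expects after pool8): squeeze works.
ok_shape = squeeze_axis((4, 1, 25, 512), axis=1)
print(ok_shape)  # (4, 25, 512)

# Feature map of height 2 (as reported in the log): squeeze is rejected.
try:
    squeeze_axis((4, 2, 25, 512), axis=1)
except ValueError as err:
    print("error:", err)
```

If the images cannot be changed, the feature height has to be brought down to 1 inside the model instead; otherwise resizing the inputs to 32 pixels high before training avoids the error.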

riteshkumartbz commented 5 years ago

What changes should I make if I have training data 64 pixels high? Also if I have colored images is there any way I can give these images as training data?

weinman commented 5 years ago

  1. You need to decide how you want to handle the extra data. In the current model, pool8 is [N,1,W,512]. If you put 64-pixel-high images in, you'll end up with a feature height >1. You could simply increase the max pool window of pool8, or add another layer (e.g. conv9 and pool9, or perhaps just conv9) to take the height down to 1 before the squeeze that creates the features, which should result in [N,W,512].
  2. Strictly speaking, the definition of the model (in model.py) should work with NHW3 data. You'd need to change the other parts of the code that insist on NHW1 data. I haven't tested this, but it may do the trick to simply delete this line.
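
The first option can be checked with plain shape arithmetic before touching the model. The sketch below assumes pooling without padding ("VALID"), for which the output height is floor((H - k) / s) + 1. `pooled_height` is a hypothetical helper, and the heights and window sizes are illustrative assumptions chosen to reproduce the height of 2 from the error message; they are not read from model.py.

```python
def pooled_height(height, window, stride):
    """Output height of a pooling layer with no padding ("VALID")."""
    if height < window:
        raise ValueError("pooling window is taller than the input")
    return (height - window) // stride + 1


# Assumed height of the feature map entering the last pool (illustrative).
height_before_pool = 4

# An assumed window of height 2 leaves a height of 2, so the squeeze fails.
current = pooled_height(height_before_pool, window=2, stride=2)
print(current)  # 2

# Option A: enlarge the last pooling window so the height collapses to 1.
enlarged = pooled_height(height_before_pool, window=4, stride=4)
print(enlarged)  # 1

# Option B: keep the last pool as is and add one more pool of height 2.
extra_layer = pooled_height(current, window=2, stride=2)
print(extra_layer)  # 1
```

Either way, the goal is a feature map of shape [N,1,W,512] just before the squeeze.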
SnehalRaj commented 5 years ago

Also, what changes should I make if the length of the sequences I detect is going to be constant?