solivr / tf-crnn

TensorFlow convolutional recurrent neural network (CRNN) for text recognition
GNU General Public License v3.0

Error when trying to run training.py with mnist dataset #58

Closed BryanOrabutt closed 4 years ago

BryanOrabutt commented 4 years ago

I am getting the following error when I try to train a model on the MNIST dataset.

INFO - crnn - Running command 'training'
INFO - crnn - Started
ERROR - crnn - Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
  File "training.py", line 43, in training
    n_samples_train, n_samples_eval = data_preprocessing(parameters)
  File "/home/ubuntu/github/tf-crnn/tf_crnn/preprocessing.py", line 144, in data_preprocessing
    n_samples_train = preprocess_csv(params.csv_files_train, params, csv_train_output)
  File "/home/ubuntu/github/tf-crnn/tf_crnn/preprocessing.py", line 81, in preprocess_csv
    dataframe['label_string'] = dataframe.labels.apply(lambda x: re.sub(re.escape(parameters.string_split_delimiter), '', x))
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/pandas/core/series.py", line 3591, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/lib.pyx", line 2217, in pandas._libs.lib.map_infer
  File "/home/ubuntu/github/tf-crnn/tf_crnn/preprocessing.py", line 81, in <lambda>
    dataframe['label_string'] = dataframe.labels.apply(lambda x: re.sub(re.escape(parameters.string_split_delimiter), '', x))
  File "/home/ubuntu/anaconda3/lib/python3.6/re.py", line 191, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

It seems to be failing to parse my CSV files, but I believe my file format is correct. My training data CSV file is in this format (truncated here for obvious reasons):

./mnist_png/training/0/27209.png;0
./mnist/training/0/57120.png;0
./mnist/training/0/56239.png;0
./mnist/training/0/56908.png;0
./mnist/training/0/42084.png;0
./mnist/training/0/49065.png;0
./mnist/training/0/29559.png;0
./mnist/training/0/57356.png;0
./mnist/training/0/39531.png;0

My test dataset csv file is in the same format. And finally my config.json file is shown below:

{
  "lookup_alphabet_file" : "/home/ubuntu/github/tf-crnn/data/alphabet/lookup_digits.json",
  "csv_files_train" : "/home/ubuntu/github/tf-crnn/data/csv/mnist_test.csv",
  "csv_files_eval" : "/home/ubuntu/gitibu/tf-crnn/data/csv/mnist_train.csv",
  "output_model_dir" : "./output_model_mnist_png",
  "num_beam_paths" : 1,
  "cnn_batch_norm" : [true, true, true, true, true],
  "max_chars_per_string" : 50,
  "n_epochs" : 200,
  "train_batch_size" : 350,
  "eval_batch_size" : 350,
  "learning_rate": 1e-5,
  "input_shape" : [32, 816],
  "rnn_units" : [256, 256, 256],
  "restore_model" : false
}

I had this working a few months ago with an older version of tf-crnn. However, I recently moved all of my work to AWS machines and decided to use the newest version, and now I cannot get it to train a model.

BryanOrabutt commented 4 years ago

Resolved. The label strings were being detected as integers. Typecasting to string directly fixed the problem.
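
For anyone who hits the same traceback, here is a minimal sketch of what was probably going on and two ways to cast the labels to strings. It is not the actual tf-crnn code. Only the `labels` column name comes from the traceback; the `paths` column name, the `"|"` delimiter, and the `strip_delimiter` helper are assumptions made for illustration.

```python
import io
import re

import pandas as pd

# Two rows in the same "path;label" layout as the CSV shown above.
csv_text = "./mnist/training/0/57120.png;0\n./mnist/training/0/56239.png;0\n"

# Assumption: "paths" and the "|" delimiter are placeholders for this sketch.
columns = ["paths", "labels"]
delimiter = "|"


def strip_delimiter(label):
    """Hypothetical helper mirroring the re.sub call in the traceback."""
    return re.sub(re.escape(delimiter), "", label)


# With default type inference, an all-digit label column is read as integers.
inferred = pd.read_csv(io.StringIO(csv_text), sep=";", header=None, names=columns)

try:
    inferred["labels"].apply(strip_delimiter)
    raised = False
except TypeError:
    raised = True  # re.sub only accepts str or bytes-like input

# Option 1: tell pandas to read the label column as text.
as_text = pd.read_csv(
    io.StringIO(csv_text), sep=";", header=None, names=columns, dtype={"labels": str}
)

# Option 2: cast a column that has already been loaded.
cast = inferred["labels"].astype(str)

label_strings = as_text["labels"].apply(strip_delimiter)
```

Either option gives `re.sub` string input. Reading with `dtype` keeps the fix at load time, while `astype(str)` is the smaller change if the frame is already built elsewhere.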