solivr / tf-crnn

TensorFlow convolutional recurrent neural network (CRNN) for text recognition
GNU General Public License v3.0

prediction with estimator #23

Closed jzhongaa closed 6 years ago

jzhongaa commented 6 years ago

I encountered some problems when predicting with the estimator, using the following code:

import argparse
import os, time
import csv
import numpy as np
try:
    import better_exceptions
except ImportError:
    pass
from tqdm import trange
import tensorflow as tf
from src.model import crnn_fn
from src.data_handler import data_loader
from src.data_handler import preprocess_image_for_prediction

from src.config import Params, Alphabet, import_params_from_json

parameters = Params(train_batch_size=100,
                    eval_batch_size=100,
                    learning_rate=1e-3,  # 1e-3 recommended
                    learning_decay_rate=0.95,
                    learning_decay_steps=5000,
                    evaluate_every_epoch=5,
                    save_interval=5e3,
                    input_shape=(32, 304),
                    optimizer='adam',
                    digits_only=True,
                    alphabet=Alphabet.LETTERS_DIGITS_EXTENDED,
                    alphabet_decoding='same',
                    csv_delimiter=';',
                    csv_files_eval='./output_numbers/testlabels_abs.csv',
                    csv_files_train='./output_numbers/trainlabels_abs.csv',
                    output_model_dir='./estimator/',
                    n_epochs=1,
                    gpu=''
                    )
model_params = {
    'Params': parameters,
}

parameters.export_experiment_params()

os.environ['CUDA_VISIBLE_DEVICES'] = parameters.gpu
config_sess = tf.ConfigProto()
config_sess.gpu_options.per_process_gpu_memory_fraction = 0.8
config_sess.gpu_options.allow_growth = True

est_config = tf.estimator.RunConfig()
est_config.replace(keep_checkpoint_max=10,
                   save_checkpoints_steps=parameters.save_interval,
                   session_config=config_sess,
                   save_checkpoints_secs=None,
                   save_summary_steps=1000,
                   model_dir=parameters.output_model_dir)

estimator = tf.estimator.Estimator(model_fn=crnn_fn,
                                   params=model_params,
                                   model_dir=parameters.output_model_dir,
                                   config=est_config
                                   )

pred = estimator.predict(input_fn=data_loader(csv_filename='./output_numbers/testlabels_abs.csv',
                                              params=parameters))

for i in enumerate(pred):
    print(i)

The error is:

(tf36) C:\Users\lance\Documents\Github\tf-crnn>python predict.py
Traceback (most recent call last):
  File "predict.py", line 91, in <module>
    for i,j in enumerate(a):
  File "C:\Users\lance\Miniconda3\envs\tf36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 425, in predict
    for i in range(self._extract_batch_length(preds_evaluated)):
  File "C:\Users\lance\Miniconda3\envs\tf36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 592, in _extract_batch_length
    'different batch length then others.' % key)
ValueError: Batch length of predictions should be same. raw_predictions has different batch length then others.
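The traceback shows that `estimator.predict` splits each evaluated batch into per-example results and therefore expects every entry of the predictions dict to share the same first dimension. The sketch below is a simplified, TensorFlow-free re-creation of that kind of check, not the library's actual code. The function name and all shapes are hypothetical and chosen only for illustration: an entry whose first axis is not the batch axis triggers the error.

```python
def extract_batch_length(shapes):
    """Return the shared first dimension, or raise if any entry disagrees.

    `shapes` maps a prediction name to its array shape (a tuple of ints).
    This is a simplified stand-in, not TensorFlow's implementation.
    """
    batch_length = None
    for key, shape in shapes.items():
        if batch_length is None:
            batch_length = shape[0]
        elif shape[0] != batch_length:
            raise ValueError(
                "Batch length of predictions should be same. "
                "%s has different batch length than others." % key)
    return batch_length


# Hypothetical shapes: six images per batch.
consistent = {"filenames": (6,), "words": (6,), "score": (6,)}
# Hypothetical entry whose first axis is time steps (75), not batch (6).
mismatched = dict(consistent, raw_predictions=(75, 6))

try:
    extract_batch_length(mismatched)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Under these assumed shapes, `consistent` passes the check while `mismatched` is rejected because of `raw_predictions`. The `time_major=True` argument in the `crnn_fn` code posted later in this thread suggests that some of the model's tensors do put time on the first axis, which would be consistent with this error.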

Wonder if anyone could help with it. Thanks.

ghost commented 6 years ago

I fixed the issue by modifying model.py. The original code is:

predictions_dict = {'prob': logprob,
                    'raw_predictions': raw_pred,
                    }

Rename the variable predictions_dict to something else and define a new, empty predictions_dict, like the following:

predictions_dict = {}
training_dict = {'prob': logprob,
                 'raw_predictions': raw_pred,
                 }

jzhongaa commented 6 years ago

@junjie725 The issue is solved. However, I found that the generator returned by 'predict' loops forever without stopping.

jzhongaa commented 6 years ago

I found that the infinite loop only happens in a Windows environment. When I test on a Linux server, the infinite-loop generator issue disappears.
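One general way a prediction generator can run forever is an input pipeline that repeats its data indefinitely. The working scripts posted later in this thread pass `num_epochs=1` to `data_loader`. The sketch below illustrates that idea in plain Python. The `batches` and `predict` functions are hypothetical stand-ins, not the repository's `data_loader` or `estimator.predict`, and nothing in this thread establishes that this is what differed between the Windows and Linux runs.

```python
import itertools


def batches(samples, batch_size, num_epochs=None):
    """Yield batches of samples; num_epochs=None repeats the data forever."""
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


def predict(batch_iter):
    """Stand-in for a prediction generator: one result per input sample."""
    for batch in batch_iter:
        for sample in batch:
            yield sample.upper()


files = ["a.png", "b.png", "c.png", "d.png", "e.png", "f.png"]

# Bounded input: the generator stops after one pass over the six files.
bounded = list(predict(batches(files, batch_size=4, num_epochs=1)))

# Unbounded input: the loop would never end, so cap it explicitly.
capped = list(itertools.islice(predict(batches(files, batch_size=4)), 10))
```

With a bounded input, iterating over the results ends on its own. With an unbounded one, the caller has to stop explicitly, for example with `itertools.islice` or by breaking after a known number of samples.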

what2help commented 6 years ago

Hello,

I'm getting the same error, but I didn't understand how you solved it by changing the model.py file. I replaced

predictions_dict = {'prob': logprob,
                    'raw_predictions': raw_pred,
                    }

with

predictions_dict = {}
training_dict = {'prob': logprob,
                 'raw_predictions': raw_pred,
                 }

and now I'm getting an error:

  File "/home/ubuntu/ctnn/tf-crnn/src/model.py", line 325, in crnn_fn
    sparse_code_pred, log_probability = tf.nn.ctc_beam_search_decoder(predictions_dict['prob'],
KeyError: 'prob'

as predictions_dict does not have the 'prob' key anymore. I know I'm missing something, so can anyone help me here?

Thanks.

ghost commented 6 years ago

@what2help You forgot to rename predictions_dict['prob'] to training_dict['prob'] and predictions_dict['raw_predictions'] to training_dict['raw_predictions'].
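The idea behind the rename can be sketched without TensorFlow: keep one dict holding every tensor for internal use (the decoder and the loss read `'prob'` from it), and a second dict holding only per-example outputs to return in predict mode. The helper name `split_outputs` and the placeholder string values are hypothetical. The code posted in this thread fills the second dict key by key instead of filtering, so this is an illustration of the pattern rather than the thread's exact fix.

```python
def split_outputs(all_outputs, prediction_keys):
    """Return (training_dict, prediction_dict).

    training_dict keeps every entry; prediction_dict keeps only the
    entries named in prediction_keys (those safe to return per example).
    """
    training_dict = dict(all_outputs)
    prediction_dict = {key: value for key, value in all_outputs.items()
                       if key in prediction_keys}
    return training_dict, prediction_dict


# Placeholder strings stand in for tensors.
outputs = {
    "prob": "logprob tensor",
    "raw_predictions": "raw_pred tensor",
    "words": "decoded words tensor",
    "score": "score tensor",
}
training_dict, prediction_dict = split_outputs(
    outputs, prediction_keys={"filenames", "words", "score"})

# Internal code keeps reading from training_dict, so 'prob' is still found.
decoder_input = training_dict["prob"]
```

The `KeyError: 'prob'` above is what happens when internal code still reads from the dict that was emptied for predict mode.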

jzhongaa commented 6 years ago

@what2help Sorry, I did not mention the details in my previous reply. The solution is that predictions_dict['prob'] should be defined the same as training_dict['prob'] (as well as the other items in the dict). You can refer to the code below for the crnn_fn method in the model.py file.


def crnn_fn(features, labels, mode, params):
    """
    :param features: dict {
                            'images'
                            'images_widths'
                            'filenames'
                            }
    :param labels: labels. flattened (1D) array with encoded label (one code per character)
    :param mode:
    :param params: dict {
                            'Params'
                        }
    :return:
    """

    parameters = params.get('Params')
    assert isinstance(parameters, Params)

    if mode == tf.estimator.ModeKeys.TRAIN:
        parameters.keep_prob_dropout = 0.7
    else:
        parameters.keep_prob_dropout = 1.0

    conv = deep_cnn(features['images'], (mode == tf.estimator.ModeKeys.TRAIN), summaries=False)
    logprob, raw_pred = deep_bidirectional_lstm(conv, params=parameters, summaries=False)

    # Compute seq_len from image width
    n_pools = CONST.DIMENSION_REDUCTION_W_POOLING  # 2x2 pooling in dimension W on layer 1 and 2
    seq_len_inputs = tf.divide(features['images_widths'], n_pools, name='seq_len_input_op') - 1
    prediction_dict={}
    training_dict = {'prob': logprob,
                        'raw_predictions': raw_pred,
                        }
    try:
        training_dict['filenames'] = features['filenames']
        prediction_dict['filenames'] = features['filenames']
    except KeyError:
        pass

    if not mode == tf.estimator.ModeKeys.PREDICT:
        # Alphabet and codes
        keys = [c for c in parameters.alphabet]
        values = parameters.alphabet_codes

        # Convert string label to code label
        with tf.name_scope('str2code_conversion'):
            table_str2int = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(keys, values), -1)
            splited = tf.string_split(labels, delimiter='')  # TODO change string split to utf8 split in next tf version
            codes = table_str2int.lookup(splited.values)
            sparse_code_target = tf.SparseTensor(splited.indices, codes, splited.dense_shape)

        seq_lengths_labels = tf.bincount(tf.cast(sparse_code_target.indices[:, 0], tf.int32),
                                         minlength=tf.shape(training_dict['prob'])[1])

        # Loss
        # ----
        # >>> Cannot have longer labels than predictions -> error
        with tf.control_dependencies([tf.less_equal(sparse_code_target.dense_shape[1], tf.reduce_max(tf.cast(seq_len_inputs, tf.int64)))]):
            loss_ctc = tf.nn.ctc_loss(labels=sparse_code_target,
                                      inputs=training_dict['prob'],
                                      sequence_length=tf.cast(seq_len_inputs, tf.int32),
                                      preprocess_collapse_repeated=False,
                                      ctc_merge_repeated=True,
                                      ignore_longer_outputs_than_inputs=True,  # returns zero gradient in case it happens -> ema loss = NaN
                                      time_major=True)
            loss_ctc = tf.reduce_mean(loss_ctc)
            loss_ctc = tf.Print(loss_ctc, [loss_ctc], message='* Loss : ')

        global_step = tf.train.get_or_create_global_step()
        # # Create an ExponentialMovingAverage object
        ema = tf.train.ExponentialMovingAverage(decay=0.99, num_updates=global_step, zero_debias=True)
        # Create the shadow variables, and add op to maintain moving averages
        maintain_averages_op = ema.apply([loss_ctc])
        loss_ema = ema.average(loss_ctc)

        # Train op
        # --------
        learning_rate = tf.train.exponential_decay(parameters.learning_rate, global_step,
                                                   parameters.learning_decay_steps, parameters.learning_decay_rate,
                                                   staircase=True)

        if parameters.optimizer == 'ada':
            optimizer = tf.train.AdadeltaOptimizer(learning_rate)
        elif parameters.optimizer == 'adam':
            optimizer = tf.train.AdamOptimizer(learning_rate, beta1=0.5)
        elif parameters.optimizer == 'rms':
            optimizer = tf.train.RMSPropOptimizer(learning_rate)

        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        opt_op = optimizer.minimize(loss_ctc, global_step=global_step)
        with tf.control_dependencies(update_ops + [opt_op]):
            train_op = tf.group(maintain_averages_op)

        # Summaries
        # ---------
        tf.summary.scalar('learning_rate', learning_rate)
        tf.summary.scalar('losses/ctc_loss', loss_ctc)
    else:
        loss_ctc, train_op = None, None

    if mode in [tf.estimator.ModeKeys.EVAL, tf.estimator.ModeKeys.PREDICT, tf.estimator.ModeKeys.TRAIN]:
        with tf.name_scope('code2str_conversion'):
            keys = tf.cast(parameters.alphabet_decoding_codes, tf.int64)
            values = [c for c in parameters.alphabet_decoding]
            table_int2str = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(keys, values), '?')

            sparse_code_pred, log_probability = tf.nn.ctc_beam_search_decoder(training_dict['prob'],
                                                                              sequence_length=tf.cast(seq_len_inputs, tf.int32),
                                                                              merge_repeated=False,
                                                                              beam_width=100,
                                                                              top_paths=2)
            # Score
            training_dict['score'] = tf.subtract(log_probability[:, 0], log_probability[:, 1])
            prediction_dict['score'] = tf.subtract(log_probability[:, 0], log_probability[:, 1])
            # around 10.0 -> seems pretty sure, less than 5.0 bit unsure, some errors/challenging images
            sparse_code_pred = sparse_code_pred[0]

            sequence_lengths_pred = tf.bincount(tf.cast(sparse_code_pred.indices[:, 0], tf.int32),
                                                minlength=tf.shape(training_dict['prob'])[1])

            pred_chars = table_int2str.lookup(sparse_code_pred)
            training_dict['words'] = get_words_from_chars(pred_chars.values, sequence_lengths=sequence_lengths_pred)
            prediction_dict['words'] = get_words_from_chars(pred_chars.values, sequence_lengths=sequence_lengths_pred)

            tf.summary.text('predicted_words', training_dict['words'][:10])

    # Evaluation ops
    # --------------
    if mode == tf.estimator.ModeKeys.EVAL:
        with tf.name_scope('evaluation'):
            CER = tf.metrics.mean(tf.edit_distance(sparse_code_pred, tf.cast(sparse_code_target, dtype=tf.int64)), name='CER')

            # Convert label codes to decoding alphabet to compare predicted and ground-truth words
            target_chars = table_int2str.lookup(tf.cast(sparse_code_target, tf.int64))
            target_words = get_words_from_chars(target_chars.values, seq_lengths_labels)
            accuracy = tf.metrics.accuracy(target_words, training_dict['words'], name='accuracy')

            eval_metric_ops = {
                               'eval/accuracy': accuracy,
                               'eval/CER': CER,
                               }
            CER = tf.Print(CER, [CER], message='-- CER : ')
            accuracy = tf.Print(accuracy, [accuracy], message='-- Accuracy : ')

    else:
        eval_metric_ops = None

    export_outputs = {'predictions': tf.estimator.export.PredictOutput(training_dict)}

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions=prediction_dict,
            loss=loss_ctc,
            train_op=train_op,
            eval_metric_ops=eval_metric_ops,
            export_outputs=export_outputs,
            scaffold=tf.train.Scaffold()
            # scaffold=tf.train.Scaffold(init_fn=None)  # Specify init_fn to restore from previous model
        )
    else:
        return tf.estimator.EstimatorSpec(
            mode=mode,
            predictions=training_dict,
            loss=loss_ctc,
            train_op=train_op,
            eval_metric_ops=eval_metric_ops,
            export_outputs=export_outputs,
            scaffold=tf.train.Scaffold()
            # scaffold=tf.train.Scaffold(init_fn=None)  # Specify init_fn to restore from previous model
        )

what2help commented 6 years ago

Hi @jzhongaa, thanks for your quick reply. Now inference is running, but with 6 images in the CSV file it goes into an infinite loop on Ubuntu.

jzhongaa commented 6 years ago

@what2help May I have a look at your code for prediction?

what2help commented 6 years ago

@jzhongaa It is the same code that you provided above.

ghost commented 6 years ago

@what2help Here is my prediction code. It works for me.

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-ft', '--csv_files_train',  type=str, help='CSV filename for training',
                        nargs='*', default='./frank_train.csv')
    parser.add_argument('-fe', '--csv_files_eval', type=str, help='CSV filename for evaluation',
                        nargs='*', default='./image_test.csv')
    parser.add_argument('-o', '--output_model_dir',  type=str,
                        help='Directory for output', default='./estimator')
    parser.add_argument('-n', '--nb-epochs', type=int, default=30, help='Number of epochs')
    parser.add_argument('-g', '--gpu', type=str, help="GPU 0,1 or '' ", default='0')
    parser.add_argument('-p', '--params-file', type=str, help='Parameters filename', default=None)
    args = vars(parser.parse_args())

    if args.get('params_file'):
        dict_params = import_params_from_json(json_filename=args.get('params_file'))
        parameters = Params(**dict_params)
    else:
        parameters = Params(train_batch_size=128,
                            eval_batch_size=128,
                            learning_rate=1e-3,  # 1e-3 recommended
                            learning_decay_rate=0.95,
                            learning_decay_steps=5000,
                            evaluate_every_epoch=5,
                            save_interval=5e3,
                            input_shape=(32, 304),
                            optimizer='adam',
                            digits_only=False,
                            alphabet=Alphabet.LETTERS_DIGITS_EXTENDED,
                            alphabet_decoding='same',
                            csv_delimiter=' ',
                            csv_files_eval=args.get('csv_files_eval'),
                            csv_files_train=args.get('csv_files_train'),
                            output_model_dir=args.get('output_model_dir'),
                            n_epochs=args.get('nb_epochs'),
                            gpu=args.get('gpu')
                            )

    model_params = {
        'Params': parameters,
    }

    parameters.export_experiment_params()

    os.environ['CUDA_VISIBLE_DEVICES'] = parameters.gpu
    config_sess = tf.ConfigProto()
    config_sess.gpu_options.per_process_gpu_memory_fraction = 0.8
    config_sess.gpu_options.allow_growth = True

    # Config estimator
    est_config = tf.estimator.RunConfig()
    est_config.replace(keep_checkpoint_max=10,
                       save_checkpoints_steps=parameters.save_interval,
                       session_config=config_sess,
                       save_checkpoints_secs=None,
                       save_summary_steps=1000,
                       model_dir=parameters.output_model_dir)

    estimator = tf.estimator.Estimator(model_fn=crnn_fn,
                                       params=model_params,
                                       model_dir=parameters.output_model_dir,
                                       )

    # Count number of image filenames in csv
    #n_samples = 0
    #csvfile=open(parameters.csv_files_eval, 'r', encoding='utf8')
    #reader = csv.reader(csvfile)
    predictResults = estimator.predict(input_fn=data_loader(csv_filename=parameters.csv_files_eval,
                                                            params=parameters,
                                                            batch_size=1,
                                                            num_epochs=1,
                                                            data_augmentation=False,
                                                            image_summaries=False))
    for i, prediction in enumerate(predictResults):
        # print("Prediction %s: %s" % (i + 1, prediction))
        print(prediction["words"])

jzhongaa commented 6 years ago

@what2help The code below works for me.


import os, time
try:
    import better_exceptions
except ImportError:
    pass

import tensorflow as tf
from src.model import crnn_fn
from src.data_handler import data_loader

from src.config import Params, Alphabet

target_folder='./output_numbers/test.csv'
model_folder='./estimator/'

with open(target_folder, 'r', encoding='utf8') as csvfile:
    n_samples =len(csvfile.readlines())

batch_size=128
#if n_samples<5000:
#    batch_size=n_samples

parameters = Params(train_batch_size=128,
                        eval_batch_size=batch_size,
                        learning_rate=1e-3,  # 1e-3 recommended
                        learning_decay_rate=0.95,
                        learning_decay_steps=5000,
                        evaluate_every_epoch=5,
                        save_interval=5e3,
                        input_shape=(32, 304),
                        optimizer='adam',
                        digits_only=True,
                        alphabet=Alphabet.LETTERS_DIGITS_EXTENDED,
                        alphabet_decoding='same',
                        csv_delimiter=';',
                        csv_files_eval='./output_numbers/testlabels_abs.csv',
                        csv_files_train='./output_numbers/trainlabels_abs.csv',
                        output_model_dir=model_folder,
                        n_epochs=30,
                        gpu=''
                        )

model_params = {
    'Params': parameters,
}

parameters.export_experiment_params()

os.environ['CUDA_VISIBLE_DEVICES'] = parameters.gpu
config_sess = tf.ConfigProto()
config_sess.gpu_options.per_process_gpu_memory_fraction = 0.8
config_sess.gpu_options.allow_growth = False

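# Config estimator
# Note: RunConfig.replace returns a new RunConfig rather than modifying est_config in place.
# The return value is not kept here, so est_config still holds the default settings.
est_config = tf.estimator.RunConfig()
est_config.replace(keep_checkpoint_max=10,
                   save_checkpoints_steps=parameters.save_interval,
                   session_config=config_sess,
                   save_checkpoints_secs=None,
                   save_summary_steps=1000,
                   model_dir=parameters.output_model_dir)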

estimator = tf.estimator.Estimator(model_fn=crnn_fn,
                                   params=model_params,
                                   model_dir=parameters.output_model_dir,
                                   config=est_config
                                   )

t1 = time.time()
result = estimator.predict(input_fn=data_loader(csv_filename=target_folder,
                                                params=parameters,
                                                batch_size=min(n_samples, parameters.eval_batch_size),
                                                num_epochs=1))
# Drain the prediction generator into a list
result_list = []
for prediction in result:
    result_list.append(prediction)
print(time.time() - t1, n_samples)

# Print the predictions sorted by file name
print(sorted(result_list, key=lambda r: r['filenames']))

what2help commented 6 years ago

Thanks, @jzhongaa and @junjie725. Both scripts are working.
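jzhongaa's script above sorts the collected predictions by their 'filenames' entry. A minimal sketch of that step follows, assuming each prediction is a dict and that string values may come back as `bytes` (as string tensors typically do in TensorFlow 1.x under Python 3). The helper name `sort_predictions` and the sample data are hypothetical.

```python
def sort_predictions(predictions):
    """Return predictions sorted by file name, accepting bytes or str names."""
    def name(prediction):
        value = prediction["filenames"]
        # Decode bytes so mixed bytes/str inputs still compare cleanly.
        return value.decode("utf8") if isinstance(value, bytes) else value

    return sorted(predictions, key=name)


# Hypothetical predictions, in the arbitrary order they were produced.
results = [
    {"filenames": b"img_2.png", "words": b"4711"},
    {"filenames": b"img_1.png", "words": b"0815"},
]
ordered = sort_predictions(results)
first_name = ordered[0]["filenames"]
```

Sorting after collection keeps the prediction loop simple and makes the printed output easy to compare against the rows of the CSV file.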

I have one more question related to prediction. When I try to predict with an exported model, it takes only one image, not a batch. I tried changing the export file so that the input placeholder accepts images in batches, but I get an error in the model.py file. Did you try running images in batches with the exported model? Thanks.

jzhongaa commented 6 years ago

@what2help I encountered the same problem with the exported model. I used the estimator for prediction with images in batches.