MaybeShewill-CV / CRNN_Tensorflow

Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition
MIT License
1.03k stars 388 forks

multi gpu for test accuracy #398

Closed zgsxwsdxg closed 4 years ago

zgsxwsdxg commented 4 years ago

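Hello, author! For multi-GPU training, I would like to compute the test accuracy after each training epoch. Where in the code would be the best place to make that change? Thanks!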

MaybeShewill-CV commented 4 years ago

@zgsxwsdxg Refer to the function used for single-GPU training: https://github.com/MaybeShewill-CV/CRNN_Tensorflow/blob/0c352335471088714586a0b11cf0d7818226dcf0/tools/train_shadownet.py#L290-L299 :)

zgsxwsdxg commented 4 years ago

@zgsxwsdxg Refer to the function used for single-GPU training:

  CRNN_Tensorflow/tools/train_shadownet.py, lines 290 to 299 in 0c35233

    if need_decode and epoch % 500 == 0:
        # train part
        _, train_ctc_loss_value, train_seq_dist_value, \
            train_predictions, train_labels_sparse, merge_summary_value = sess.run(
                [optimizer, train_ctc_loss, train_sequence_dist,
                 train_decoded, train_labels, merge_summary_op])
        train_labels_str = decoder.sparse_tensor_to_str(train_labels_sparse)
        train_predictions = decoder.sparse_tensor_to_str(train_predictions[0])
        avg_train_accuracy = evaluation_tools.compute_accuracy(train_labels_str, train_predictions)

:)

Hello, this is what I did: the test data is predicted on every GPU, and I use val_infer_rets to store each GPU's predictions. But how do I sess.run() this val_infer_rets? Also, how do I get val_labels at run time (I need to compare the predictions against val_labels)?

Here I define val_infer_rets, a list that stores each GPU's predictions:

    val_infer_rets = []
    with tf.variable_scope(tf.get_variable_scope()):
        is_network_initialized = False
        for i in range(CFG.TRAIN.GPU_NUM):
            with tf.device('/gpu:{:d}'.format(i)):
                with tf.name_scope('tower_{:d}'.format(i)) as _:
                    train_images = train_samples[i][0]
                    train_labels = train_samples[i][1]
                    train_loss, grads, _ = compute_net_gradients(
                        train_images, train_labels, shadownet, optimizer,
                        is_net_first_initialized=is_network_initialized)

                    is_network_initialized = True

                    # Only use the mean and var in the first gpu tower to update the parameter
                    if i == 0:
                        batchnorm_updates = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
                        train_summary_op_updates = tf.get_collection(tf.GraphKeys.SUMMARIES)

                    tower_grads.append(grads)
                    train_tower_loss.append(train_loss)
                with tf.name_scope('validation_{:d}'.format(i)) as _:
                    val_images = val_samples[i][0]
                    val_labels = val_samples[i][1]
                    val_loss, _, infer_ret = compute_net_gradients(
                        val_images, val_labels, shadownet_val, optimizer,
                        is_net_first_initialized=is_network_initialized)
                    val_tower_loss.append(val_loss)
                    # Here I added the following line to save each GPU's predictions
                    val_infer_rets.append(infer_ret)

MaybeShewill-CV commented 4 years ago

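@zgsxwsdxg Save the infer ret from each GPU, then run them in a separate sess.run call to get each GPU's predictions. After that, decode them and compare the results against your labels :)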
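A rough sketch of the comparison step suggested above, offered as an illustration rather than the repository's own code. In TensorFlow 1.x, `sess.run` accepts a nested list of fetches, so a list such as `val_infer_rets` can be passed directly and the results come back in the same per-GPU structure. One thing worth checking: if the validation inputs come from a pipeline that advances on every run, the labels should probably be fetched in the same `sess.run` call as the predictions so both belong to the same batch. The names `val_labels_list`, `flatten_towers` and `full_sequence_accuracy` are hypothetical and not part of CRNN_Tensorflow; the hard-coded strings stand in for the output of `decoder.sparse_tensor_to_str`.

```python
from typing import List, Sequence


def flatten_towers(per_gpu: Sequence[Sequence[str]]) -> List[str]:
    """Concatenate per-GPU lists of strings into one flat list, in GPU order."""
    return [text for tower in per_gpu for text in tower]


def full_sequence_accuracy(labels: Sequence[str], preds: Sequence[str]) -> float:
    """Fraction of samples whose predicted string equals the label exactly."""
    if len(labels) != len(preds):
        raise ValueError('labels and predictions differ in length')
    if not labels:
        return 0.0
    correct = sum(1 for gt, pred in zip(labels, preds) if gt == pred)
    return correct / len(labels)


# In the real graph the two nested lists would come from a single call such as
#   infer_vals, label_vals = sess.run([val_infer_rets, val_labels_list])
# (val_labels_list is a hypothetical list filled the same way as
# val_infer_rets), followed by decoding each tower's entry to strings.
# Here they are hard-coded stand-ins for two GPUs with two samples each.
decoded_preds = [['hello', 'world'], ['crnn', 'texl']]
decoded_labels = [['hello', 'world'], ['crnn', 'text']]

val_accuracy = full_sequence_accuracy(
    flatten_towers(decoded_labels), flatten_towers(decoded_preds))
print('val accuracy: {:.2f}'.format(val_accuracy))
```

This sketch counts a sample as correct only when the whole string matches; the repository's `evaluation_tools.compute_accuracy` could be used in its place once the per-GPU lists are flattened.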