zgsxwsdxg closed this issue 4 years ago
@zgsxwsdxg Take a look at the single-GPU training function for reference.
CRNN_Tensorflow/tools/train_shadownet.py, lines 290 to 299 in 0c35233:

```python
if need_decode and epoch % 500 == 0:
    # train part
    _, train_ctc_loss_value, train_seq_dist_value, \
        train_predictions, train_labels_sparse, merge_summary_value = sess.run(
            [optimizer, train_ctc_loss, train_sequence_dist,
             train_decoded, train_labels, merge_summary_op])

    train_labels_str = decoder.sparse_tensor_to_str(train_labels_sparse)
    train_predictions = decoder.sparse_tensor_to_str(train_predictions[0])
    avg_train_accuracy = evaluation_tools.compute_accuracy(train_labels_str, train_predictions)
```
:)
Hello, here is what I did: the validation data is predicted on every GPU, and I use `val_infer_rets` to store each GPU's predictions. But how do I `sess.run()` this `val_infer_rets`? Also, how can I get `val_labels` at run time (I need to compare the predictions against `val_labels`)?

```python
val_infer_rets = []
with tf.variable_scope(tf.get_variable_scope()):
    is_network_initialized = False
    for i in range(CFG.TRAIN.GPU_NUM):
        with tf.device('/gpu:{:d}'.format(i)):
            with tf.name_scope('tower_{:d}'.format(i)) as _:
                train_images = train_samples[i][0]
                train_labels = train_samples[i][1]
                train_loss, grads, _ = compute_net_gradients(
                    train_images, train_labels, shadownet, optimizer,
                    is_net_first_initialized=is_network_initialized)

                is_network_initialized = True

                # Only use the mean and var in the first gpu tower to update the parameter
                if i == 0:
                    batchnorm_updates = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
                    train_summary_op_updates = tf.get_collection(tf.GraphKeys.SUMMARIES)

                tower_grads.append(grads)
                train_tower_loss.append(train_loss)
            with tf.name_scope('validation_{:d}'.format(i)) as _:
                val_images = val_samples[i][0]
                val_labels = val_samples[i][1]
                val_loss, _, infer_ret = compute_net_gradients(
                    val_images, val_labels, shadownet_val, optimizer,
                    is_net_first_initialized=is_network_initialized)
                val_tower_loss.append(val_loss)
                # Added here: save the prediction result from each GPU
                val_infer_rets.append(infer_ret)
```
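@zgsxwsdxg Save the infer ret from each GPU, then run them in a separate `sess.run` call to get each GPU's predictions. After that, decode them and compare the result with your labels :)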
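For illustration, the comparison step might look roughly like the sketch below. It assumes the per-GPU predictions and labels were fetched together in one `sess.run` call and already decoded to Python strings (the single-GPU script does this with `decoder.sparse_tensor_to_str`). The helper `exact_match_accuracy`, the list name `val_labels_list`, and the sample strings are hypothetical. Exact-match accuracy is used here as a stand-in, since the metric behind `evaluation_tools.compute_accuracy` is not shown in this thread.

```python
from typing import Sequence


def exact_match_accuracy(per_gpu_labels: Sequence[Sequence[str]],
                         per_gpu_preds: Sequence[Sequence[str]]) -> float:
    """Fraction of samples, pooled over all GPUs, whose prediction equals its label."""
    if len(per_gpu_labels) != len(per_gpu_preds):
        raise ValueError('labels and predictions must cover the same number of GPUs')
    total = 0
    correct = 0
    for labels, preds in zip(per_gpu_labels, per_gpu_preds):
        if len(labels) != len(preds):
            raise ValueError('each GPU needs one prediction per label')
        total += len(labels)
        correct += sum(1 for label, pred in zip(labels, preds) if label == pred)
    # An empty validation batch yields 0.0 rather than a division error.
    return correct / total if total else 0.0


# In the training loop the fetch would be along these lines (not executed here):
#   infer_values, label_values = sess.run([val_infer_rets, val_labels_list])
# where val_labels_list is a hypothetical list filled with
# val_labels_list.append(val_labels) next to val_infer_rets.append(infer_ret).
# Fetching both in the same call keeps predictions and labels from the same batch.

# Hypothetical decoded strings from two GPUs.
labels = [['hello', 'world'], ['tensor', 'flow']]
preds = [['hello', 'w0rld'], ['tensor', 'flow']]
accuracy = exact_match_accuracy(labels, preds)
print('validation accuracy: {:.2f}'.format(accuracy))
```

Fetching the labels in the same `sess.run` call as the predictions matters when the inputs come from a queue or dataset iterator, because a second call would pull a different batch.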
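Hello, author! For multi-GPU training, I would like to measure test accuracy after each training epoch. Where in the code would be the best place to add that? Thanks!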