infocusp / tf_cnnvis

CNN visualization tool in TensorFlow
MIT License
781 stars · 208 forks

can't utilize GPU to accelerate computation. #55

Open yb70 opened 6 years ago

yb70 commented 6 years ago

Hi, I tried to visualize the InceptionV4 model layers by feeding it an MRI image. Everything works well except that the GPU does not seem to be involved in the computation: TensorFlow does allocate graphics memory for the process, but GPU utilization stays at 0%, and most of the time only one CPU core is working. How can I use the GPU to accelerate this? Below is my code.

```python
import tensorflow as tf
from tf_cnnvis import *
from nets.inception_v4 import inception_v4_base, inception_v4_arg_scope
import matplotlib.image as mpimg
import numpy as np

slim = tf.contrib.slim

if __name__ == '__main__':
    X = tf.placeholder(tf.float32, [None, 160, 160, 3])
    img = mpimg.imread('data/image.png')
    img = img[42:202, 3:163]
    img = np.stack([img, img, img], axis=2)
    img = np.reshape(img, [1, 160, 160, 3])

    with slim.arg_scope(inception_v4_arg_scope()):
        net_out = inception_v4_base(inputs=X)

    t_vars = tf.trainable_variables()
    IV4_vars = [var for var in t_vars if var.name.startswith('InceptionV4')]

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver(var_list=IV4_vars)
        saver.restore(sess,
                      'checkpoint/inception_v4_2016_09_09/inception_v4.ckpt')

        _ = deconv_visualization(
            sess_graph_path=sess,
            value_feed_dict={X: img},
            layers=['r', 'p', 'c'],
            path_logdir='summary/cnnvis/log',
            path_outdir='summary/cnnvis/out')
```

Thanks for the excellent work : )
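As a side note, the preprocessing steps in the code above (crop the slice to 160x160, replicate the grayscale channel to fake RGB, add a batch dimension) can be sanity-checked with plain NumPy. A minimal sketch, using a random array as a stand-in for the MRI slice loaded with `mpimg.imread`:

```python
import numpy as np

# Hypothetical stand-in for the grayscale MRI slice; the real code
# loads it with matplotlib.image.imread('data/image.png').
img = np.random.rand(224, 224).astype(np.float32)

img = img[42:202, 3:163]                 # crop rows 42:202, cols 3:163 -> (160, 160)
img = np.stack([img, img, img], axis=2)  # replicate gray channel -> (160, 160, 3)
img = np.reshape(img, [1, 160, 160, 3])  # add batch dimension for the placeholder
```

After these steps the array matches the `[None, 160, 160, 3]` placeholder shape.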

jidebingfeng commented 6 years ago

I have the same problem!

Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/conv1/Conv2D layer. Time taken = 7.082620 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_1/bottleneck_v1/conv1/Conv2D layer. Time taken = 123.632443 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_9/bottleneck_v1/conv2/Conv2D layer. Time taken = 111.493410 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_17/bottleneck_v1/conv3/Conv2D layer. Time taken = 602.894801 s
falaktheoptimist commented 6 years ago

Here is how we currently compute the deconvolution: for each layer whose deconvolution output we want, we run one forward pass. Then we run backward passes for 8 channels of the layer in parallel, then for the next 8 feature maps, and so on. The value 8 was chosen to account for memory limitations. Also, there is no learning happening here, so we cannot exploit the GPU the way an optimizer does with repeated operations on the same data, as when learning weights. Do suggest if you see other possible solutions.
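The channel-chunking scheme described above can be sketched in plain Python. This is a hypothetical helper for illustration, not the actual tf_cnnvis code; it just shows how the feature maps of a layer would be split into groups of at most 8 for the parallel backward passes:

```python
def channel_batches(n_channels, batch_size=8):
    """Yield groups of channel indices, at most `batch_size` per group,
    mirroring the described 8-channels-at-a-time deconvolution loop
    (hypothetical helper for illustration)."""
    for start in range(0, n_channels, batch_size):
        yield list(range(start, min(start + batch_size, n_channels)))

# e.g. a conv layer with 20 feature maps -> groups of sizes 8, 8, 4;
# each group would be one batched backward pass.
groups = list(channel_batches(20))
```

The memory/throughput trade-off lives in `batch_size`: a larger value packs more backward passes into one GPU launch but needs proportionally more memory for the reconstructed inputs.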

yb70 commented 6 years ago

OK, thanks for the reply.
