kwotsin / TensorFlow-ENet

TensorFlow implementation of ENet
MIT License
257 stars 123 forks

Please help me: how do I predict on video frames? #27

Closed electronicYH closed 6 years ago

electronicYH commented 6 years ago

I modified the file predict_segmentation.py to predict on video frames, but I ran into a problem. Please help me, thanks! Here is the error:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
python pre_video.py
/home/yh/.conda/envs/yh/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "pre_video.py", line 122, in <module>
    skip_connections=skip_connections)
  File "/home/yh/Work/TensorFlow-ENet/enet.py", line 464, in ENet
    pooling_indices=pooling_indices_2, output_shape=inputs_shape_2, scope=bottleneck_scope_name+'_0')
  File "/home/yh/.conda/envs/yh/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
    return func(*args, **current_args)
  File "/home/yh/Work/TensorFlow-ENet/enet.py", line 321, in bottleneck
    net_unpool = unpool(net_unpool, pooling_indices, output_shape=output_shape, scope='unpool')
  File "/home/yh/Work/TensorFlow-ENet/enet.py", line 101, in unpool
    y = mask // (output_shape[2] * output_shape[3])
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Here is my modified code:

import tensorflow as tf
import os
import matplotlib.pyplot as plt
from enet import ENet, ENet_arg_scope
from preprocessing import preprocess
from scipy.misc import imsave
import numpy as np
import cv2
import time

slim = tf.contrib.slim

# Specify the GPU
os.environ["CUDA_VISIBLE_DEVICES"]="1"

W = 500
H = 400

batch_size = 1
num_classes = 2

path = '/home/yh/data/dataset/zhijia/'
filename = 'day.avi' #'20170920_124050_872.avi'
cap = cv2.VideoCapture(path+filename)

size=(W,H)

fourcc = cv2.VideoWriter_fourcc('M','J','P','G') #opencv3.0

checkpoint_dir = "./checkpoint_mfb"
checkpoint = tf.train.latest_checkpoint(checkpoint_dir)

num_initial_blocks = 1
skip_connections = False
stage_two_repeat = 2

'''
Labels to colours are obtained from here:
https://github.com/alexgkendall/SegNet-Tutorial/blob/c922cc4a4fcc7ce279dd998fb2d4a8703f34ebd7/Scripts/test_segmentation_camvid.py

However, the road_marking class is collapsed into the road class in the dataset provided.

Classes:

Sky = [128,128,128]
Building = [128,0,0]
Pole = [192,192,128]
Road_marking = [255,69,0]
Road = [128,64,128]
Pavement = [60,40,222]
Tree = [128,128,0]
SignSymbol = [192,128,128]
Fence = [64,64,128]
Car = [64,0,128]
Pedestrian = [64,64,0]
Bicyclist = [0,128,192]
Unlabelled = [0,0,0]
'''
label_to_colours = {0: [0,0,0],
                    1: [0,128,0],
                    2: [192,192,128],
                    3: [128,64,128],
                    4: [60,40,222],
                    5: [128,128,0],
                    6: [192,128,128],
                    7: [64,64,128],
                    8: [64,0,128],
                    9: [64,64,0],
                    10: [0,128,192],
                    11: [128,128,128]}

# Create the photo directory
photo_dir = checkpoint_dir + "/test_images"
if not os.path.exists(photo_dir):
    os.mkdir(photo_dir)

def vis_segmentation(image, seg_map):
    """Visualizes input image, segmentation map and overlay view."""
    image_width, image_height = image.size
    colored_label = label_to_color_image(seg_map).astype(np.uint8)
    image_empty = np.zeros((image_height,2*image_width,3),np.uint8)
    image_empty[:image_height,:image_width] = image.copy()
    image_empty[:image_height,image_width:] = colored_label.copy()

    alpha = 0.35
    beta = 1-alpha
    gamma = 0
    img_add = cv2.addWeighted(np.array(image), alpha, seg_map, beta, gamma)
    return img_add

# Create a function to convert each pixel label to colour.
def grayscale_to_colour(image):
    print('Converting image...')
    image = image.reshape((H, W, 1))
    image = np.repeat(image, 3, axis=-1)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            label = int(image[i][j][0])
            image[i][j] = np.array(label_to_colours[label])

    return image

def model_run(image):

    return predictions

with tf.Graph().as_default() as graph:

    image_tensor = tf.placeholder(tf.float32, [None, None, 3])
    images = tf.expand_dims(image_tensor,0)

    #Create the model inference
    with slim.arg_scope(ENet_arg_scope()):
        logits, probabilities = ENet(images,
                                     num_classes=num_classes,
                                     batch_size=batch_size,
                                     is_training=False,
                                     reuse=None,
                                     num_initial_blocks=num_initial_blocks,
                                     stage_two_repeat=stage_two_repeat,
                                     skip_connections=skip_connections)

    variables_to_restore = slim.get_variables_to_restore()
    saver = tf.train.Saver(variables_to_restore)
    def restore_fn(sess):
        return saver.restore(sess, checkpoint)

    predictions = tf.argmax(probabilities, -1)
    predictions = tf.cast(predictions, tf.float32)
    print('HERE', predictions.get_shape())

    sv = tf.train.Supervisor(logdir=None, init_fn=restore_fn)

    with sv.managed_session() as sess:
        now = 0.0
        while(cap.isOpened()):
            ret, frame = cap.read()
            if (ret == False):
                print('~~~~~~~~~~~~~~~~~~did not get any frame~~~~~~~~~~~~~~~~~~')
                break
            image = frame.copy()
            image = np.asarray(image, np.float32)/255
            print('~~~~~~~~~~~~~~',sess.run(image_tensor))
            segmentations = sess.run(predictions, feed_dict={image_tensor:image})

            #cv2.imshow('pre',segmentations[0])
            #cv2.waitKey(0)

            #T = time.time() - now
            #print(int(1/T))
            #now = time.time()

cap.release()
cv2.destroyAllWindows()

===============================================================================

electronicYH commented 6 years ago

I have fixed it. The problem was the placeholder: its height and width have to be set to H and W instead of None.
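For anyone hitting the same traceback, here is a minimal sketch of why a None dimension breaks the graph construction. It does not use TensorFlow. The function `row_from_flat_index` is a hypothetical stand-in for the arithmetic shown on line 101 of enet.py in the traceback, and it assumes that `output_shape` is a plain Python list taken from the tensor's static shape, so unknown dimensions show up as None.

```python
# Hypothetical stand-in for the expression in the traceback:
#     y = mask // (output_shape[2] * output_shape[3])
def row_from_flat_index(flat_index, output_shape):
    # output_shape is assumed to be [batch, height, width, channels]
    # as a plain Python list, where unknown dimensions are None.
    return flat_index // (output_shape[2] * output_shape[3])

# Static shape when height and width are left as None in the placeholder (assumed).
unknown_shape = [1, None, None, 3]
# Static shape when the placeholder is declared with H = 400 and W = 500.
known_shape = [1, 400, 500, 3]

try:
    row_from_flat_index(4500, unknown_shape)
    error_message = None
except TypeError as exc:
    # None * int cannot be evaluated in Python, which matches the traceback.
    error_message = str(exc)

print(error_message)

# With concrete dimensions the same arithmetic works.
known_row = row_from_flat_index(4500, known_shape)
print(known_row)
```

In the script above, this corresponds to declaring the placeholder as `tf.placeholder(tf.float32, [H, W, 3])`, which also means every frame fed to it has to have exactly that height and width.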

zmqp111 commented 6 years ago

Hi electronicYH,

I tried to run video prediction using your code, but it failed.

I got this error:

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder' with dtype float and shape [480,360,3]
  [[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[480,360,3], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
  [[Node: Placeholder/_2837 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4_Placeholder", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I have no idea why I got this error.

Do you have any idea?

Please help me.

Thanks.
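One possible cause, assuming the script was reused unchanged: the debug line `print('~~~~~~~~~~~~~~',sess.run(image_tensor))` evaluates the placeholder without a `feed_dict`, and any `sess.run` call that depends on a placeholder needs a value fed for it. Removing that line, or passing `feed_dict={image_tensor: image}` to it, may be enough. The fed array also has to match the placeholder shape in the error message, [480,360,3]. The sketch below shows one way to prepare such an array with NumPy only. `prepare_frame` is a hypothetical helper, not part of the repository, and it uses a simple nearest-neighbour resize.

```python
import numpy as np

# Shape taken from the error message above: placeholder shape [480,360,3].
TARGET_H, TARGET_W = 480, 360

def prepare_frame(frame, height=TARGET_H, width=TARGET_W):
    """Hypothetical helper: nearest-neighbour resize to (height, width, 3), scaled to [0, 1]."""
    src_h, src_w = frame.shape[:2]
    # For each output row and column, pick the source row and column it maps to.
    rows = np.arange(height) * src_h // height
    cols = np.arange(width) * src_w // width
    resized = frame[rows[:, None], cols[None, :]]
    return resized.astype(np.float32) / 255.0

# Stand-in for a frame returned by cap.read(): uint8 array of shape (H, W, 3).
fake_frame = np.full((400, 500, 3), 255, dtype=np.uint8)
prepared = prepare_frame(fake_frame)
print(prepared.shape, prepared.dtype)
```

The prepared array would then be passed as `sess.run(predictions, feed_dict={image_tensor: prepared})`. In a real script, `cv2.resize(frame, (width, height))` would probably be the more usual way to resize the frame.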