simonmeister / UnFlow

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss
MIT License

"Converting sparse IndexedSlices to a dense Tensor of unknown shape. " #27

Closed gsaibro closed 6 years ago

gsaibro commented 6 years ago

Hi Simon,

I'm training your network on other datasets, and it takes around 1h20min per 5000 iterations. Is that normal? My setup: GTX 1080, CUDA 9.0, TensorFlow 1.7, Cudnn 7005, Python 3.5, Anaconda3.

I also get the following warning. Do you see it too?

/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

simonmeister commented 6 years ago

I think I got it a few times, and the training time doesn't seem unusual to me. On a single GPU, we often had to train for over 2 days for 500K iterations. Depending on your setup, training takes longer with higher image resolutions and larger batch sizes.
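For a rough comparison of the two timings quoted in this thread, the per-iteration rate can be extrapolated to a full run. This is only a back-of-the-envelope sketch: the function name `projected_days` is made up for illustration, and the two setups (GPU, resolution, batch size) are not known to be comparable.

```python
def projected_days(minutes_per_chunk, chunk_iterations, total_iterations):
    """Extrapolate a measured training rate to a full run, in days."""
    seconds_per_iteration = minutes_per_chunk * 60 / chunk_iterations
    return seconds_per_iteration * total_iterations / 86400


# Numbers quoted in the thread: 1h20min (80 minutes) per 5000 iterations,
# extrapolated to a 500K-iteration run.
days = projected_days(80, 5000, 500000)
print("{:.2f} s/iteration".format(80 * 60 / 5000))
print("{:.1f} days for 500K iterations".format(days))
```

At the reported rate, a 500K-iteration run would take between five and six days, so the image resolution and batch size are worth checking first.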

gsaibro commented 6 years ago

Thanks so much for your answer, your code is fantastic.

Could you answer a few questions?

How did you choose the loss weights? I have the impression that the smooth_2nd_weight/occ weights are quite high in comparison to the ternary/photo weights.

How did you choose the mini-batch size?

Do you have some results/impressions about the steps chosen? (large/small displacement)

I would be glad to see any numerical results you have on this.

Thanks, Güinther.

P.S.: I wrote a function that plots the flow as arrows on top of the source image, in case you are interested:

```python
import os

import cv2
import matplotlib.pyplot as plt
import numpy as np


def quiver_plot(image, flow_x, flow_y, output_name, output_dir=None, form='png'):
    '''Plot the image with the flow represented by colored arrows.

    Inputs
        image:        source image, 3-channel RGB numpy array (or path to an image file)
        flow_x:       horizontal flow, 1-channel numpy array (or path to an image file)
        flow_y:       vertical flow, 1-channel numpy array (or path to an image file)
        output_name:  file name to save as, without extension, string
        output_dir:   target folder; the working directory is used if None or missing
        form:         file extension / image format, string

    Outputs
        output_image: image with the flow arrows, 3-channel RGB numpy array
    '''
    # Load inputs from disk when paths are given (OpenCV reads color images as BGR).
    if isinstance(image, str):
        rgb_img = cv2.cvtColor(cv2.imread(image), cv2.COLOR_BGR2RGB)
    else:
        rgb_img = image
    # Flow files are assumed to be single-channel, hence IMREAD_UNCHANGED.
    if isinstance(flow_x, str):
        flow_x = cv2.imread(flow_x, cv2.IMREAD_UNCHANGED)
    if isinstance(flow_y, str):
        flow_y = cv2.imread(flow_y, cv2.IMREAD_UNCHANGED)

    # Work in float so that negation is safe, and flip the vertical component:
    # image rows grow downwards, while quiver (default angles='uv') draws
    # positive V upwards on screen.
    flow_x = np.asarray(flow_x, dtype=np.float32)
    flow_y = -np.asarray(flow_y, dtype=np.float32)

    file_name = output_name + '.' + form
    if output_dir is not None and os.path.isdir(output_dir):
        save_path = os.path.join(output_dir, file_name)
    else:
        save_path = file_name

    # Draw roughly 25 arrows along the shorter image side.
    arrows_per_side = 25
    quiver_step = max(1, min(rgb_img.shape[:2]) // arrows_per_side)
    height, width = rgb_img.shape[:2]
    X, Y = np.meshgrid(np.arange(width), np.arange(height))
    M = np.hypot(flow_x, flow_y)  # flow magnitude, used to color the arrows
    s = (slice(None, None, quiver_step), slice(None, None, quiver_step))

    fig = plt.figure(frameon=False, figsize=(width / 5, height / 5), dpi=1)
    plt.imshow(rgb_img)
    plt.quiver(X[s], Y[s], flow_x[s], flow_y[s], M[s], units='width')
    plt.savefig(save_path, bbox_inches='tight', dpi=10, pad_inches=0)
    plt.close(fig)

    # The saved figure is not exactly image-sized, so resize it back and overwrite it.
    saved_image = cv2.imread(save_path)
    res_image = cv2.resize(saved_image, (width, height), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(save_path, res_image)
    return cv2.cvtColor(res_image, cv2.COLOR_BGR2RGB)
```

simonmeister commented 6 years ago

Thanks, it's great if the code is helpful to some :) I just did a random search across a few relative weightings. I'm afraid I can't think of a more principled way than trying it out and checking whether the resulting flow fields are too noisy, too smooth, or have artifacts.

To reduce the need for hyperparameter searches, I used learning rates and mini-batch sizes similar to those in the FlowNet papers: a mini-batch size of 4 or 8 (depending on the image size of the dataset) and a similar learning rate schedule. I also think much larger mini-batches would be difficult to fit into GPU memory, as encoder/decoder networks need more memory than standard encoder-only (e.g. image classification) networks.
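A random search over relative loss weightings, as described above, could be sketched roughly as follows. This is not the procedure actually used for UnFlow: the weight names are modeled on the ones mentioned in this thread, and the search ranges, the log-uniform sampling, and the helper names `sample_weights` and `random_search` are all assumptions for illustration. Each sampled configuration would still have to be trained and its flow fields inspected by hand.

```python
import math
import random

# Hypothetical (low, high) search ranges; the keys mirror the weights
# discussed in this thread, the numeric ranges are placeholders.
SEARCH_SPACE = {
    "photo_weight": (0.1, 10.0),
    "ternary_weight": (0.1, 10.0),
    "smooth_2nd_weight": (0.1, 10.0),
    "occ_weight": (0.1, 10.0),
}


def sample_weights(rng, space=SEARCH_SPACE):
    """Draw one configuration, sampling each weight log-uniformly from its range."""
    return {
        name: math.exp(rng.uniform(math.log(low), math.log(high)))
        for name, (low, high) in space.items()
    }


def random_search(num_trials, seed=0):
    """Return `num_trials` randomly sampled weight configurations."""
    rng = random.Random(seed)
    return [sample_weights(rng) for _ in range(num_trials)]


for config in random_search(3):
    print({name: round(value, 3) for name, value in sorted(config.items())})
```

Sampling on a log scale gives every order of magnitude equal coverage, which suits weights whose useful values may differ by factors of ten.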