BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

PyCaffe: slowdown of GPU acceleration (CPU is ok) #6031

Open AleximusOrloff opened 7 years ago

AleximusOrloff commented 7 years ago

Hi all, sorry, but I insist that issue #6022 is still open.

Issue summary

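With GPU acceleration, the speed of a forward pass depends on the size of the previously processed image: after a large image has been passed through the net, small images run much slower than before.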

Steps to reproduce

import time
import numpy as np

# Net is assumed to be an already loaded caffe.Net running in GPU mode.

def time_forward(h, w, label, n=100):
    # Build a dummy HxWx3 image and convert it to a 1x3xWxH batch.
    im_data = np.empty([h, w, 3])
    print(im_data.shape)
    im_data = np.swapaxes(im_data, 0, 2)
    im_data = np.array([im_data], dtype=np.float32)
    Net.blobs['data'].reshape(1, 3, w, h)
    Net.blobs['data'].data[...] = im_data
    # Time n forward passes on this input.
    start_time = time.time()
    for _ in range(n):
        out = Net.forward()
    print("Time to pass %s = %f" % (label, time.time() - start_time))

time_forward(16, 16, "16x16")
time_forward(512, 512, "512x512")
time_forward(16, 16, "16x16 once again")

Result:
(16, 16, 3)
Time to pass 16x16 = 0.139678
(512, 512, 3)
Time to pass 512x512 = 1.097116
(16, 16, 3)
Time to pass 16x16 once again = 0.900323
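
To put the reported numbers in perspective, here is a small sketch that converts them to per-pass times and a slowdown factor. The three totals are copied from the result above; the variable and function names are only illustrative.

```python
# Total wall-clock seconds for 100 forward passes, copied from the result above.
N_PASSES = 100
t_small_first = 0.139678  # 16x16, before the 512x512 run
t_large = 1.097116        # 512x512
t_small_again = 0.900323  # 16x16, after the 512x512 run

def per_pass_ms(total_seconds, n=N_PASSES):
    """Average time of a single forward pass in milliseconds."""
    return 1000.0 * total_seconds / n

# How much slower the second 16x16 run is compared to the first one.
slowdown = t_small_again / t_small_first

print("16x16 first: %.2f ms/pass" % per_pass_ms(t_small_first))
print("512x512:     %.2f ms/pass" % per_pass_ms(t_large))
print("16x16 again: %.2f ms/pass" % per_pass_ms(t_small_again))
print("slowdown:    %.1fx" % slowdown)
```

In other words, the second 16x16 run costs almost as much per pass as the 512x512 run, even though the input is back to its original size.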

Noiredd commented 7 years ago

Now this is interesting. I managed to reduce your example to this:

import os
os.environ['GLOG_minloglevel'] = '2'
import caffe
import time
import numpy as np

caffe.set_device(0)
caffe.set_mode_gpu()

def run_shape(net, h, w=None, B=4, N=100):
    # Time N forward passes on a B x 3 x w x h input (square if w is omitted).
    if w is None:
        w = h
    im_data = np.zeros((3, w, h), dtype=np.float32)
    net.blobs['data'].reshape(B, 3, w, h)
    net.blobs['data'].data[0][...] = im_data
    net.forward()  # force reshape before time measurement
    start_time = time.time()
    for _ in range(N):
        net.forward()
    print("Time to pass %s = %f" % (im_data.shape, time.time() - start_time))

net = caffe.Net('n.proto.txt', caffe.TEST)
run_shape(net, 16)
net.blobs['data'].reshape(4,3,512,512)
net.reshape()
run_shape(net, 16)

Run it with the following simple net: n.proto.txt.
I get:
Time to pass (3, 16, 16) = 0.018211
Time to pass (3, 16, 16) = 1.055989
The only difference between the two runs is that the network was reshaped to 512x512 and back between the forward passes.

This is definitely not the behavior I would expect to see...
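
For anyone who wants to repeat the measurement, the timing pattern used in this thread (one untimed warm-up call, then N timed calls) can be sketched without any Caffe dependency. The helper name `time_calls` and the stand-in workload are hypothetical; with pycaffe one would presumably pass `net.forward` of an already loaded net as `fn`.

```python
import time

def time_calls(fn, n=100, warmup=1):
    """Return total wall-clock seconds for n calls of fn.

    The warmup calls run first and are not timed, so one-off costs
    (such as a reshape triggered by the first call) stay out of the result.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start

# Stand-in workload; replace it with the call to be measured.
elapsed = time_calls(lambda: sum(range(1000)), n=100)
print("Time for 100 calls = %f" % elapsed)
```

This only measures wall-clock time on the calling thread, exactly like the snippets above; it makes no assumption about whether the measured call has finished its GPU work when it returns.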