warmspringwinds / pytorch-cpp

Pytorch C++ Library

test speed #3

Open jjn037 opened 6 years ago

jjn037 commented 6 years ago

Have you tested the speed? I get a lower speed (30 ms/img) with resnet18, 224*224 input, batch size 1.

jjn037 commented 6 years ago

auto output_tensor = CPU(kByte).tensorFromBlob(data, {output_height, output_width, 3});

This line takes an abnormally long time.

warmspringwinds commented 6 years ago

Sorry for the late reply

@jjn037 This piece of code is slow because you transfer the data from the GPU to the CPU. This is usually an expensive operation and should be slow in the original PyTorch too.

It would be great if you could compare the timing of the C++ line with the PyTorch equivalent, output.cpu(), and see whether there is a significant difference in runtime.
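For anyone who wants to run that comparison, a minimal timing sketch in plain C++ might look like the following. The names mean_ms_per_call and time_stand_in_copy are hypothetical and not part of pytorch-cpp, and the buffer copy is only a stand-in workload; the lambda body would be replaced with the tensorFromBlob line or the forward pass. Because CUDA work is launched asynchronously, the GPU should be synchronized before the clock stops. Otherwise the cost of earlier kernels may be attributed to whichever line first waits on the result.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of pytorch-cpp): calls `fn` a few times to
// warm up, then returns the mean wall-clock time per call in milliseconds.
template <typename Fn>
double mean_ms_per_call(Fn&& fn, int warmup_iters, int timed_iters) {
    for (int i = 0; i < warmup_iters; ++i) {
        fn();  // warm-up calls are not timed
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timed_iters; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timed_iters;
}

// Stand-in workload (an assumption for illustration): copies a 224x224x3 byte
// buffer on the host. Swap the lambda body for the code under test.
double time_stand_in_copy() {
    const std::size_t num_bytes = 224 * 224 * 3;
    std::vector<unsigned char> src(num_bytes, 1);
    std::vector<unsigned char> dst(num_bytes, 0);
    return mean_ms_per_call(
        [&] { std::copy(src.begin(), src.end(), dst.begin()); },
        /*warmup_iters=*/10,
        /*timed_iters=*/100);
}
```

Averaging over many iterations after a warm-up makes the number less sensitive to one-off costs such as lazy initialization, which can dominate a single-image measurement.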

warmspringwinds commented 6 years ago

FYI, I have just added a file with a speed benchmark: https://github.com/warmspringwinds/pytorch-cpp/blob/master/examples/resnet_18_8s_benchmark.cpp