ZFTurbo / VGG16-Pretrained-C

Pretrained VGG16 neural net in C language
GNU General Public License v3.0

Number of threads #1

Closed jcanore closed 7 years ago

jcanore commented 7 years ago

Hi, if I increase the number of threads, the inference time increases (it should decrease). I've been reviewing the code and can't find the reason. Do you know why this happens?

Thanks.

ZFTurbo commented 7 years ago

The number of threads should be equal to or lower than the number of cores on your machine.

jcanore commented 7 years ago

Yes, the machine has 2 cores and the inference time is higher with 2 threads than with 1. This is very strange because I'm running the code from the repository without any modification.

Any idea/suggestion?

ZFTurbo commented 7 years ago

1) Check that you compile with OpenMP support (the -fopenmp flag). Also check that the program really uses 2 threads.
2) Try increasing the number of threads to 4.

A slowdown is possible if the overhead of starting several threads is higher than the gain they provide.
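Both checks in point 1 can be done from inside the program. The following is a minimal sketch, not code from this repository: the function names are hypothetical, and it relies only on the standard `_OPENMP` macro and `omp_get_num_threads()`.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if this file was compiled with OpenMP enabled (e.g. -fopenmp), else 0.
   Hypothetical helper, not part of the repository. */
int openmp_enabled(void) {
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Returns the number of threads that actually run inside a parallel region.
   Without OpenMP the pragma block is compiled out and the answer is 1. */
int count_active_threads(void) {
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* Only one thread writes the team size to the shared variable. */
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    return n;
}

/* Prints both facts so they can be compared with what htop shows. */
void report_openmp_status(void) {
    printf("OpenMP compiled in: %s\n", openmp_enabled() ? "yes" : "no");
    printf("Threads in parallel region: %d\n", count_active_threads());
}
```

Calling `report_openmp_status()` at the start of `main` and running with, for example, `OMP_NUM_THREADS=2` should show whether the binary was really built with OpenMP and how many threads a parallel region gets.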

jcanore commented 7 years ago

These are the timings I get using the commands indicated in the repository on a different machine (Intel Core i5-2400 CPU @ 3.10GHz, 8GB RAM):

Reading weights (all layers): 42 seconds
Reading weights (convolution only): 5 seconds
Processing one image (all layers, single core): 19 seconds
Processing one image (all layers, 3 cores): 20 seconds
Processing one image (convolution only, single core): 17 seconds
Processing one image (convolution only, 3 cores): 18 seconds

ZFTurbo commented 7 years ago

Which system? Linux? Windows? Did you check whether the program really uses 3 threads?

jcanore commented 7 years ago

Linux (Ubuntu 14.04). Yes, I checked it with htop.

This is very strange.

ZFTurbo commented 7 years ago

I will check it on Linux as well. I developed it on Windows, actually. ) Will post result ASAP.

jcanore commented 7 years ago

Great, thanks.

ZFTurbo commented 7 years ago

The problem is with the clock() function in multithreaded binaries: https://stackoverflow.com/questions/10874214/measure-execution-time-in-c-openmp-code

It reports an incorrect elapsed time: it actually shows the sum of the times for all threads. I will try to fix it.
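To illustrate the difference, here is a minimal sketch that times the same work with clock() and with a wall-clock timer. It is not the fix that was committed to the repository: the function names are hypothetical, and it assumes `omp_get_wtime()` when OpenMP is enabled and POSIX `clock_gettime(CLOCK_MONOTONIC, ...)` otherwise. On Linux, clock() reports processor time for the whole process, so with several busy threads it grows faster than real elapsed time.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock time in seconds (hypothetical helper). */
double wall_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Processor time in seconds as reported by clock(). */
double cpu_seconds(void) {
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Some CPU-bound work; split across threads when OpenMP is enabled. */
double busy_work(long n) {
    double sum = 0.0;
    long i;
#ifdef _OPENMP
    #pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < n; i++) {
        sum += (double)(i % 7) * 0.5;
    }
    return sum;
}

/* Times busy_work(n) with both timers and prints the two measurements. */
void compare_timers(long n) {
    double w0 = wall_seconds();
    double c0 = cpu_seconds();
    double result = busy_work(n);
    double c1 = cpu_seconds();
    double w1 = wall_seconds();
    printf("result:     %.1f\n", result);
    printf("clock():    %.3f sec\n", c1 - c0);
    printf("wall clock: %.3f sec\n", w1 - w0);
}
```

Calling `compare_timers` with a large `n` from a binary built with -fopenmp and run on several threads should show the clock() figure staying roughly constant or growing while the wall-clock figure drops, which matches the timings reported earlier in this thread.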

ZFTurbo commented 7 years ago

Problem fixed. You can pull from git.

Log:

Using 1 threads
Reading weights: 4.863 sec
Infer image cat.txt: 21.686 sec

Using 31 threads
Reading weights: 5.076 sec
Infer image cat.txt: 2.493 sec

jcanore commented 7 years ago

Yes, now it works. Thanks!

Using 1 threads
Reading weights: 4.352 sec
Infer image cat.txt: 20.440 sec

Using 3 threads
Reading weights: 4.348 sec
Infer image cat.txt: 10.384 sec