atasoyhus / CeNiN

An implementation of the feed-forward phase of deep Convolutional Neural Networks in pure C#
Apache License 2.0

Conv (useCBLAS=true) on multicore CPUs #2

Open Hasan-Torabi opened 3 years ago

Hasan-Torabi commented 3 years ago

Hi Huseyin, is it possible to run Conv (useCBLAS=true) on multicore CPUs to speed up run time?

atasoyhus commented 3 years ago

Hi, the libraries already use the multicore capabilities of the CPU during matrix multiplications. The number of threads is adjusted dynamically by default. You can use CeNiN.CBLAS.setNumThreads() to change it.

Hasan-Torabi commented 3 years ago

Thanks. I am actually working on a crowd-counting project that counts the number of people in an image. My model is in PyTorch and has 19 convolution layers with some ReLU and pooling layers. I converted the model to a CeNiN format file successfully, and when I import that file into the CeNiN_CSharp_Example program it shows "42+2 layers, 21493568 weights and 5889 biases were loaded in 0.808 seconds". The number of weights and biases is correct. I also changed the output layer to match my model, but the result is the same for every input image. I also added the cblas_dgemm function for double-precision operations to increase accuracy, but again the result is the same for every input image. What do you think about this problem? Thanks.
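
One way to narrow down a constant output like this is to compare intermediate activations for two different images and find the first layer where they stop differing. The sketch below is plain Python for illustration only, not CeNiN code. It assumes the per-layer activations can be dumped as flat lists of floats (how to obtain them from CeNiN or PyTorch is not covered here), and the names `max_abs_diff` and `first_collapsed_layer` are hypothetical.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat activation lists."""
    return max(abs(x - y) for x, y in zip(a, b))


def first_collapsed_layer(acts_a, acts_b, tol=1e-6):
    """Return the index of the first layer whose activations are (nearly)
    identical for two different inputs, or None if every layer differs."""
    for i, (a, b) in enumerate(zip(acts_a, acts_b)):
        if max_abs_diff(a, b) <= tol:
            return i
    return None


# Toy data standing in for per-layer activations dumped from two different images.
acts_img1 = [[0.2, 0.9, 0.1], [1.5, 0.0, 0.3], [0.0, 0.0, 0.0]]
acts_img2 = [[0.7, 0.1, 0.4], [0.2, 0.8, 0.0], [0.0, 0.0, 0.0]]

print(first_collapsed_layer(acts_img1, acts_img2))  # prints 2
```

If the reported layer sits well before the output, the problem is more likely in the converted weights or the input preprocessing than in the output layer itself.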

atasoyhus commented 3 years ago

If the output of your network is a number (the number of people), you may need to implement a new output layer in CeNiN for your network architecture. Since I don't know your exact architecture, I can't suggest a specific structure. Please email me for further discussion.
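
As a rough illustration of what such an output step might compute: crowd-counting networks often end in a density map whose cells are summed to give the count. The thread does not state the architecture, so this is an assumption. The sketch is plain Python, not CeNiN code, and `count_from_density_map` is a hypothetical name.

```python
def count_from_density_map(density_map):
    """Estimate a head count by summing every cell of a 2-D density map."""
    return sum(sum(row) for row in density_map)


# Toy 3x3 density map; a real one would come from the network's last layer.
density = [
    [0.0, 0.25, 0.25],
    [0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0],
]

print(count_from_density_map(density))  # prints 2.0
```

If the model instead regresses the count directly, the output step would simply read that single value.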