Open Hasan-Torabi opened 3 years ago
Hi, the libraries already use the multicore capabilities of CPUs during matrix multiplications. The number of threads is adjusted dynamically by default. You can use CeNiN.CBLAS.setNumThreads() to change it.
Thanks. Actually, I am working on a crowd counting project that counts the number of people in an image. My model is in PyTorch and has 19 convolution layers with some ReLU and pooling layers. I converted the model to a CeNiN format file successfully. When I import the CeNiN file in the CeNiN_CSharp_Example program, it shows "42+2 layers, 21493568 weights and 5889 biases were loaded in 0.808 seconds". The numbers of weights and biases are correct. I also changed the output layer based on my model, but the result is the same for every input image. I also added the cblas_dgemm function for double-precision operations to increase accuracy, but again the result is the same for every input image. What do you think about this problem? Thanks.
If the output of your network is a number (the number of people), you may need to implement a new output layer in CeNiN for your network architecture. Since I don't know your exact architecture, I can't suggest a specific structure. Please email me for further discussion.
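For illustration, here is a minimal Python sketch of one way an output layer can return the same result for every image. It assumes, hypothetically, that a classification-style softmax is applied to a final layer that produces a single value; the thread does not confirm that this is the cause here. The `count_output` function and the small density maps are made up for the example and are not part of CeNiN.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def count_output(density_map):
    """Hypothetical regression output layer: sum a 2D map into one count."""
    return sum(sum(row) for row in density_map)


# Softmax over a single value is always 1.0, whatever the network produced.
print(softmax([3.7]))    # [1.0]
print(softmax([125.0]))  # [1.0]

# Summing the raw final-layer values keeps the dependence on the input.
print(count_output([[0.5, 1.0], [0.25, 0.25]]))  # 2.0
print(count_output([[2.0, 1.0], [0.5, 0.5]]))    # 4.0
```

If the raw values of the final convolution layer already differ between images, the output layer is the likely place to look; if they are identical too, the issue is probably earlier, for example in the converted weights or the input preprocessing.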
Hi Huseyin, Is it possible to run Conv (use CBLAS=true) on multicore CPUs to speed up run time?