Closed adhere closed 7 years ago
Hi, first of all, how large is your model? If you sum up the sizes of your network weights, how many MB would that be? If you want to re-implement any part of CNNdroid, it would be better to implement the pooling layer and try to parallelize it in C++. The network layer is just a high-level interface that calls the initialization and computation functions of each layer in the network, so I don't think you would get a significant performance gain by re-implementing it in C++. You can contact me by email if you want more detailed information or have any questions.
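To make the pooling suggestion concrete, here is a minimal sketch of a max-pooling layer parallelized across channels with std::thread. It is not CNNdroid code: the function names (maxPoolChannel, maxPoolParallel), the channel-major row-major layout, the absence of padding, and the per-channel work split are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: max-pool a single channel stored row-major
// (height x width) with a square window and no padding.
static void maxPoolChannel(const float* in, float* out,
                           std::size_t height, std::size_t width,
                           std::size_t window, std::size_t stride) {
    const std::size_t outH = (height - window) / stride + 1;
    const std::size_t outW = (width - window) / stride + 1;
    for (std::size_t oy = 0; oy < outH; ++oy) {
        for (std::size_t ox = 0; ox < outW; ++ox) {
            // Start from the top-left element of the window.
            float best = in[(oy * stride) * width + ox * stride];
            for (std::size_t ky = 0; ky < window; ++ky) {
                for (std::size_t kx = 0; kx < window; ++kx) {
                    const std::size_t y = oy * stride + ky;
                    const std::size_t x = ox * stride + kx;
                    best = std::max(best, in[y * width + x]);
                }
            }
            out[oy * outW + ox] = best;
        }
    }
}

// Hypothetical entry point: input is laid out as [channels][height][width].
// Channels are independent, so each thread takes every numThreads-th channel.
std::vector<float> maxPoolParallel(const std::vector<float>& input,
                                   std::size_t channels,
                                   std::size_t height, std::size_t width,
                                   std::size_t window, std::size_t stride,
                                   unsigned numThreads) {
    assert(window >= 1 && stride >= 1);
    assert(window <= height && window <= width);
    assert(input.size() == channels * height * width);

    const std::size_t outH = (height - window) / stride + 1;
    const std::size_t outW = (width - window) / stride + 1;
    std::vector<float> output(channels * outH * outW);

    numThreads = std::max(1u, numThreads);
    std::vector<std::thread> workers;
    workers.reserve(numThreads);
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            // Each thread writes to a disjoint slice of output, so no locking is needed.
            for (std::size_t c = t; c < channels; c += numThreads) {
                maxPoolChannel(input.data() + c * height * width,
                               output.data() + c * outH * outW,
                               height, width, window, stride);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return output;
}
```

For example, calling maxPoolParallel(input, 2, 4, 4, 2, 2, 2) on a 2-channel 4x4 input with a 2x2 window and stride 2 returns 2 x 2 x 2 = 8 values. Splitting by channel keeps the threads fully independent; whether this actually beats the existing implementation on a given device would need to be measured.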
I tested my model on CNNdroid. One forward pass takes about 900 ms in parallel mode, versus about 40 s in sequential mode. Although parallel mode with RenderScript speeds it up a lot, it is still too slow. My goal is 400 ms per forward pass. So I want to use C++ to implement the network part of CNNdroid (replacing the current Java code). Is it possible to get a 2x speedup this way? All I would do is replace the Java code, because I have no idea how to optimize the RenderScript code, or it may already be too well optimized to improve.
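For reference, the figures quoted above work out as follows (rounded, using only the numbers in this post):

$$
\frac{40\,\text{s}}{0.9\,\text{s}} \approx 44\times \quad \text{(sequential vs. parallel mode)}, \qquad
\frac{900\,\text{ms}}{400\,\text{ms}} = 2.25\times \quad \text{(speedup still needed to reach the goal)}
$$

So an exact 2x speedup would give about 450 ms per forward pass, slightly short of the 400 ms target.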