Open · mkaskov opened this issue 8 years ago
Hi
Yes. In fact, we are 1) developing faster mobile GPU algorithms for currently supported layers, and 2) adding compressed neural networks. You are more than welcome to join the project.
Large models, e.g., GoogLeNet, do not fit in mobile memory and are therefore not part of our development plans, but compressed models, e.g., SqueezeNet, do fit, and we are working on them.
Currently we are focused more on CNNs than RNNs.
Matin
Hi, I'm glad to see it.
Yes, the computation times are about what we have measured as well: roughly 1 second per image.
As I have mentioned, we are adding support for compressed models, e.g., SqueezeNet.
This is great news; it will be interesting to try. Does pruning and/or quantizing the network model give a performance boost? Did you test it? (The preparation scripts produce large *.msg files even from compact models.)
Hi, is there an advantage to using GPU-accelerated CNNdroid over CPU-only TensorFlow? Do you have any idea how much faster it is when running comparable networks?
In the TensorFlow app, Google uses the Inception network. It is one of the best-performing networks relative to its computation cost: close to SqueezeNet, but more accurate. Google also uses pruning and quantization to decrease the computation cost. @matinhashemi said that they are planning to release SqueezeNet; with a pruned and quantized model it may provide a performance boost, maybe...
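For concreteness, magnitude-based pruning just zeroes out the weights with the smallest absolute values; a rough Java sketch of the idea (illustrative only; the class name and threshold are made up, not from CNNdroid or TensorFlow):

```java
// Illustrative sketch of magnitude-based weight pruning (not CNNdroid API).
// Weights whose absolute value falls below `threshold` are zeroed out;
// the resulting sparsity is what a sparse kernel could then exploit.
public final class PruningSketch {
    public static int prune(float[] weights, float threshold) {
        int zeroed = 0;
        for (int i = 0; i < weights.length; i++) {
            if (Math.abs(weights[i]) < threshold) {
                weights[i] = 0f;
                zeroed++;
            }
        }
        return zeroed; // number of pruned connections
    }

    public static void main(String[] args) {
        float[] w = {0.01f, -0.5f, 0.002f, 0.7f, -0.03f};
        int pruned = prune(w, 0.05f);
        System.out.println(pruned + " of " + w.length + " weights pruned");
    }
}
```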
Hi, we have not tested TensorFlow yet.
@mkaskov were you able to run the demo projects on a Nexus 5 without any changes? I am having trouble running them as-is on a Nexus 5 with Android 6.0; I've had to make a lot of changes to the project already.
@shrutisharmavsco I had to make a lot of additions to get the code to run.
Regarding the CNNdroid / Android TensorFlow demo comparison, I did some runtime measurements on an HTC One M9.
As mentioned in your paper, with CNNdroid a forward pass takes about 700 ms for a batch size of 16 images under optimal conditions. For a batch size of one, the runtime is around 1 second for me.
I tested the same AlexNet model by converting it from BVLC Caffe to a TensorFlow .pb file. Running this model instead of the default Inception model, I checked the mean inference time displayed in the TensorFlow demo logcat.
With the default Bazel build, I got a 615 ms inference time, which is better than CNNdroid at any level of GPU usage.
Testing the demo with a Gradle build, I realized that the inference time was much longer, around 1500 ms. In that case (both solutions built with Gradle), CNNdroid is twice as fast as the TensorFlow demo.
I got a similar difference when comparing inference times for the TensorFlow demo with the Inception model built with Bazel versus Gradle (480 ms vs. 990 ms).
I have very little knowledge of Gradle/Bazel. Does it seem surprising to you to observe such a difference? Do you think we could get a similar improvement by building CNNdroid with Bazel?
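For anyone who wants to reproduce these numbers outside the demo logcat, a minimal plain-Java timing harness along these lines would work (`Model` and `runInference` are placeholders I made up for the actual forward pass):

```java
// Minimal harness for averaging forward-pass latency over repeated runs.
// `runInference` is a placeholder for the actual model invocation.
public final class InferenceTimer {
    interface Model { void runInference(); }

    // Returns mean latency in milliseconds over `runs` iterations,
    // after `warmup` untimed iterations to let caches and the JIT settle.
    public static double meanLatencyMs(Model model, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) model.runInference();
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) model.runInference();
        return (System.nanoTime() - start) / 1e6 / runs;
    }

    public static void main(String[] args) {
        Model dummy = () -> { /* forward pass would go here */ };
        System.out.printf("mean latency: %.2f ms%n", meanLatencyMs(dummy, 5, 50));
    }
}
```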
Are quantization and pruning on the roadmap for CNNdroid? A quantized SqueezeNet in CNNdroid would be nice :3
Yes, quantization and pruning are in the pipeline, and we're working to add support for them.
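For context, a common form of weight quantization is linear (min/max) 8-bit quantization; a rough Java sketch of the idea, illustrative only and not the actual CNNdroid implementation:

```java
// Illustrative linear 8-bit weight quantization (not CNNdroid code).
// Each float weight is mapped to an unsigned byte via min/max scaling;
// dequantization approximately reverses the mapping.
public final class QuantizationSketch {
    public static byte[] quantize(float[] w, float[] minMaxOut) {
        float min = Float.POSITIVE_INFINITY, max = Float.NEGATIVE_INFINITY;
        for (float v : w) { min = Math.min(min, v); max = Math.max(max, v); }
        float scale = (max - min) / 255f;
        if (scale == 0f) scale = 1f; // all weights equal; avoid division by zero
        byte[] q = new byte[w.length];
        for (int i = 0; i < w.length; i++) {
            q[i] = (byte) Math.round((w[i] - min) / scale);
        }
        minMaxOut[0] = min; minMaxOut[1] = max;
        return q;
    }

    public static float dequantize(byte q, float min, float max) {
        float scale = (max - min) / 255f;
        return (q & 0xFF) * scale + min;
    }

    public static void main(String[] args) {
        float[] w = {-0.8f, -0.1f, 0.0f, 0.3f, 0.9f};
        float[] mm = new float[2];
        byte[] q = quantize(w, mm);
        for (int i = 0; i < w.length; i++) {
            System.out.printf("%.3f -> %.3f%n", w[i], dequantize(q[i], mm[0], mm[1]));
        }
    }
}
```

This cuts weight storage by 4x on its own; whether it also speeds up inference depends on whether the kernels can operate on the quantized representation directly, as gemmlowp does.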
Roughly how close are you to SqueezeNet support? I am interested in trying it when the time comes.
Sorry for the delayed response. The implementation of version 1 is mostly ready and will hopefully be pushed to the repository next month.
The underlying libraries for TensorFlow are Eigen and gemmlowp. Has any benchmark been done comparing the CPU conv/matmul against the GPU shader version?
Hi, unfortunately we have not done any of the mentioned benchmarks.
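If anyone wants to run such a comparison themselves, a naive CPU matmul baseline to time against a GPU path could look like the sketch below (illustrative Java; unrelated to how Eigen or gemmlowp actually work internally):

```java
// Naive O(m*k*n) matrix multiply, C = A * B, row-major flat arrays.
// Purely a reference baseline for timing comparisons against a GPU kernel.
public final class MatMulBaseline {
    public static float[] matmul(float[] a, float[] b, int m, int k, int n) {
        float[] c = new float[m * n];
        for (int i = 0; i < m; i++) {
            for (int p = 0; p < k; p++) {
                float aip = a[i * k + p];
                for (int j = 0; j < n; j++) {
                    c[i * n + j] += aip * b[p * n + j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int m = 256, k = 256, n = 256;
        float[] a = new float[m * k], b = new float[k * n];
        java.util.Arrays.fill(a, 1f);
        java.util.Arrays.fill(b, 2f);
        long t0 = System.nanoTime();
        float[] c = matmul(a, b, m, k, n);
        System.out.printf("256x256 matmul: %.1f ms (c[0]=%.0f)%n",
                (System.nanoTime() - t0) / 1e6, c[0]);
    }
}
```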
Hello, I was looking forward to SqueezeNet support in CNNdroid; roughly when will it be available?
Hi, it should be ready really soon; we will give you an update about the approximate release time shortly.
Nice project, very interesting. I tested it on a Nexus 5 (Snapdragon 800).
Are you planning to develop the project further?
for example