Closed: badlooop closed this issue 7 years ago
This implementation uses only the CPU, hence the slowness. The proper way would be to convert the CNN layers to use Metal Performance Shaders, which should be at least 10x faster. FYI
@yan7109 I've tested MobileNet with Forge (a Metal-accelerated DL framework). It runs at 30 fps, 2x faster than TF-ios.
Running on the CPU is definitely one reason, but I think unoptimized code is another reason for the speed issue. In "tensorflow_utils.mm", I draw the result box using the DrawBox function from the original TF-ios example. This function is pretty slow because it changes the image pixels in memory. The memcpy call also takes quite a lot of time. This is just an early phase of the project; I will keep optimizing the code and try to move the CNN layers to Metal :) (PS: the runtime is 2 fps on a 7 Plus, but the accuracy is much higher than Google's multibox example.)
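To make the point about box drawing concrete, here is a minimal sketch of drawing a box outline directly into an RGBA frame buffer while touching only the border pixels. This is not the DrawBox function from the TF-ios example; the `Frame` struct, `SetPixel`, and `DrawBoxOutline` are hypothetical names, and the RGBA8 row-major layout is an assumption. It only shows how the number of pixel writes can be kept proportional to the box perimeter, without an extra memcpy of the frame.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical RGBA8 frame (assumption: 4 bytes per pixel, row-major, no padding).
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    Frame(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h) * 4, 0) {}
};

// Overwrite a single pixel in place.
inline void SetPixel(Frame& f, int x, int y, const std::uint8_t rgba[4]) {
    std::size_t offset =
        (static_cast<std::size_t>(y) * static_cast<std::size_t>(f.width) +
         static_cast<std::size_t>(x)) * 4;
    std::copy(rgba, rgba + 4, f.pixels.begin() + static_cast<std::ptrdiff_t>(offset));
}

// Draw a 1-pixel-wide box outline in place and return the number of pixel writes.
// left/top/right/bottom are inclusive and are clamped to the frame bounds.
int DrawBoxOutline(Frame& f, int left, int top, int right, int bottom,
                   const std::uint8_t rgba[4]) {
    if (f.width <= 0 || f.height <= 0) return 0;
    assert(f.pixels.size() ==
           static_cast<std::size_t>(f.width) * static_cast<std::size_t>(f.height) * 4);

    left = std::max(left, 0);
    top = std::max(top, 0);
    right = std::min(right, f.width - 1);
    bottom = std::min(bottom, f.height - 1);
    if (left > right || top > bottom) return 0;

    int writes = 0;
    // Top and bottom edges.
    for (int x = left; x <= right; ++x) {
        SetPixel(f, x, top, rgba);
        ++writes;
        if (bottom != top) {
            SetPixel(f, x, bottom, rgba);
            ++writes;
        }
    }
    // Left and right edges, skipping the corners already written.
    for (int y = top + 1; y < bottom; ++y) {
        SetPixel(f, left, y, rgba);
        ++writes;
        if (right != left) {
            SetPixel(f, right, y, rgba);
            ++writes;
        }
    }
    return writes;
}
```

For a 4x3 box this performs 10 pixel writes, regardless of the frame size. Whether this helps in practice depends on where the time actually goes, so it is worth timing the drawing and the memcpy separately before changing anything.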
Hi JieHe,
Thanks a lot for sharing your awesome work. I built your app following your instructions, but the speed is much slower than I expected.
I am testing on an iPhone 6s Plus, and I get around 1 second per frame. What speed do you get? Do you have any ideas for improving the speed to make it real-time?