Are you going to implement fast convolution algorithms such as Winograd or FFT?

ermig1979 / Synet

A small framework to infer neural network

MIT License

140 stars, 26 forks

Are you going to implement fast convolution algorithms such as Winograd or FFT? #4

Closed szad670401 closed 4 years ago

szad670401 commented 6 years ago

The im2col + GEMM approach may be a little slow.
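For readers unfamiliar with the approach, here is a minimal sketch of im2col + GEMM for a single-channel image with stride 1 and no padding. It is not Synet's code; the function names `im2col` and `conv2d_im2col` are hypothetical, and with a single output channel the GEMM reduces to a matrix-vector product. As in most CNN frameworks, the "convolution" here is a cross-correlation (the kernel is not flipped).

```python
def im2col(image, kh, kw):
    """Unfold a 2D image (list of rows) into a matrix whose rows are
    flattened kh x kw patches (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = []
    for y in range(out_h):
        for x in range(out_w):
            # One row per output pixel, holding the patch it depends on.
            cols.append([image[y + dy][x + dx]
                         for dy in range(kh) for dx in range(kw)])
    return cols, out_h, out_w


def conv2d_im2col(image, kernel):
    """Convolve by multiplying the patch matrix with the flattened kernel."""
    kh, kw = len(kernel), len(kernel[0])
    cols, out_h, out_w = im2col(image, kh, kw)
    flat_kernel = [v for row in kernel for v in row]
    # Matrix-vector product: each patch row dotted with the kernel.
    flat_out = [sum(a * b for a, b in zip(patch, flat_kernel))
                for patch in cols]
    # Fold the flat result back into an out_h x out_w image.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
kernel = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]
result = conv2d_im2col(image, kernel)
```

The patch matrix stores each input pixel up to `kh * kw` times, so the method trades extra memory and memory traffic for the ability to reuse a tuned GEMM routine.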

ermig1979 commented 6 years ago

Hello. I plan to make convolution layer inference as fast as possible, so I will try to implement these methods once I finish my current tasks on making Synet compatible with Caffe and TensorFlow models.

szad670401 commented 6 years ago

Thank you. I found that the OpenCV dnn module has implemented a TensorFlow importer; you can refer to its design. However, the dnn module can't compute in low precision, so it runs a little slowly on ARM. I very much look forward to you working on low-precision computing.

ermig1979 commented 6 years ago

I have seen this importer. It was useful for understanding the inner structure of TensorFlow. Unfortunately, it is not enough to convert complicated models.

szad670401 commented 5 years ago

Yes, TensorFlow has too many ops. Some Chinese companies have open-sourced small mobile CNN inference frameworks. I hope their code will be helpful to you.

References: 1. Mace 2. NCNN 3. FeatherCNN

ermig1979 commented 5 years ago

Thank you for the information. I also want to note that I have implemented Winograd convolution for 3x3 kernels with a 2x2 window. It improves performance in some cases.
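To illustrate the idea behind Winograd convolution, here is a sketch of the one-dimensional case F(2, 3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. It is not Synet's implementation, and the function names are hypothetical. The 2D variant for a 3x3 kernel and a 2x2 output tile nests the same transforms along rows and columns, which gives 16 multiplications instead of 36 per tile.

```python
def winograd_f2_3(d, g):
    """Two outputs of a 3-tap filter g over 4 inputs d, using 4 multiplies."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform: depends only on g, so it can be precomputed once.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Input transform followed by the 4 element-wise multiplications.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f2_3(d, g):
    """Reference: the same two outputs computed directly with 6 multiplies."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


d = [1.0, 2.0, 3.0, 4.0]
g = [1.0, 0.0, -1.0]
fast = winograd_f2_3(d, g)
slow = direct_f2_3(d, g)
```

The saving in multiplications is paid for with extra additions and some loss of numerical precision, which is consistent with the gain appearing only in some cases.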

szad670401 commented 5 years ago

Great work. I see you have implemented most of the common layers. Do you plan to publish benchmarks comparing Synet with other frameworks?

ermig1979 commented 5 years ago

I periodically compare the performance of Synet against Darknet, Caffe, TensorFlow, and OpenCV dnn. Synet contains tests that compare Synet with Darknet. The other comparison tests are not ready for publication yet; I will try to prepare them. Right now I am experimenting with the image format: in some cases NHWC is faster than NCHW.
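As a reminder of what the two layouts mean, the sketch below computes the flat memory offset of element (n, c, h, w) in each of them. The helper names are hypothetical and the code is not taken from Synet. The thread does not say why NHWC is sometimes faster; one commonly cited factor is that all channels of a pixel are adjacent in NHWC, whereas in NCHW they are H * W elements apart.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat offset in NCHW: width varies fastest, then height, then channel."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat offset in NHWC: channel varies fastest, then width, then height."""
    return ((n * H + h) * W + w) * C + c


C, H, W = 3, 2, 2
# Distance in memory between channel 0 and channel 1 of the same pixel.
stride_nchw = offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W)
stride_nhwc = offset_nhwc(0, 1, 0, 0, C, H, W) - offset_nhwc(0, 0, 0, 0, C, H, W)
```

Here `stride_nchw` equals H * W and `stride_nhwc` equals 1, so a kernel that sums over channels reads contiguous memory in NHWC.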

ermig1979 commented 4 years ago

Winograd algorithms for 3x3 kernels with 2x2, 3x3, and 4x4 windows have been implemented.