Shift CNN increased inference time.

gudovskiy / ShiftCNN

A script to convert floating-point CNN models into generalized low-precision ShiftCNN representation


Shift CNN increased inference time. #1

Closed: michaelholm-ce closed this issue 7 years ago

michaelholm-ce commented 7 years ago

After applying the ShiftCNN code to my caffemodel, inference time went from 49 ms/image to 77 ms/image. I must be misunderstanding how this code is intended to be used, as I expected inference time to decrease, not increase. Any direction here?

gudovskiy commented 7 years ago

Michael, the code just decreases the precision of the .caffemodel and emulates that decreased precision in the conventional floating-point format. So there should be no change in inference time when it is executed on a GPU in floating point.
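As a rough illustration of what "emulating" reduced precision in floating point can mean, here is a minimal sketch. It is not the repository's script: the function names, the exponent range (2^-1 down to 2^-7), the greedy rounding, and the two-term sum are all assumptions chosen for the example. Each weight is replaced by a sum of power-of-two terms, but the result is still stored as 32-bit floats, so the file size and the floating-point arithmetic at inference are unchanged.

```python
import math
from array import array

def nearest_pow2_term(x, min_exp=-7, max_exp=-1):
    """Return the value in {0, +/-2**e : min_exp <= e <= max_exp} nearest to x.

    The exponent range is a hypothetical codebook chosen for this sketch.
    """
    if x == 0.0:
        return 0.0
    candidates = [0.0] + [math.copysign(2.0 ** e, x)
                          for e in range(min_exp, max_exp + 1)]
    return min(candidates, key=lambda c: abs(x - c))

def emulate_shift_quant(weights, n_terms=2):
    """Greedily approximate each weight by a sum of n_terms power-of-two terms.

    The output is still a float32 array: the precision is reduced,
    but the storage format is not.
    """
    out = []
    for w in weights:
        approx = 0.0
        for _ in range(n_terms):
            # Quantize the remaining residual and add it to the running sum.
            approx += nearest_pow2_term(w - approx)
        out.append(approx)
    return array('f', out)

weights = array('f', [0.30, -0.11, 0.72, 0.004])
quantized = emulate_shift_quant(weights)

print(list(quantized))                       # [0.3125, -0.109375, 0.75, 0.0078125]
print(weights.itemsize, quantized.itemsize)  # both are 4 bytes per weight
```

Because both arrays use 4 bytes per weight, a framework that loads them runs the same floating-point kernels in either case, which is consistent with seeing no speed-up.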

michaelholm-ce commented 7 years ago

@gudovskiy Thank you. So, it looks like to obtain the speed-up, I need to implement the paper from scratch. Is that correct?

gudovskiy commented 7 years ago

@michaelholm-ce It is going to be hard to obtain a speed-up on GPUs because of their inefficient support for shift ops. At the same time, for purely binarized networks there is a workaround to increase speed, described in https://arxiv.org/pdf/1602.02830.pdf . So this paper calls for new architectures, which are not here yet.
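For context on what a "shift op" replaces, the sketch below multiplies a fixed-point activation by a weight that is a sum of two powers of two, using only integer shifts and an addition. The helper name `shift_multiply`, the 8 fractional bits, and the example weight are assumptions for illustration and are not taken from the repository or the paper.

```python
def shift_multiply(x_fixed, exponent, negative=False):
    """Multiply an integer by +/-2**exponent using only shifts.

    Note: a right shift floors the result, so negative inputs round
    toward minus infinity rather than toward zero.
    """
    if exponent >= 0:
        shifted = x_fixed << exponent
    else:
        shifted = x_fixed >> -exponent
    return -shifted if negative else shifted

FRAC_BITS = 8                              # hypothetical fixed-point format
x = int(round(1.5 * (1 << FRAC_BITS)))     # activation 1.5 stored as 384

# Weight 0.3125 = 2**-2 + 2**-4, so the product needs two shifts and one add.
y = shift_multiply(x, -2) + shift_multiply(x, -4)

print(y / (1 << FRAC_BITS))                # 0.46875, which equals 1.5 * 0.3125
```

Hardware that executes such shifts and adds cheaply could skip the multiplier entirely; a GPU running floating-point kernels gains nothing from this form.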

michaelholm-ce commented 7 years ago

@gudovskiy I see, thank you. How might one go about training a ShiftCNN network in CPU-only mode to compare inference timing on a CPU?

ananddb90 commented 7 years ago

Thank you for your work. I am a little confused.

If it generates a low-precision caffemodel, why does the file size remain the same?
To make use of the low-precision caffemodel, how should I proceed, given that my hardware supports up to 4 bits? Thank you.

gudovskiy commented 7 years ago

@ananddb90,

The script just emulates low precision within Caffe. If you want the low-precision weights themselves, you need to save them into a separate file.
Does your hardware support shifts of integers?
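To make the size question concrete, here is a minimal sketch of storing power-of-two weights as 4-bit codes in a separate byte string. Everything in it is hypothetical: the code layout (bit 3 for the sign, bits 0 to 2 for the shift amount k, with 0 meaning a zero weight) and the function names are assumptions for the example, not the format used by the script or the paper.

```python
import math

def encode(value):
    """Map 0 or +/-2**-k (k = 1..7) to a 4-bit code: bit 3 = sign, bits 0-2 = k."""
    if value == 0.0:
        return 0
    mantissa, exp = math.frexp(abs(value))   # abs(value) == mantissa * 2**exp
    k = 1 - exp
    if mantissa != 0.5 or not 1 <= k <= 7:
        raise ValueError(f"{value} is not in the hypothetical codebook")
    return k | (8 if value < 0 else 0)

def decode(code):
    """Inverse of encode: recover the float value from a 4-bit code."""
    k = code & 7
    if k == 0:
        return 0.0
    magnitude = 2.0 ** -k
    return -magnitude if code & 8 else magnitude

def pack(codes):
    """Pack two 4-bit codes per byte, padding with a zero code if needed."""
    padded = list(codes) + [0] * (len(codes) % 2)
    return bytes((hi << 4) | lo for hi, lo in zip(padded[0::2], padded[1::2]))

def unpack(data, count):
    """Split each byte back into two 4-bit codes and drop any padding."""
    codes = []
    for byte in data:
        codes.extend((byte >> 4, byte & 0x0F))
    return codes[:count]

weights = [0.25, -0.125, 0.5, 0.0078125, 0.0, -0.5]
packed = pack([encode(w) for w in weights])
restored = [decode(c) for c in unpack(packed, len(weights))]

print(len(packed), "bytes packed vs", 4 * len(weights), "bytes as float32")
print(restored == weights)
```

The `packed` bytes could then be written to a file of their own with `open(path, "wb")`. The caffemodel itself would still hold 32-bit floats, which is why its size does not change.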

ananddb90 commented 7 years ago

@gudovskiy Thank you for your reply.

I didn't understand exactly: if the script enables low precision within Caffe, will the weights and gradients also be computed in low precision? If so, the saved caffemodel should also store low-precision weights and biases. Please correct me if my understanding is wrong, as I do not have a deep understanding of low precision.
Yes, I am trying this on a Drive PX2.

gudovskiy commented 7 years ago

@ananddb90

It doesn't enable low precision; it only emulates it in the original floating-point format.
For Drive PX2 you can use Nvidia's software with 8-bit fixed-point. For lower precision, you need to implement the layers yourself.

ananddb90 commented 7 years ago

@gudovskiy Would you mind sending me your email address at ananddb90@gmail.com? I would like to ask some very specific questions about my thesis task. It would be really helpful if you could guide me. Thank you.