Closed by michaelholm-ce 7 years ago
Michael, the code just decreases the precision of the .caffemodel and emulates that decreased precision in conventional floating-point format. So, there should be no change in inference time when executed on a GPU in floating point.
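To make that concrete, here is a minimal NumPy sketch (my own illustration, not the repository's code) of what "emulating decreased precision in floating point" means: each weight is rounded to the nearest signed power of two, but the result is still stored as float32, so the forward pass itself runs exactly as before. The exponent range (`min_exp`/`max_exp`) is an arbitrary choice for the example.

```python
import numpy as np

def quantize_to_power_of_two(w, min_exp=-7, max_exp=0):
    """Round each nonzero weight to the nearest signed power of two.

    The result is written back as float32, so a standard floating-point
    forward pass runs at the same speed as before -- only the weight
    values have lost precision.
    """
    w = np.asarray(w, dtype=np.float32)
    out = np.zeros_like(w)
    nonzero = w != 0
    # nearest power-of-two exponent, clipped to the representable range
    exp = np.clip(np.round(np.log2(np.abs(w[nonzero]))), min_exp, max_exp)
    out[nonzero] = np.sign(w[nonzero]) * np.exp2(exp)
    return out

if __name__ == "__main__":
    w = (np.random.randn(6) * 0.1).astype(np.float32)
    print("original :", w)
    print("quantized:", quantize_to_power_of_two(w))
```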
@gudovskiy Thank you. So, it looks like to obtain the speed-up, I need to implement the paper from scratch. Is that correct?
@michaelholm-ce It is going to be hard to obtain a speed-up on GPUs due to their inefficient support of shift ops. At the same time, for pure binarized networks there is a workaround to increase speed, described in https://arxiv.org/pdf/1602.02830.pdf . So, this paper calls for new architectures which are not here yet.
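For intuition on where the savings would come from on hardware that supports it: multiplying an integer activation by a power-of-two weight reduces to a bit shift, so no multiplier is needed. A toy sketch (illustrative only, not code from the paper):

```python
def multiply_by_pot_weight(x, exponent, negative=False):
    """Multiply an integer activation x by a weight of the form
    (+/-)2**exponent using only a shift and an optional negation."""
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return -shifted if negative else shifted

# sanity checks
assert multiply_by_pot_weight(12, 3) == 12 * 8              # weight = 2**3
assert multiply_by_pot_weight(12, -2) == 12 // 4            # weight = 2**-2 (arithmetic shift)
assert multiply_by_pot_weight(12, 1, negative=True) == -24  # weight = -2**1
```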
@gudovskiy I see, thank you. How might one go about training a ShiftCNN network in CPU-only mode to compare inference timing on the CPU?
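In case it helps, here is a sketch of how CPU inference timing could be measured with pycaffe (the file names and the `data` blob name are placeholders/assumptions about the model):

```python
import time
import numpy as np
import caffe

caffe.set_mode_cpu()
# placeholder paths -- use your own deploy prototxt and (quantized) caffemodel
net = caffe.Net('deploy.prototxt', 'model.caffemodel', caffe.TEST)

# assumes the input blob is called 'data'
net.blobs['data'].data[...] = np.random.rand(*net.blobs['data'].data.shape)

net.forward()                      # warm-up run
n_runs = 20
t0 = time.time()
for _ in range(n_runs):
    net.forward()
print('CPU inference: %.1f ms per forward pass'
      % ((time.time() - t0) / n_runs * 1000.0))
```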
Thank you for your work. I have a small point of confusion.
@ananddb90,
@gudovskiy Thank you for your reply.
I didn't understand exactly: if the script enables low precision within Caffe, will the weights and gradients also be computed in low precision? In that case, the saved caffemodel should also store low-precision weights/biases. Please correct me if my understanding is wrong, as I do not have a deep understanding of low precision.
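One way to check what is actually stored (a pycaffe sketch; the file names are placeholders) is to load the converted .caffemodel and test whether every nonzero weight is an exact power of two. The values are low precision, but they still sit in ordinary float32 blobs:

```python
import numpy as np
import caffe

# placeholder paths -- point these at the converted model
net = caffe.Net('deploy.prototxt', 'converted.caffemodel', caffe.TEST)

for name, params in net.params.items():
    w = params[0].data.ravel()        # layer weights (params[1] would be biases)
    nonzero = w[w != 0]
    exps = np.log2(np.abs(nonzero))
    is_pot = bool(np.allclose(exps, np.round(exps)))
    print('%-20s power-of-two weights only: %s' % (name, is_pot))
```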
Yes, I am trying it on a Drive PX2.
@ananddb90
@gudovskiy Would you mind sending me your email address at ananddb90@gmail.com? I would like to ask some very specific questions about my thesis task. It would be really helpful if you could guide me. Thank you.
After applying the ShiftCNN code to my caffemodel, inference time went from 49 ms/image to 77 ms/image. I must be missing how this code is intended to be used, as I anticipated a potential decrease in inference time, not an increase. Any direction here?