ycjing opened this issue 4 years ago (status: Open)
The inference speed is only increased on particular accelerators that support bit-wise operations. It does not, of course, make training or inference of the model faster on GPUs.
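To make that concrete, here is a minimal sketch (illustrative only, not XNOR-Net's code) of what "bit-wise operations" buys: a dot product of two {-1, +1} vectors packed into a 64-bit word can be computed with one XNOR and one popcount instead of 64 multiply-adds, which is exactly the pattern such hardware accelerates.

```python
# Illustrative sketch only (not the repo's implementation): dot product of two
# {-1, +1} vectors via XNOR + popcount on a packed 64-bit word.
import numpy as np

def pack_bits(signs):
    """Pack a length-64 {-1, +1} vector into one 64-bit integer (+1 -> bit set)."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word

def binary_dot(word_a, word_b, n=64):
    """matches = popcount(xnor(a, b)); dot = matches - mismatches = 2*matches - n."""
    xnor = ~(word_a ^ word_b) & ((1 << n) - 1)
    matches = bin(xnor).count("1")
    return 2 * matches - n

rng = np.random.default_rng(0)
a = rng.choice([-1, 1], size=64)
b = rng.choice([-1, 1], size=64)

# One XNOR + one popcount replaces 64 multiply-adds.
assert binary_dot(pack_bits(a), pack_bits(b)) == int(a @ b)
```

Without word-level XNOR/popcount kernels (e.g., in standard GPU float convolutions), this packing gives no benefit, which is the point of the comment above.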
Hi @resurgo97
Thank you for the response! I appreciate it. Yes, I understand that the speed improvement can only be achieved on devices that support bit-wise operations. But I am wondering how the theoretical figures (i.e., the 58x speedup and the 32x memory saving) are derived. Thank you!
Best, Yongcheng
Hi @mrastegari
Thank you for the nice paper and code! The work is really impressive. I have a question about the paper. As mentioned in issue #11, Torch does not support bit-wise operations. But the paper states: "This results in 58× faster convolutional operations (in terms of number of the high precision operations) and 32× memory savings". I would appreciate it if you could explain how these values (i.e., 58x and 32x) are computed. Thank you! Wish you all the best!
Best, Yongcheng
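For what it is worth, here is a rough back-of-the-envelope reading of those two numbers. This is only my interpretation of the paper's efficiency argument, not an authoritative derivation: the exact 58x depends on the channel and filter sizes plugged into the accounting, while the 32x memory saving simply comes from storing a 1-bit binary weight instead of a 32-bit float.

```python
# Back-of-the-envelope only -- my reading of the efficiency argument, not the
# paper's exact accounting. Example sizes (c channels, 3x3 filter) are assumed.
c, n_w, word = 256, 3 * 3, 64      # input channels, filter elements, bits per machine word

full_precision_ops = c * n_w       # multiply-adds per output element in a float convolution
binary_ops = c * n_w / word        # XNOR+popcount word ops after packing 64 bits per word
scaling_ops = 1                    # remaining high-precision work (scaling factor) per output

speedup = full_precision_ops / (binary_ops + scaling_ops)
memory_saving = 32 / 1             # 32-bit float weight -> 1-bit binary weight

print(f"speedup ~{speedup:.1f}x, memory saving {memory_saving:.0f}x")
# -> speedup ~62.3x, memory saving 32x with these assumed sizes; the published
#    58x presumably corresponds to the sizes used in the paper's own accounting.
```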