Hi there,
I have some questions about your paper “ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks”.
1. Why do you use more than one codebook? In some of the codebooks, many of the codewords are the same.
For example, with the setting N=2, B=4:
C1 = {0, ±2^-1, ±2^-2, ..., ±2^-7}
C2 = {0, ±2^0, ±2^-1, ..., ±2^-6}
Most of the entries are the same. Doesn't this increase the computational cost? Could we just enlarge the index range instead of adding another codebook? What was your consideration behind this setting?
2. With your setting of bitwidth B=4, shouldn't the codebook have just four elements, such as {0, 2^-1, 2^-2, 2^-3}?
I also noticed that your setting is the same as in INQ (Incremental Network Quantization). I am confused by this; could you give me some clue?
The index range can be increased as well, but at the expense of a larger memory size. That's why it is better to just keep an offset for each layer and save on index size.
B=4 means up to 2^4-1=15 entries, not four. Yes, this work is complementary to INQ, but it concentrates on a method without retraining and focuses on the hardware architecture.
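To make this concrete, here is a rough Python sketch (not the actual ShiftCNN implementation, just an illustration; the greedy residual matching, the helper names, and the per-layer normalization assumption are mine) of how the N=2, B=4 codebooks above can be built and how a normalized weight is approximated as a sum of one codeword per codebook:

```python
# Minimal sketch of power-of-two codebooks for N=2, B=4 (assumed scheme, not reference code).
import numpy as np

def make_codebook(shift, bitwidth):
    """Codebook {0, +/-2^-shift, +/-2^-(shift+1), ...} with 2^bitwidth - 1 entries."""
    n_pairs = (2 ** bitwidth - 2) // 2                 # 7 signed pairs for bitwidth=4
    mags = [2.0 ** -(shift + k) for k in range(n_pairs)]
    return np.array([0.0] + [s * m for m in mags for s in (+1.0, -1.0)])

def quantize_weight(w, codebooks):
    """Greedy residual matching: approximate w as a sum of one codeword per codebook."""
    approx, residual = 0.0, w
    for cb in codebooks:
        idx = int(np.argmin(np.abs(cb - residual)))    # nearest codeword to the residual
        approx += cb[idx]
        residual = w - approx
    return approx

# N = 2 codebooks, B = 4 bits each, as in the question.
C1 = make_codebook(shift=1, bitwidth=4)   # {0, +/-2^-1, ..., +/-2^-7}, 15 entries
C2 = make_codebook(shift=0, bitwidth=4)   # {0, +/-2^0,  ..., +/-2^-6}, 15 entries

w = 0.37                                  # weight assumed already normalized per layer
print(quantize_weight(w, [C1, C2]))       # 0.25 + 0.125 = 0.375
```

The overlap between C1 and C2 costs nothing at inference time, since every codeword is still just a sign and a bit-shift; what matters for hardware is the per-codebook index width (B bits), which is why the entries are kept small and a per-layer offset handles the dynamic range.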