AojunZhou / Incremental-Network-Quantization

Caffe Implementation for Incremental network quantization

Does n-bit quantization with a larger n make sense in your method? #32

Open XiangyuWu opened 6 years ago

XiangyuWu commented 6 years ago

Hi, Zhou!

First of all, thanks for your work.

While looking into your project, I wondered whether it is actually useful to spend more bits on quantization. A policy that quantizes real-valued weights into powers of 2 means that, when more bits are used, most of the extra quantization levels lie close to 0. Say n1 = -1 is set; then the largest quantized magnitude is 0.5, followed by 0.25, 0.125, 0.0625, ... in order. A larger b therefore seems to only add levels for very small values. Are those values decisive for the model's performance?
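For concreteness, here is a minimal sketch of that level set (my own illustration, not code from this repository): with n1 = -1, every additional magnitude is just the next smaller power of 2.

```cpp
// Minimal sketch (not the repository's code): enumerate the power-of-two
// magnitudes 2^n1, 2^(n1-1), ... used by an INQ-style level set.
#include <cmath>
#include <cstdio>

int main() {
  const int n1 = -1;        // largest exponent, so the max magnitude is 2^-1 = 0.5
  const int num_levels = 4; // how many magnitudes to list (illustrative)
  for (int k = 0; k < num_levels; ++k) {
    std::printf("2^%d = %g\n", n1 - k, std::pow(2.0, n1 - k));
  }
  // Prints 0.5, 0.25, 0.125, 0.0625: every extra bit only adds levels
  // with smaller magnitude, clustered near zero.
  return 0;
}
```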

BTW, how should n1, n2, and b be set in your project?

Any reply is appreciated!

mdamircoder commented 6 years ago

@XiangyuWu I think you should use a histogram plot, or dump the weights, to check the weight distribution (w.r.t. your model and dataset), and then choose n1 accordingly. Once you have chosen n1, it gives you n2.
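As one concrete way to do that, here is a minimal sketch (my own illustration, not this repository's code) that derives n1 from the largest absolute weight using the heuristic I recall from the INQ paper, n1 = floor(log2(4s/3)), and then derives n2 from n1 and the bit width b. Treat the exact formulas as assumptions based on my reading of the paper, not as what this implementation does.

```cpp
// Minimal sketch (not the repository's code) of picking n1/n2 from the
// weight distribution: s is the largest absolute weight, b the total bits
// (with one bit assumed to be reserved for the zero level).
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  std::vector<float> weights = {0.31f, -0.07f, 0.002f, -0.45f, 0.12f};
  float s = 0.f;
  for (float w : weights) s = std::max(s, std::fabs(w));

  const int b  = 5;                                        // total bit width
  const int n1 = static_cast<int>(std::floor(std::log2(4.0 * s / 3.0)));
  const int n2 = n1 + 1 - (1 << (b - 1)) / 2;              // smallest exponent

  std::printf("s = %g, n1 = %d, n2 = %d\n", s, n1, n2);
  return 0;
}
```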

XiangyuWu commented 6 years ago

@mdamirsohail13 thank you. After reading the code, I have solved the problem.

mdamircoder commented 6 years ago

I think blob.cpp and power2.cpp are the two files that have been modified/added. Have you gone through any other code apart from these two?

mdamircoder commented 6 years ago

Can you tell me what modifications the authors made to re-train only the non-quantized weights using mask_vector?

I mean, if you could point out the relevant files.

XiangyuWu commented 6 years ago

@mdamirsohail13 I just made some modifications to blob.cpp and power2.cpp. As for the mask question, the BP procedure will not update the masked weights. You can refer to #10.
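For readers wondering what "BP will not update the masked weights" looks like in code, here is a minimal sketch of a masked SGD update (my own illustration, not the repository's solver code). I assume mask[i] == 1 means "still full precision, keep training" and mask[i] == 0 means "already quantized, frozen"; the repository may use the opposite convention.

```cpp
// Minimal sketch (not the repository's code) of a masked weight update.
// Assumed convention: mask[i] == 1 -> trainable, mask[i] == 0 -> quantized/frozen.
#include <cstddef>

void masked_sgd_update(float* weights, const float* grads, const float* mask,
                       std::size_t n, float lr) {
  for (std::size_t i = 0; i < n; ++i) {
    // Multiplying the gradient by the mask zeroes the update for frozen
    // (quantized) weights, so only the non-quantized part is re-trained.
    weights[i] -= lr * grads[i] * mask[i];
  }
}
```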

zzqiuzz commented 6 years ago

@XiangyuWu Hi! I don't think it works in the same way as Figure 2 in the paper shows. For example, weights that were quantized and fixed in a previous iteration are refreshed in a new iteration by https://github.com/Zhouaojun/Incremental-Network-Quantization/blob/c7f6a609d5817d8424ce224209cf4c50f1e4de50/src/caffe/blob.cpp#L525 , so quantized weights from the previous iteration are not fixed any more. What do you think about it?

XiangyuWu commented 6 years ago

Sorry, I have not read the paper carefully.

I think the quantized weights are not fixed throughout the whole process. For example, at the first stage the largest 30% of the weights are quantized; at the second stage the largest 60% (rather than only the 30%-60% band) are quantized; at the third stage the largest 80%; and finally all weights are quantized.

And for your purpose, setting mask_vec[i] = 1 for those data_vec[i] that have been quantized may help, since it would prevent the quantized weights from being updated at a later stage. But I don't think it will improve the performance.
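To make the cumulative partition concrete, here is a minimal sketch (my own illustration, not the repository's code) that computes the magnitude threshold for each of the 30% / 60% / 80% / 100% stages mentioned above; each stage's selection contains the previous one.

```cpp
// Minimal sketch (not the repository's code) of a cumulative weight partition:
// at each stage the largest 30% / 60% / 80% / 100% of weights by magnitude
// are selected for quantization.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Returns the magnitude threshold such that `fraction` of the weights
// satisfy |w| >= threshold.
float partition_threshold(std::vector<float> w, float fraction) {
  for (float& v : w) v = std::fabs(v);
  std::sort(w.begin(), w.end(), std::greater<float>());
  std::size_t k = static_cast<std::size_t>(fraction * w.size());
  if (k == 0) return w.front() + 1.f;  // nothing selected at this stage
  return w[std::min(k, w.size()) - 1];
}

int main() {
  std::vector<float> weights = {0.31f, -0.07f, 0.002f, -0.45f, 0.12f,
                                0.09f, -0.26f, 0.018f, 0.5f, -0.11f};
  for (float frac : {0.3f, 0.6f, 0.8f, 1.0f}) {
    float t = partition_threshold(weights, frac);
    std::printf("stage %.0f%%: quantize every |w| >= %g\n", 100 * frac, t);
  }
  return 0;
}
```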

zzqiuzz commented 6 years ago

@XiangyuWu Hi! I think they do stay fixed, because of https://github.com/Zhouaojun/Incremental-Network-Quantization/blob/c7f6a609d5817d8424ce224209cf4c50f1e4de50/src/caffe/util/power2.cpp#L8
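One possible reading of why re-applying the quantization does not disturb already-quantized weights: rounding to the nearest power of two is idempotent, so a weight that already sits on the grid (and whose update is blocked by the mask) maps back to itself. Below is a minimal sketch of that property (my own illustration, not the repository's power2.cpp).

```cpp
// Minimal sketch (not the repository's power2.cpp): power-of-two rounding is
// idempotent, so re-quantizing an already-quantized weight leaves it unchanged.
#include <algorithm>
#include <cassert>
#include <cmath>

float quantize_pow2(float w, int n1, int n2) {
  if (w == 0.f) return 0.f;
  float sign = w < 0.f ? -1.f : 1.f;
  float a = std::fabs(w);
  if (a < std::pow(2.f, n2 - 1)) return 0.f;            // too small for the grid
  int e = static_cast<int>(std::round(std::log2(a)));   // round exponent in log scale
  e = std::max(n2, std::min(n1, e));                     // clamp to [n2, n1]
  return sign * std::pow(2.f, e);
}

int main() {
  const int n1 = -1, n2 = -8;
  float q = quantize_pow2(0.37f, n1, n2);  // first quantization -> 0.5
  assert(q == quantize_pow2(q, n1, n2));   // re-quantizing leaves it unchanged
  return 0;
}
```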

mdamircoder commented 6 years ago

@zzqiuzz How have the authors modified the code so that the weights are quantized only once, at the beginning? Or can you mention the files where they made major changes, apart from blob.cpp and power2.cpp?