Open jang0977 opened 3 years ago
Hi,
Thanks for your interest in this work.
The sensitivity measure, as well as the genetic algorithm, will be released in the future; we are still cleaning up that code.
Regarding your question about 2-bit permutations, consider an example of a block with 3 layers. You first check 3 + 3 per-layer configurations for 4-bit and 8-bit. Then, for 2-bit quantization, you evaluate 3 possibilities with one layer quantized to 2-bit, 3 possibilities with two layers quantized to 2-bit, and finally 1 possibility with all 3 layers quantized to 2-bit, i.e. 7 configurations in total.
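The counting in the reply above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repo: it just enumerates every non-empty subset of layers assigned to 2-bit for a 3-layer block, which reproduces the 3 + 3 + 1 breakdown.

```python
from itertools import combinations

num_layers = 3  # layers in the example block
layers = list(range(num_layers))

# Per-layer checks for 4-bit and 8-bit: 3 + 3 = 6 configurations.
per_layer_checks = 2 * num_layers

# 2-bit configurations: every non-empty subset of layers set to 2-bit,
# i.e. C(3,1) + C(3,2) + C(3,3) = 3 + 3 + 1 = 7 subsets.
two_bit_configs = []
for k in range(1, num_layers + 1):
    two_bit_configs.extend(combinations(layers, k))

print(per_layer_checks)       # 6
print(len(two_bit_configs))   # 7
```

In general this gives 2^n - 1 two-bit configurations for a block of n layers, which matches the 3/3/1 split described above for n = 3.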
Thank you for your reply. I have another question: do I have to use the pretrained models from your link to test this repo? If possible, I would like to use some of the pretrained models that PyTorch provides, like ResNet and MobileNet.
p.s. I noticed that you added some extra ReLU layers in the basic block and bottleneck. If you do not mind, could you tell me the reason for this?
Hi. First of all, thanks for your work — a very interesting paper. Could you please let us know whether you still plan to publish the code for mixed-precision search with the genetic algorithm and the Hessian off-diagonal loss? Thanks.
Has this not been released yet?
Hi, this is nice code for quantization. However, I have some questions about the sensitivity measurement and the genetic algorithm.
In the paper, you state that per-layer sensitivity can be measured with the diagonal loss, while the off-diagonal loss captures cross-layer sensitivity. However, I could not find such loss terms in the code in this repo. If you do not mind, please let me know where those terms are implemented.
I also found it quite interesting that a genetic algorithm is used to find the optimal bitwidth configuration for each block. But, as with the sensitivity loss, I could not find the code for this algorithm in the repo. Again, if you do not mind, please let me know where it is.
By the way, I am really impressed with your paper and the code for this new quantization method. Hope you have a nice day. Thank you!
p.s. When you mentioned examining permutations, you said there would be 3^n permutations, where n is the number of layers in a block and 3 is the number of bit candidates (2, 4, 8). However, according to your paper, since the performance drop with 4-bit and 8-bit quantization is negligible, only the 2-bit permutations are considered. If only 2-bit permutations are considered, shouldn't it be 1^n permutations per block? I am a bit confused by this part.