Open Planck35 opened 5 years ago
@planck35 By computation do you mean the amount of computation (i.e. the number of floating-point operations)? If so, then no: the amount of computation would be roughly the same after pruning; it should not increase.
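To make the point above concrete, here is a minimal sketch (using NumPy as a stand-in for the actual framework) comparing the FLOP count of a dense layer before and after mask-based pruning. The shapes and the 90% sparsity level are just illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((1, 512)).astype(np.float32)   # input activations
w = rng.standard_normal((512, 512)).astype(np.float32) # weight matrix
# Mask-based pruning: zero out ~90% of the weights.
pruned = np.where(rng.random(w.shape) < 0.9, 0.0, w).astype(np.float32)

# A dense GEMM performs 2*m*k*n FLOPs regardless of how many entries
# are zero -- the hardware still multiplies by the zeros.
dense_flops = 2 * x.shape[0] * w.shape[0] * w.shape[1]

# Only a dedicated sparse kernel skips the zeros; its work scales
# with the number of nonzero weights instead.
nnz = np.count_nonzero(pruned)
sparse_flops = 2 * x.shape[0] * nnz

print(dense_flops, sparse_flops)
```

So with an ordinary dense kernel the FLOP count (and hence the speed) is essentially unchanged by masking; a speedup would require a sparse kernel or structured pruning.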
@larry0123du Hi, I used the code on my model, but the pruned model's size is exactly the same as the model size before pruning. What is the reason for this?
The reason is that the weights are simply set to zero, but a zero is still stored as a full floating-point number. So in essence, as long as the shapes of the matrices are unchanged, your model will not change in size. In the original paper, Han et al. supplement pruning with a Huffman encoding scheme, which reduces the stored model size, if I remember correctly.
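A quick NumPy sketch of the explanation above: a masked weight matrix occupies exactly as many bytes as the original, while a sparse encoding (values plus indices, roughly what CSR stores) is where the actual savings come from. The matrix size and sparsity level here are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

# Prune ~90% of the entries by masking them to zero.
mask = rng.random(w.shape) < 0.9
pruned = np.where(mask, 0.0, w).astype(np.float32)

# Dense storage is unchanged: a zero float32 still occupies 4 bytes.
print(w.nbytes, pruned.nbytes)  # identical

# A sparse encoding stores only the nonzeros plus their indices,
# so its size scales with the number of surviving weights.
nnz = np.count_nonzero(pruned)
sparse_bytes = nnz * (4 + 4)  # rough estimate: 4-byte value + 4-byte index
print(sparse_bytes < pruned.nbytes)
```

This is also why a compression pass (Huffman coding in the paper, or even plain zip) shrinks the saved file: long runs of zeros compress extremely well even though the in-memory dense tensor does not.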
@larry0123du OK, thanks!
@larry0123du Hello, if the weights are simply set to zero, will the inference speed increase?
You did a very nice implementation, but I want to ask about the weights that get masked to zero.
Does the computation still process them even though their values are zero, or is the computation speed unaffected?