Automatically select the relative quantization precision of each layer, i.e., whether to use fp16, int8, or as little as one bit per layer.
Not all layers have the same distribution of floating-point values, so the network can be significantly more sensitive to quantization in some layers than in others.
First-order gradients are not enough to measure this sensitivity, so the second-order Hessian is used: the top Hessian eigenvalue \lambda of a layer indicates how sensitive that layer is to quantization. Taking the layer size into account as well (number of parameters, denoted n), the sensitivity of a layer is defined as \lambda / n.
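As a toy illustration with made-up numbers: a large layer with \lambda = 60 and n = 3,000,000 parameters gets sensitivity 60 / 3e6 = 2e-5, while a small layer with \lambda = 4 and n = 10,000 gets 4 / 1e4 = 4e-4, so the small layer is ranked as more sensitive and should be kept at higher precision despite its smaller eigenvalue.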
The Hessian information is computed on the pre-trained network with a matrix-free power iteration algorithm: it only needs Hessian-vector products and avoids explicitly forming the Hessian, which would be far too large to materialize.
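A minimal sketch of matrix-free power iteration via Hessian-vector products with PyTorch autograd, not the paper's released code; the model, loss, data batch, and iteration/tolerance settings are placeholder assumptions.

```python
import torch

def top_hessian_eigenvalue(loss, params, iters=50, tol=1e-4):
    """Power iteration on the Hessian of `loss` w.r.t. `params` (a list of tensors).
    Only Hessian-vector products are used; the Hessian is never formed explicitly."""
    # First-order gradients, keeping the graph so we can differentiate a second time.
    grads = torch.autograd.grad(loss, params, create_graph=True)

    # Random unit-norm starting vector with the same shapes as the parameters.
    v = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum((x * x).sum() for x in v))
    v = [x / norm for x in v]

    eigenvalue = None
    for _ in range(iters):
        # Hessian-vector product: differentiate (grad . v) w.r.t. the parameters.
        gv = sum((g * x).sum() for g, x in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)

        # Rayleigh quotient v^T H v as the current top-eigenvalue estimate (v is unit norm).
        new_eig = sum((h * x).sum() for h, x in zip(hv, v)).item()

        # Normalize Hv to obtain the next iterate.
        norm = torch.sqrt(sum((h * h).sum() for h in hv))
        v = [h / (norm + 1e-12) for h in hv]

        if eigenvalue is not None and abs(new_eig - eigenvalue) / (abs(eigenvalue) + 1e-12) < tol:
            eigenvalue = new_eig
            break
        eigenvalue = new_eig
    return eigenvalue

# Hypothetical per-layer usage, following the notes' sensitivity metric \lambda / n:
# loss = loss_fn(model(x), y)
# lam = top_hessian_eigenvalue(loss, list(layer.parameters()))
# n = sum(p.numel() for p in layer.parameters())
# sensitivity = lam / n
```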
Multi-stage fine-tuning: define an order in which layers are re-trained by sorting them in descending order of the product of \lambda and the weight difference caused by quantization (how far the quantized weights move from the original ones).
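A small sketch (my own illustration, not the paper's code) of deriving such a fine-tuning order, assuming the "weight difference" is measured as the squared L2 norm ||Q(W) - W||^2; the layer dictionaries, their `lam` values, and the quantized weights are placeholders.

```python
import torch

def finetune_order(layers):
    """Sort layers for multi-stage fine-tuning.

    `layers` is a list of dicts {"name": str, "lam": float, "w": Tensor, "w_quant": Tensor},
    where `lam` is the layer's top Hessian eigenvalue and `w_quant` its quantized weights.
    Returns layer names in descending order of lam * ||Q(W) - W||^2 (squared L2 is an assumption).
    """
    def score(layer):
        perturbation = (layer["w_quant"] - layer["w"]).pow(2).sum().item()
        return layer["lam"] * perturbation

    return [layer["name"] for layer in sorted(layers, key=score, reverse=True)]

# Hypothetical usage: the most sensitive / most perturbed layers are fine-tuned first.
# order = finetune_order(layer_stats)
```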
ICCV 2019
Contribution
Hessian-aware automatic selection of each layer's quantization precision (mixed precision across layers, e.g., fp16, int8, or lower bit-widths).
A multi-stage fine-tuning scheme with a Hessian-guided fine-tuning order for re-training.