mkimhi closed this issue 2 years ago
Hello @mkimhi!
There's a manual initialization of quantization for this.
Please refer to the mobilenet_v2_imagenet_mixed_int_manual.json as a reference example.
There are 3 key things: 1) a list of bitwidths per operation scope:
"compression": {
    "algorithm": "quantization",
    "initializer": {
        "precision": {
            "type": "manual",
            "bitwidth_per_scope": [
                [4, "SqueezeNet/Sequential[classifier]/NNCFConv2d[1]/conv2d_0|WEIGHT"],
                ...
You can set an arbitrary bitwidth instead of 4 here. This list is dumped to bitwidth_per_scope.json in the log directory for the mixed-precision (HAWQ) case only: mobilenet_v2_imagenet_mixed_int_hawq.json
If this list is needed even for the fully INT8 quantization scenario, we could add an option to dump it for that case as well. Please let us know.
2) the target device should be TRIAL if you want to set an arbitrary bitwidth. The CPU device doesn't support INT4 at all; the VPU one does support INT4, but with some hardware-specific constraints:
"target_device": "TRIAL",
3) load the pre-trained model checkpoint via the --weights option, not the --resume one.
The first option assumes initialization of quantization (bitwidths and quantization ranges); the second restores training without any initialization.
It's optional to change other quantization parameters for the TRIAL device:
"algorithm": "quantization",
"weights": {
    "mode": "asymmetric",
    "per_channel": true,
    "bits": 4
},
"activations": {
    "mode": "asymmetric"
},
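Putting the three points together, a complete config for this flow might look like the sketch below, built as a Python dict and serialized to JSON. The scope string is copied from the example above; treat the exact schema as an assumption to be checked against the reference configs shipped with NNCF (e.g. mobilenet_v2_imagenet_mixed_int_manual.json).

```python
import json

# Sketch of a config combining points 1-3 from this thread; field names
# follow the fragments quoted above, not a verified NNCF schema.
config = {
    "target_device": "TRIAL",  # required for arbitrary bitwidths (point 2)
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "precision": {
                "type": "manual",
                # Scope string copied from the example above (point 1).
                "bitwidth_per_scope": [
                    [4, "SqueezeNet/Sequential[classifier]/NNCFConv2d[1]/conv2d_0|WEIGHT"],
                ],
            }
        },
        "weights": {"mode": "asymmetric", "per_channel": True, "bits": 4},
        "activations": {"mode": "asymmetric"},
    },
}

print(json.dumps(config, indent=2))
```

With a config like this on disk, load the pre-trained checkpoint via --weights (not --resume, per point 3) so that quantization ranges get initialized rather than restored.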
Thank you Nikolay,
my desire is to train a network with a QAT scheme, and then change a specific layer's bit width and fine-tune the model a little bit more.
Thank you
In that case, #1136 can be helpful. bitwidth_per_scope is printed to the console in debug mode starting from it:
import logging

from nncf.common.utils.logger import set_log_level

set_log_level(logging.DEBUG)
@mkimhi please let us know whether it helps to achieve your goals and whether we can close the issue
Thanks again, I figured out a solution for my usage:
for name, m in model.named_modules():
    if name == desired_layer:
        m.num_bits = desired_bits
Am I missing recalculating the scaling when I do that?
@mkimhi, no, the quantization scale stays the same for all bit widths; no need to recalculate it. The actual quantization of values takes num_bits into account and changes the total number of quantized values (e.g. 256 for INT8, 16 for INT4) and the interval between them.
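As an illustration of that point (a toy model, not NNCF's actual quantizer), here is a signed symmetric fake-quantizer where the scale is fixed and only num_bits varies; changing the bit count changes the density of the grid, not its range:

```python
def fake_quantize(x, scale, num_bits):
    """Toy signed symmetric quantizer: the scale is fixed,
    num_bits only changes how many grid points fit in the range."""
    level_low = -(2 ** (num_bits - 1))     # e.g. -128 for 8 bits
    level_high = 2 ** (num_bits - 1) - 1   # e.g. 127 for 8 bits
    q = round(x / scale * level_high)
    q = max(level_low, min(level_high, q))
    return q * scale / level_high

x8 = fake_quantize(0.3, 1.0, 8)  # snapped to a 256-level grid
x4 = fake_quantize(0.3, 1.0, 4)  # snapped to a 16-level grid
print(x8, x4)
```

Both calls use the same scale; only the rounding error differs, which is why no scale recalculation is needed when num_bits changes.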
Then, for some reason, high_level gets the value 0 when I'm trying to go to binarization (desired_bits=1 in the code snippet above).
I would be very grateful for help solving this issue.
Thank you!
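One hypothesis for the high_level = 0 you observe (inferred from the usual signed symmetric level computation, not traced through NNCF internals): with num_bits = 1 the highest signed level is 2**(num_bits - 1) - 1 = 0, so the positive side of the grid vanishes. A quick check:

```python
def signed_levels(num_bits):
    # Standard signed symmetric integer range for a given bit count.
    level_low = -(2 ** (num_bits - 1))
    level_high = 2 ** (num_bits - 1) - 1
    return level_low, level_high

for bits in (8, 4, 2, 1):
    print(bits, signed_levels(bits))
```

If that is indeed the cause, num_bits = 1 simply falls outside what a signed symmetric quantizer can represent, and NNCF's dedicated binarization algorithm (if available in your version) may be the intended route rather than forcing num_bits = 1.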
I would like to set a specific bitwidth for a specific layer, post-quantization. Is there a way to set the bitwidth directly? I don't want to copy the weight dict and set up a new quantization with a config.
Thanks