Oneflow-Inc / oneflow

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
http://www.oneflow.org
Apache License 2.0

Further concern about oneflow.nn.FakeQuantization #8685

Open xxxyyyzzz12345 opened 1 year ago

xxxyyyzzz12345 commented 1 year ago

But when the input falls outside the range [quant_min, quant_max], shouldn't the gradient be 0.0 instead of 1.0? The following code snippet uses a configuration for which quant_min and quant_max are both 0, and feeds in single-element input tensors ranging from -20 to 19 with a step size of 1:

import oneflow

quantization_formula = 'google'
quantization_bit = 1
quantization_scheme = 'symmetric'

for x in range(-20, 20):
    input1 = oneflow.tensor([x], dtype=oneflow.float64, requires_grad=True)  # input to be fake-quantized
    input2 = oneflow.tensor([1], dtype=oneflow.float64, requires_grad=True)  # scale
    input3 = oneflow.tensor([0], dtype=oneflow.float64, requires_grad=True)  # zero_point
    mod = oneflow.nn.FakeQuantization(quantization_formula, quantization_bit, quantization_scheme)
    output = mod(input1, input2, input3)
    output.backward()
    print(output)
    print(input1.grad)

Both the outputs (-1, ..., -1, 0, ..., 0) and gradients (1., 1., 1., ...) seem to be incorrect.

Originally posted by @xxxyyyzzz12345 in https://github.com/Oneflow-Inc/oneflow/issues/8649#issuecomment-1187825063

Ldpe2G commented 1 year ago

The oneflow.nn.FakeQuantization module should be used together with oneflow.nn.MinMaxObserver or oneflow.nn.MovingAverageMinMaxObserver, which compute the scale and zero_point it consumes.

For example:

import oneflow as flow

quantization_formula = 'google'
quantization_bit = 1
quantization_scheme = 'affine'
# 'symmetric': quantize to signed integers; can only be used when quantization_bit >= 2
# 'affine': quantize to unsigned integers

input1 = flow.rand(20, dtype=flow.float64, requires_grad=True)
min_max_observer = flow.nn.MinMaxObserver(
    quantization_formula=quantization_formula,
    quantization_bit=quantization_bit,
    quantization_scheme=quantization_scheme,
)
(scale, zero_point) = min_max_observer(input1)
mod = flow.nn.FakeQuantization(quantization_formula, quantization_bit, quantization_scheme)
output = mod(input1, scale, zero_point)
output.sum().backward()
print(output)
print(input1.grad)
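
For the MovingAverageMinMaxObserver path, a rough sketch follows; the training, stop_update_after_iters and momentum arguments and the extra current-train-step input are taken from the docs example I recall, so please double-check them against your OneFlow version:

import oneflow as flow

quantization_formula = 'google'
quantization_bit = 8
quantization_scheme = 'symmetric'

input1 = flow.rand(20, dtype=flow.float32, requires_grad=True)
# the moving-average observer is assumed to also take the current training step
current_train_step = flow.tensor([0], dtype=flow.int64)
observer = flow.nn.MovingAverageMinMaxObserver(
    training=True,
    quantization_formula=quantization_formula,
    stop_update_after_iters=1,
    quantization_bit=quantization_bit,
    quantization_scheme=quantization_scheme,
    momentum=0.95,
)
(scale, zero_point) = observer(input1, current_train_step)
mod = flow.nn.FakeQuantization(quantization_formula, quantization_bit, quantization_scheme)
output = mod(input1, scale, zero_point)
output.sum().backward()
print(output)
print(input1.grad)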

If you want to do 1-bit quantization-aware training, you should set quantization_scheme='affine' so that unsigned integer quantization is used.

Also, the quantization-aware training modules in OneFlow were actually designed for 8-bit quantization and are not fully tested for lower bit widths.

As for the gradient issue (why it is always one): we followed the approach in the paper Quantizing deep convolutional networks for efficient inference: A whitepaper, which uses a straight-through estimator in the backward pass; see Section 2.4 for more details.
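
To sketch that idea, here is a minimal NumPy illustration of the straight-through behavior (not OneFlow's actual kernel; the integer bounds -1 and 0 are chosen only to reproduce the 1-bit outputs reported above): the forward pass quantizes, clamps and dequantizes, while the backward pass treats that whole step as the identity, so the gradient with respect to the input is just the upstream gradient, i.e. all ones for a sum loss.

import numpy as np

def fake_quantize_forward(x, scale, zero_point, quant_min, quant_max):
    # quantize, clamp to the integer range, then dequantize
    q = np.clip(np.round(x / scale + zero_point), quant_min, quant_max)
    return (q - zero_point) * scale

def fake_quantize_backward_ste(upstream_grad):
    # straight-through estimator: the quantize/dequantize step is treated as
    # the identity in the backward pass, so the gradient passes through unchanged
    return upstream_grad

x = np.arange(-20.0, 20.0)
y = fake_quantize_forward(x, scale=1.0, zero_point=0.0, quant_min=-1, quant_max=0)
grad_x = fake_quantize_backward_ste(np.ones_like(x))  # gradient of y.sum() w.r.t. x under STE
print(y)
print(grad_x)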

And as I said, the quantization-aware training functionality of OneFlow is experimental for now; feel free to try it in actual training tasks.

We Appreciate Your Feedback 😄