google / qkeras

QKeras: a quantization deep learning library for Tensorflow Keras
Apache License 2.0

max method of quantized_bits returns incorrect values #71

Open kantic opened 3 years ago

kantic commented 3 years ago

Hi,

I've noticed that the max method of quantized_bits doesn't return the correct value. According to the documentation, max should return the largest value that the quantizer can represent. With an unsigned 8-bit quantizer with zero integer bits, the value 1.0 is correctly quantized to 0.99609375, which is the largest number that can be represented in this configuration. However, the max method returns 1.0.

Minimal example:

import qkeras as qk

quantizer = qk.quantized_bits(8, 0, 0, False)
x = 1.0
xq = quantizer(x)
q_max = quantizer.max()
print('x: {0}, xq: {1}, q_max: {2}'.format(x, xq, q_max))

Output:

x: 1.0, xq: 0.99609375, q_max: 1.0

Expected Output:

x: 1.0, xq: 0.99609375, q_max: 0.99609375
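
For reference, the expected maximum follows from the bit layout alone. The following is a minimal sketch in plain Python that does not call QKeras. The helper name `fixed_point_max` is hypothetical, and it assumes an ordinary fixed-point format in which a signed quantizer spends one bit on the sign; it is not a description of how QKeras computes max internally.

```python
def fixed_point_max(bits, integer, keep_negative):
    """Largest value of a plain fixed-point format.

    Hypothetical helper, not part of the QKeras API. Assumes `bits` total
    bits, `integer` bits left of the binary point, and one bit reserved
    for the sign when keep_negative is True.
    """
    magnitude_bits = bits - 1 if keep_negative else bits
    fractional_bits = magnitude_bits - integer
    step = 2.0 ** -fractional_bits  # smallest representable increment
    return 2.0 ** integer - step    # one step below the next power of two


# Same configuration as quantized_bits(8, 0, 0, False) in the example above
print(fixed_point_max(8, 0, False))  # 0.99609375
```

Under these assumptions the largest representable value is one step below 1.0, which matches the quantized output xq rather than the value reported by max.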

kantic commented 3 years ago

More generally, I found that the quantized values do not seem to be strictly limited to the value range implied by the number of bits specified for the quantizer.

Another example is quantized_sigmoid with 8 bits in total. In this configuration, I would expect the output of quantized_sigmoid to lie in the interval [0.0, 1-(2**-8)] = [0.0, 0.99609375] with a resolution (step size) of 2**-8 = 0.00390625, because that is what 8 bits can represent. However, the following code example shows that the value range of quantized_sigmoid is actually [0.0, 1.0]:

import qkeras as qk
import tensorflow as tf

qs = qk.quantized_sigmoid(8)
input_value = tf.constant(-1000.0)
output = qs(input_value).numpy()
print('Sigmoid Lower Bound: {0}'.format(output))

input_value = tf.constant(1000.0)
output = qs(input_value).numpy()
print('Sigmoid Upper Bound: {0}'.format(output))

input_value = tf.constant(-0.996)
output = qs(input_value).numpy()
print('Sigmoid Resolution: {0}'.format(output))

Output:

Sigmoid Lower Bound: 0.0
Sigmoid Upper Bound: 1.0
Sigmoid Resolution: 0.00390625

Expected Output:

Sigmoid Lower Bound: 0.0
Sigmoid Upper Bound: 0.99609375
Sigmoid Resolution: 0.00390625

The min and max methods also report the interval boundaries as [0.0, 1.0]:

print(qs.min())
print(qs.max())

Output:

0.0
1.0

Expected Output:

0.0
0.99609375

As a result, in these edge cases the quantization methods output values that cannot be represented with the specified number of bits. Encoding the current behaviour of quantized_sigmoid in hardware would require 9 bits in total and a special (and inefficient) encoding scheme in which the upper boundary of 1.0 is also representable. Similar considerations apply to the other quantization methods. Am I missing some details here? Is this the intended quantization scheme in QKeras?
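
The bit-count argument above can be checked with a short calculation. This is a sketch in plain Python that does not call QKeras; `bits_needed` is a hypothetical helper that counts the levels of a uniform grid whose two endpoints must both be representable, and the number of bits needed to index them.

```python
def bits_needed(lower, upper, step):
    """Levels on the grid lower, lower + step, ..., upper and bits to index them.

    Hypothetical helper for illustration only, not part of the QKeras API.
    """
    levels = round((upper - lower) / step) + 1
    # Bits needed to give each level a distinct unsigned code 0 .. levels - 1
    return levels, (levels - 1).bit_length()


step = 2.0 ** -8
print(bits_needed(0.0, 1.0 - step, step))  # (256, 8)
print(bits_needed(0.0, 1.0, step))         # (257, 9)
```

Including 1.0 adds a single extra level to the 256 that 8 bits can index, and that one level is what pushes the requirement to 9 bits.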