This pull request introduces some additional rounding modes and provides a table that more accurately describes their behavior. Concretely, the following table has been added to docs/qonnx-custom-ops/quant_op.md:

The newly introduced rounding modes are: UP, DOWN, HALF_UP, and HALF_DOWN. They were inspired by the rounding modes in the Java math library (https://docs.oracle.com/javase/8/docs/api/java/math/RoundingMode.html) and by the implementation in the Chisel dsptools library (https://github.com/ucb-bar/dsptools/blob/master/src/main/scala/dsptools/numbers/chisel_types/FixedPointTypeClass.scala#L156).
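As a rough illustration of the behavior the table describes, the four new modes can be expressed with numpy as follows; this is a sketch of the intended semantics, not necessarily the exact code added to quant.py:

```python
import numpy as np

# Sketch of the intended semantics (following the Java RoundingMode definitions):
# UP / DOWN round away from / towards zero, HALF_UP / HALF_DOWN break ties
# away from / towards zero.
modes = {
    "UP": lambda x: np.sign(x) * np.ceil(np.abs(x)),
    "DOWN": lambda x: np.sign(x) * np.floor(np.abs(x)),  # truncation, like np.fix
    "HALF_UP": lambda x: np.sign(x) * np.floor(np.abs(x) + 0.5),
    "HALF_DOWN": lambda x: np.sign(x) * np.ceil(np.abs(x) - 0.5),
}

x = np.array([-2.5, -1.1, 1.1, 2.5])
for name, fn in modes.items():
    print(f"{name:9s} {fn(x)}")
# UP        [-3. -2.  2.  3.]
# DOWN      [-2. -1.  1.  2.]
# HALF_UP   [-3. -1.  1.  3.]
# HALF_DOWN [-2. -1.  1.  2.]
```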
This partially solves the issue of incompatibility between a high-level Python implementation and a circuit implementation. For instance, consider the following test function for QKeras (v0.9.0):

The function above will fail on the second assert. However, the scaling factors printed in the finally block will be 1, [1,1,1] and [1,1,1]. The reason is that when "auto_po2" is used, the rounding mode is actually "round half up". This can be seen at: https://github.com/google/qkeras/blob/67e7c6b8cbd6befd594f142187ac4b73b35512ac/qkeras/quantizers.py#L570C45-L570C46
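The discrepancy is easy to show in isolation: numpy's np.round breaks ties towards the nearest even value, whereas "round half up" breaks them upwards, so a power-of-two exponent that lands exactly on a tie comes out differently. The round_half_up helper and the example value below are purely illustrative:

```python
import numpy as np

def round_half_up(x):
    # Ties go up, unlike np.round, which rounds ties to the nearest even value.
    return np.floor(np.asarray(x) + 0.5)

exp = 0.5  # e.g. log2 of a scaling factor landing exactly on a tie
print(np.round(exp))       # 0.0 -> scale 2**0 = 1
print(round_half_up(exp))  # 1.0 -> scale 2**1 = 2
```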
This pull request does the following:
- Adds an implementation of the rounding modes to the resolve_rounding_mode function in src/qonnx/custom_op/general/quant.py.
- Adds a simple test to check the implementation of the rounding modes in tests/custom_op/test_rounding_mode.py (a sketch of what such a check might look like follows this list).
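A check along these lines could look roughly as follows; this is a hypothetical sketch that assumes resolve_rounding_mode takes the mode string and returns a numpy-style rounding function, and it is not necessarily the test that was added:

```python
import numpy as np
import pytest
from qonnx.custom_op.general.quant import resolve_rounding_mode

# Expected values follow the Java RoundingMode definitions referenced above.
@pytest.mark.parametrize(
    "mode, expected",
    [
        ("UP", [-3.0, -2.0, 2.0, 3.0]),
        ("DOWN", [-2.0, -1.0, 1.0, 2.0]),
        ("HALF_UP", [-3.0, -1.0, 1.0, 3.0]),
        ("HALF_DOWN", [-2.0, -1.0, 1.0, 2.0]),
    ],
)
def test_rounding_modes(mode, expected):
    x = np.array([-2.5, -1.1, 1.1, 2.5])
    round_fn = resolve_rounding_mode(mode)
    assert np.array_equal(round_fn(x), np.array(expected))
```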
The request does NOT do the following:
- It does not fix the QKeras/Brevitas converters.
I refrained from updating the converters firstly because I don't know the code base very well, and secondly because the tests seem to be written with assert_allclose, i.e. they check for approximate compatibility. Issues with rounding modes can be quite subtle, so they would be hard to catch with approximate comparisons.
I have had success making a bit-accurate conversion between QKeras and circuits in chisel4ml after I introduced precise rounding modes. However, this only works when all tensors have a known quantization and the scaling factors are powers of two. Looking at the qonnx code base, I have a hard time seeing how the input quantization is specified. In chisel4ml, for instance, this is done directly, as shown below:
```python
import numpy as np
import qkeras
import tensorflow as tf

# The input activations are explicitly quantized to a signed 4-bit format,
# so every tensor in the model has a known quantization.
x = x_in = tf.keras.layers.Input(shape=(3,))
x = qkeras.QActivation(
    qkeras.quantized_bits(bits=4, integer=3, keep_negative=True)
)(x)
x = qkeras.QDense(
    4,
    kernel_quantizer=qkeras.quantized_bits(
        bits=4, integer=3, keep_negative=True, alpha=np.array([0.5, 0.25, 1, 0.25])
    ),
)(x)
x = qkeras.QActivation(qkeras.quantized_relu(bits=3, integer=3))(x)
x = qkeras.QDense(
    1,
    kernel_quantizer=qkeras.quantized_bits(
        bits=4, integer=3, keep_negative=True, alpha=np.array([0.125])
    ),
)(x)
x = qkeras.QActivation(qkeras.quantized_relu(bits=3, integer=3))(x)
model = tf.keras.Model(inputs=[x_in], outputs=[x])
```
This means that the inputs must be quantized to a signed 4-bit integer. I realize that qonnx targets a larger set of neural network descriptions; however, I believe it would be useful to make a distinction for these kinds of networks (this paper calls them dyadic neural networks: https://arxiv.org/abs/2011.10680), as:
- they are highly efficient to implement in hardware, and
- I believe they can be "simulated" with bit-level accuracy using floating-point operations.
I have only shown this bit-level accuracy empirically; however, considering the way floating point is specified (with a power-of-two exponent), the equivalence should hold as long as the quantized values do not exceed what the mantissa/fraction field can represent exactly. And if they do, you can move to 64-bit floating-point numbers, for example.
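A hedged illustration of this reasoning (the bit widths and constants below are my own, not taken from the pull request): power-of-two scales and small integers are exact in binary floating point, so the emulation only breaks once intermediate integers no longer fit in the significand.

```python
import numpy as np

# Integer values of a signed 4-bit tensor and a power-of-two scale: both are
# exact in float32, so the scaled values are exact as well.
ints = np.arange(-8, 8, dtype=np.int32)
scale = np.float32(2.0 ** -3)
assert np.array_equal(ints.astype(np.float32) * scale, ints * 2.0 ** -3)

# float32 has a 24-bit significand: 2**24 is still exact, 2**24 + 1 is not.
assert float(np.float32(2 ** 24)) == 2 ** 24
assert float(np.float32(2 ** 24 + 1)) != 2 ** 24 + 1
```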
I am closing this pull request, as it has several features jumbled into it. I will open several new pull requests for the separate pieces of functionality added here.