Optimal `frac_bits` calculation

What is the goal?

- Use one fixed bit width (`total_bits`) for all values in a model.
- Calculate the number of fractional bits (`frac_bits`) from the minimum and maximum values that have to be represented.

Mathematical relationship

Given a fixed total bit width $b_{total}$ and a variable bit width for the fractional part $b_{fractional}$, let $Q$ be the set of all quantized values representable with the given $b_{total}$ and $b_{fractional}$.

Then the following holds:

$$\min(Q) = -\frac{2^{b_{total} - 1}}{2^{b_{fractional}}} = -2^{b_{total} - b_{fractional} - 1}$$

$$\max(Q) = \frac{2^{b_{total}} - 1}{2^{b_{fractional}}}$$

This leads to the following optimal frac_bits calculation for a given input tensor $T$ and fixed $b_{total}$.

If $|\min(T)| \geq |\max(T)|$:

$$b_{fractional} = \text{clamp}(b_{total} - \lfloor \log_2(|\min(T)|) \rceil - 1, 0, b_{total} - 1)$$

Else, if $|\min(T)| < |\max(T)|$:

$$b_{fractional} = \text{clamp}\left(\left\lfloor \log_2\left(\frac{2^{b_{total}} - 1}{\max(T)}\right) \right\rceil, 0, b_{total} - 1\right)$$

This calculation yields a pair ($b_{total}$, $b_{fractional}$) that can be used to update the parameters of the arithmetics and thereby automatically optimize the fixed-point representation.
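As a rough illustration, the formulas above can be transcribed into plain Python. This is only a sketch of the formulas as written in this issue, which are still unverified. The function names, the use of a plain list instead of a tensor, the reading of $\lfloor \cdot \rceil$ as rounding to the nearest integer (ties rounded up), and the handling of an all-zero input are assumptions made for the example.

```python
import math


def representable_range(total_bits: int, frac_bits: int) -> tuple[float, float]:
    """Return (min(Q), max(Q)) exactly as written in this issue (unverified)."""
    scale = 2**frac_bits
    return -(2 ** (total_bits - 1)) / scale, (2**total_bits - 1) / scale


def round_to_nearest(x: float) -> int:
    # Assumption: the rounding brackets mean round-to-nearest, ties rounded up.
    return math.floor(x + 0.5)


def optimal_frac_bits(values: list[float], total_bits: int) -> int:
    """Transcription of the two-case frac_bits formula from this issue."""
    t_min, t_max = min(values), max(values)
    upper = total_bits - 1
    if abs(t_min) >= abs(t_max):
        if t_min == 0:
            # Assumption: for an all-zero input log2 is undefined, so fall
            # back to the largest allowed number of fractional bits.
            return upper
        raw = total_bits - round_to_nearest(math.log2(abs(t_min))) - 1
    else:
        # In this branch max(T) is strictly positive, so the division is safe.
        raw = round_to_nearest(math.log2((2**total_bits - 1) / t_max))
    # clamp(raw, 0, total_bits - 1)
    return max(0, min(raw, upper))
```

For example, `optimal_frac_bits([-4.0, 1.0], 8)` evaluates to 5, and with 5 fractional bits the $\min(Q)$ formula gives exactly -4. For `[-1.0, 3.0]` the second case yields 6 fractional bits. A signed 8-bit two's complement value with 6 fractional bits cannot exceed 127/64, so the $\max(Q)$ expression, which matches an unsigned range, is one of the points worth checking when the formulas are verified.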

Required code structure

A general code structure that lets an `Arithmetics` object adapt its parameters to the input values of its `quantize` function (not limited to fixed point):

```mermaid
classDiagram

class Sequential
class CreatorLayer {
    +register_arithmetics(arithmetics : Arithmetics)
}
class Arithmetics {
    +quantize(inputs : Tensor) Tensor
}
class ConcreteArithmetics {
    -quantization_params
}

Sequential "1" o-- "*" CreatorLayer
Sequential "1" *-- "1" Arithmetics

Arithmetics <|.. ConcreteArithmetics
CreatorLayer "1" *-- "1" Arithmetics

note for Sequential "for layer in layers:\nlayer.register_arithmetics(global_arithmetics)"
note for ConcreteArithmetics "updates the parameters according to the inputs of the quantize function"
```
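The structure in the diagram could look roughly like this in Python. This is a minimal sketch, not the library's actual API. The `Tensor` stand-in (a plain list), the `forward` methods, the constructor signatures, and the injected `choose_frac_bits` callable are hypothetical and only serve to show how a shared arithmetics object gets registered and updates its parameters inside `quantize`. `ConcreteArithmetics` stands for the concrete implementation in the diagram.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional

Tensor = list[float]  # stand-in for a real tensor type (assumption)


class Arithmetics(ABC):
    @abstractmethod
    def quantize(self, inputs: Tensor) -> Tensor: ...


class ConcreteArithmetics(Arithmetics):
    """Fixed-point arithmetics that re-derives frac_bits on every call."""

    def __init__(self, total_bits: int,
                 choose_frac_bits: Callable[[Tensor, int], int]) -> None:
        self.total_bits = total_bits
        self.frac_bits = 0  # the adaptable quantization parameter
        self._choose_frac_bits = choose_frac_bits  # hypothetical update rule

    def quantize(self, inputs: Tensor) -> Tensor:
        # Update the parameters according to the inputs, then quantize.
        self.frac_bits = self._choose_frac_bits(inputs, self.total_bits)
        scale = 2**self.frac_bits
        # Saturation to the representable range is omitted for brevity.
        return [round(x * scale) / scale for x in inputs]


class CreatorLayer:
    def __init__(self) -> None:
        self.arithmetics: Optional[Arithmetics] = None

    def register_arithmetics(self, arithmetics: Arithmetics) -> None:
        self.arithmetics = arithmetics

    def forward(self, inputs: Tensor) -> Tensor:
        if self.arithmetics is None:
            raise RuntimeError("no arithmetics registered for this layer")
        return self.arithmetics.quantize(inputs)


class Sequential:
    def __init__(self, layers: list[CreatorLayer],
                 arithmetics: Arithmetics) -> None:
        self.layers = list(layers)
        self.arithmetics = arithmetics
        # Every layer receives the same global arithmetics object.
        for layer in self.layers:
            layer.register_arithmetics(arithmetics)

    def forward(self, inputs: Tensor) -> Tensor:
        for layer in self.layers:
            inputs = layer.forward(inputs)
        return inputs
```

In this sketch a layer that was never registered raises an error in `forward`, which makes the open question about layers outside a `Sequential` visible rather than answering it.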

Problems

How should the arithmetics of a `CreatorLayer` be initialized?
- In particular, what happens if the layer is not part of a `Sequential`?

julianhoever commented 1 year ago

These are my first thoughts on how to approach this issue. The formulas in the mathematical section may be wrong; I have to verify them next week (this is just a first shot). The proposed architecture is also a bit problematic. @glencoe, maybe you can have a look at it?
