es-ude / elastic-ai.creator


Optimal `frac_bits` calculation for fixed point arithmetics #308

Open julianhoever opened 1 year ago

julianhoever commented 1 year ago

Optimal frac_bits calculation

What is the goal?

Mathematical relationship

Numbers are represented with a fixed total bit width ($b_{total}$) and a variable bit width for the fractional part ($b_{fractional}$). $Q$ is the set of all quantized values for a given $b_{total}$ and $b_{fractional}$.

Then the following holds:

$$\min(Q) = -\frac{2^{b_{total} - 1}}{2^{b_{fractional}}} = -2^{b_{total} - b_{fractional} - 1}$$

$$\max(Q) = \frac{2^{b_{total}} - 1}{2^{b_{fractional}}}$$
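To sanity-check the two bounds, here is a minimal Python sketch that evaluates them exactly as written above. The function name `quantized_range` is hypothetical and not part of elastic-ai.creator. The numerator of $\max(Q)$ is taken as written ($2^{b_{total}} - 1$); in a signed two's complement layout the largest integer is $2^{b_{total} - 1} - 1$, so this may be one of the points that still needs verification.

```python
from typing import Tuple


def quantized_range(total_bits: int, frac_bits: int) -> Tuple[float, float]:
    """Return (min(Q), max(Q)) using the formulas as stated in this issue.

    Hypothetical helper for illustration only.
    """
    scale = 2**frac_bits
    q_min = -(2 ** (total_bits - 1)) / scale
    # Numerator kept as written in the issue (not yet verified by the author).
    q_max = (2**total_bits - 1) / scale
    return q_min, q_max


q_min, q_max = quantized_range(total_bits=8, frac_bits=4)
print(q_min, q_max)  # -8.0 15.9375
```

With 8 total bits and 4 fractional bits the step size is $2^{-4}$, so the lower bound is $-2^{8 - 4 - 1} = -8$.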

This leads to the following optimal frac_bits calculation for a given input tensor $T$ and fixed $b_{total}$.

If $|\min(T)| \geq |\max(T)|$:

$$b_{fractional} = \text{clamp}(b_{total} - \lfloor \log_2(|\min(T)|) \rceil - 1, 0, b_{total} - 1)$$

Else, if $|\min(T)| < |\max(T)|$:

$$b_{fractional} = \text{clamp}\left(\left\lfloor \log_2\left(\frac{2^{b_{total}} - 1}{\max(T)}\right) \right\rceil, 0, b_{total} - 1\right)$$

With this calculation you get a pair ($b_{total}$, $b_{fractional}$) that can be used to update the parameters of the arithmetics to automatically optimize the fixed point representation.
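The two cases can be sketched in plain Python as follows. This is only an illustration of the formulas as written in this issue, not a verified implementation. Assumptions: the names `optimal_frac_bits`, `clamp` and `round_half_up` are hypothetical, the tensor $T$ is stood in for by a plain sequence of floats, $\lfloor \cdot \rceil$ is read as rounding to the nearest integer with ties rounded up, and an all-zero input (where $\log_2$ is undefined) returns $b_{total} - 1$.

```python
import math
from typing import Sequence


def clamp(value: int, lower: int, upper: int) -> int:
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(value, upper))


def round_half_up(x: float) -> int:
    """Round to the nearest integer, ties going up (assumed meaning of the rounding brackets)."""
    return math.floor(x + 0.5)


def optimal_frac_bits(values: Sequence[float], total_bits: int) -> int:
    """Compute frac_bits for the given values using the two cases from the issue."""
    t_min, t_max = min(values), max(values)
    upper = total_bits - 1
    if abs(t_min) >= abs(t_max):
        if t_min == 0:
            # Assumption: all values are zero, log2 is undefined, so use
            # the maximum number of fractional bits.
            return upper
        frac = total_bits - round_half_up(math.log2(abs(t_min))) - 1
    else:
        # In this branch t_max > |t_min| >= 0, so the division is safe.
        frac = round_half_up(math.log2((2**total_bits - 1) / t_max))
    return clamp(frac, 0, upper)


print(optimal_frac_bits([-4.0, 1.0], total_bits=8))  # 5
print(optimal_frac_bits([-1.0, 3.0], total_bits=8))  # 6
```

In the first call $|\min(T)| = 4$ dominates, giving $8 - 2 - 1 = 5$ fractional bits, which makes $\min(Q) = -4$ exactly. In the second call $\log_2(255 / 3) \approx 6.41$ rounds to 6.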

Required code structure

A general code structure in which the parameters of an Arithmetics adapt to the input values of its quantize function (not limited to fixed point):

classDiagram

class Sequential
class CreatorLayer {
    +register_arithmetics(arithmetics : Arithmetics)
}
class Arithmetics {
    +quantize(inputs : Tensor) Tensor
}
class ConcreteArithmetics {
    -quantization_params
}

Sequential "1" o-- "*" CreatorLayer
Sequential "1" *-- "1" Arithmetics

Arithmetics <|.. ConcreteArithmetics
CreatorLayer "1" *-- "1" Arithmetics

note for Sequential "for layer in layers:\nlayer.register_arithmetics(global_arithmetics)"
note for ConcreteArithmetics "updates the parameters according to the inputs of the quantize function"
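The relationships in the diagram could look roughly like this in plain Python. This is a sketch under assumptions, not the elastic-ai.creator API: tensors are replaced by lists of floats, `AdaptiveFixedPoint` is a hypothetical class playing the role of the concrete arithmetics in the diagram, the `forward` methods are added only to make the example runnable, and its update rule is a simple placeholder rather than the formula from the mathematical section.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class Arithmetics(ABC):
    @abstractmethod
    def quantize(self, inputs: Sequence[float]) -> List[float]: ...


class AdaptiveFixedPoint(Arithmetics):
    """Hypothetical concrete arithmetics that adapts frac_bits to its inputs."""

    def __init__(self, total_bits: int) -> None:
        self.total_bits = total_bits
        self.frac_bits = total_bits - 1  # quantization parameter

    def _update_params(self, inputs: Sequence[float]) -> None:
        # Placeholder rule (assumption): reserve enough integer bits for max |x|.
        max_abs = max(abs(x) for x in inputs)
        if max_abs == 0:
            return
        int_bits = max(0, math.ceil(math.log2(max_abs)))
        self.frac_bits = max(0, self.total_bits - 1 - int_bits)

    def quantize(self, inputs: Sequence[float]) -> List[float]:
        self._update_params(inputs)
        scale = 2**self.frac_bits
        # Round each value to the nearest multiple of 1 / scale (no clipping).
        return [math.floor(x * scale + 0.5) / scale for x in inputs]


class CreatorLayer:
    def __init__(self) -> None:
        self.arithmetics: Optional[Arithmetics] = None

    def register_arithmetics(self, arithmetics: Arithmetics) -> None:
        self.arithmetics = arithmetics

    def forward(self, inputs: Sequence[float]) -> List[float]:
        if self.arithmetics is None:
            raise RuntimeError("no arithmetics registered")
        return self.arithmetics.quantize(inputs)


class Sequential:
    def __init__(self, layers: Sequence[CreatorLayer], arithmetics: Arithmetics) -> None:
        self.layers = list(layers)
        self.arithmetics = arithmetics
        # Every layer receives the same (global) arithmetics instance.
        for layer in self.layers:
            layer.register_arithmetics(arithmetics)

    def forward(self, inputs: Sequence[float]) -> List[float]:
        for layer in self.layers:
            inputs = layer.forward(inputs)
        return list(inputs)


arith = AdaptiveFixedPoint(total_bits=8)
model = Sequential([CreatorLayer(), CreatorLayer()], arith)
out = model.forward([0.3, -3.2])
print(arith.frac_bits, out)  # 5 [0.3125, -3.1875]
```

Because all layers share one arithmetics instance in this sketch, a parameter update triggered by one layer's inputs also changes the quantization used by every other layer.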

Problems

julianhoever commented 1 year ago

These are my first thoughts on how to approach this issue. The formulas in the mathematical section are just a first shot and may be wrong; I have to verify them next week. The proposed architecture is also a bit problematic. Maybe @glencoe you can have a look at it?