Open jouae opened 2 months ago
An array architecture for fast computation of discrete Hartley transform by A. S. Dhar and S. Banerjee, 1991, mentions the use of small-angle approximations for sine and cosine functions to compensate for the need for a scaling factor $K$.
In the method, a magic number $(n−2.852)/3$ is referenced, where this magic number imposes the restriction that the number of bits shifted $j$ in the iteration must be greater than $\left\lceil \frac{n-2.585}{3} \right\rceil$.
This magic number is derived from the error estimation obtained from the remainder terms of the Taylor series approximation for sine and cosine functions. If the initial number of shifted bits $j$ does not exceed this magic number, the result of the approximation will not be accurate.
The detailed derivation process is recorded in the note: Scaling-free method of CORDIC.
When calculating the
twin_tan()
, division is used, specifically $\frac{\sin x}{\cos x}$To minimize the use of the division operator, you can refer to the division implementation mentioned in https://hackmd.io/@sysprog/linux-intdiv, which describes the implementation in
jemalloc
to avoid direct use of the division operator.Currently, testing and experimental data are needed to confirm that using this division method in twin_tan within mado can achieve faster calculation times.
As for the other issue, it's whether twin_tan can be improved further, including enhancements in precision and calculation speed.
Perhaps coordinate rotation digital computer(CORDIC) cloud be a solution.
The note here provide a quick overview of the mathematical principles behind conventional CORDIC and scaling-free CORDIC.
Reference: