devkitPro / libnds

C library for Nintendo DS
http://devkitpro.org/viewforum.php?f=38

Faster integer sine/cosine #55

Open Kuratius opened 8 months ago

Kuratius commented 8 months ago

Feature Request

What feature are you suggesting?

Overview:

Adding a fast integer sine/cosine without LUTs to libnds

http://www.coranac.com/2009/07/sines/

Smaller Details:

This can most likely be modified to yield a cosine by removing the quarter-turn offset (a subtraction, not a shift), specifically this line:

    sub     r0, r0, #1<<31          @ r0 -= 1.0         ; sin <-> cos

@ ARM assembly version of S4 = C4(gamma-1), using n=13, A=12 and ... miscellaneous.

@ A sine approximation via a fourth-order cosine
@ @param r0   Angle (with 2^15 units/circle)
@ @return     Sine value (Q12)
    .arm
    .align
    .global isin_S4a9
isin_S4a9:
    movs    r0, r0, lsl #(31-13)    @ r0=x%2 <<31       ; carry=x/2
    sub     r0, r0, #1<<31          @ r0 -= 1.0         ; sin <-> cos
    smulwt  r1, r0, r0              @ r1 = x*x          ; Q31*Q15/Q16=Q30

    ldr     r2,=14016               @ C = (1-pi/4)<<16
    smulwt  r0, r2, r1              @ C*x^2>>16         ; Q16*Q14/Q16 = Q14
    add     r2, r2, #1<<16          @ B = C+1
    rsb     r0, r0, r2, asr #2      @ B - C*x^2         ; Q14
    smulwb  r0, r1, r0              @ x^2 * (B-C*x^2)   ; Q30*Q14/Q16 = Q28
    mov     r1, #1<<12
    sub     r0, r1, r0, asr #16     @ 1 - x^2 * (B-C*x^2)
    rsbcs   r0, r0, #0              @ Flip sign for odd semi-circles.

    bx      lr
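
For readers who prefer C, the routine above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code proposed for libnds: the names isin_s4_sketch and icos_s4_sketch are hypothetical, the constants are 2 - pi/4 and 1 - pi/4 rounded to Q14 (so results may differ from the assembly by a few LSBs, since it uses 14016 in Q16), and two's-complement integers are assumed. The cosine is obtained by adding a quarter turn to the angle rather than by deleting the subtraction, because dropping the subtraction alone would leave the final sign flip keyed to the sine's semi-circles.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libnds.
 * x: angle with 2^15 units per circle. Returns sine in Q12 (4096 == 1.0). */
int32_t isin_s4_sketch(int32_t x)
{
    const int32_t B = 19900;              /* about (2 - pi/4) in Q14 */
    const int32_t C = 3516;               /* about (1 - pi/4) in Q14 */

    int32_t odd = x & (1 << 14);          /* nonzero in the odd semi-circle */
    int32_t z = (x & 0x3FFF) - (1 << 13); /* quarter-turn offset: sin -> cos, Q13 in [-1, 1) */
    int32_t z2 = (z * z) >> 12;           /* z^2, Q26 -> Q14 */

    int32_t y = B - ((z2 * C) >> 14);     /* B - C*z^2, Q14 */
    y = (1 << 12) - ((z2 * y) >> 16);     /* 1 - z^2*(B - C*z^2), Q28 -> Q12 */

    return odd ? -y : y;                  /* flip sign for odd semi-circles */
}

/* Cosine as a sine shifted by a quarter circle (2^13 units).
 * Assumes x + 8192 does not overflow int32_t. */
int32_t icos_s4_sketch(int32_t x)
{
    return isin_s4_sketch(x + (1 << 13));
}
```

With this sketch, isin_s4_sketch(8192) (90 degrees) evaluates to 4096 and isin_s4_sketch(4096) (45 degrees) evaluates to 2908, against an exact value of about 2896.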

Nature of Request:

Addition

Why would this feature be useful?

4x faster sine/cosine calculation than the LUT-based approach
