Closed: Kuratius closed this 8 months ago
GCC apparently doesn't use .align the way the Arm documentation does.
It's mentioned in the docs at https://sourceware.org/binutils/docs/as/Align.html but easy to miss if you don't know it's there.
For other systems, including ppc, i386 using a.out format, arm and strongarm, it is the number of low-order zero bits the location counter must have after advancement. For example ‘.align 3’ advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.
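To make the quoted rule concrete, here is a minimal C sketch of the arithmetic it describes. The helper name `gas_arm_align` is made up for illustration and is not part of binutils; it only models how far the location counter moves when the argument is read as a count of low-order zero bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a binutils API): the address the location
   counter reaches after `.align n` when n is interpreted as the number
   of low-order zero bits, as the GNU as docs describe for arm. */
static uint32_t gas_arm_align(uint32_t location, unsigned n)
{
    assert(n < 32);                          /* keep the shift in range */
    uint32_t mask = ((uint32_t)1 << n) - 1u; /* low n bits set */
    return (location + mask) & ~mask;        /* round up to a multiple of 2^n */
}
```

Under this reading, `gas_arm_align(0x1001, 2)` gives `0x1004` (4-byte alignment), while `gas_arm_align(0x1001, 4)` gives `0x1010` (16-byte alignment).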
FWIW, the NDS has no FPU. We strongly recommend not using floating point math on this system.
It's in the context of the sm64 DSi port, which would require a lot of rewriting to remove literally every floating point use.
The example I posted is from AngelTomkins (they also wrote a version for converting floats back to fixed point). It was written when we were checking whether some of the sine/cosine functions that rely on LUTs and float polynomials (sm64 has two different sine implementations for some reason) could be replaced with fixed point versions based on some extremely fast assembly, with the fixed point result converted to a float faster than a LUT lookup from main RAM. The function is apparently 2x as fast as libnds' default fixed to float conversion, but it only works with fixed point numbers that use 2^-12 as the quantization unit and probably only works in a somewhat narrow range. For sine you only need -1 to 1, though.
In general the equation that you need to solve to convert a fixed point number into a float is
fixedpointNumber × quantizationunit (e.g. 2^-12) = (1 + mantissa/2^23) × 2^(exponent-127)
with the restriction that the exponent needs to be chosen such that the mantissa is non-negative and smaller than 2^23. This is mostly doable with shifts, adds, logical operations and the Arm instruction that counts leading zeroes in a number (CLZ), instead of a floating point multiply.
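As a rough illustration of solving that equation with integer operations only, here is a C sketch for 2^-12 quantization. It is not AngelTomkins' routine: the names `clz32` and `fixed12_to_float` are hypothetical, the loop is a portable stand-in for the CLZ instruction, the sign handling is my own addition, and low bits are truncated rather than rounded (which only matters for magnitudes needing more than 24 significant bits, well outside -1 to 1).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for the Arm CLZ instruction (assumption: on the
   real target a single CLZ would replace this loop). x must be non-zero. */
static unsigned clz32(uint32_t x)
{
    assert(x != 0);
    unsigned n = 0;
    while ((x & 0x80000000u) == 0) { x <<= 1; n++; }
    return n;
}

/* Hypothetical name: signed fixed point with 12 fractional bits
   to IEEE-754 single precision, using no floating point arithmetic. */
static float fixed12_to_float(int32_t fx)
{
    if (fx == 0) return 0.0f;                 /* zero has no leading one */

    uint32_t sign = (fx < 0) ? 0x80000000u : 0u;
    uint32_t mag  = (fx < 0) ? 0u - (uint32_t)fx : (uint32_t)fx;

    unsigned lz = clz32(mag);
    /* Leading one sits at bit (31 - lz) and the value is mag * 2^-12,
       so the biased exponent is 127 + (31 - lz) - 12. */
    uint32_t exponent = 127u + 31u - 12u - lz;

    /* Move the leading one to bit 31, then keep the 23 bits below it.
       The mask drops the implicit leading one; lower bits are truncated. */
    uint32_t mantissa = ((mag << lz) >> 8) & 0x007FFFFFu;

    uint32_t bits = sign | (exponent << 23) | mantissa;
    float out;
    memcpy(&out, &bits, sizeof out);          /* reinterpret bits as float */
    return out;
}
```

For example, `fixed12_to_float(4096)` yields `1.0f` and `fixed12_to_float(-2048)` yields `-0.5f`. Choosing the exponent from the position of the leading one is what guarantees the mantissa lands in the range 0 to 2^23 - 1.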
Bug Report
What's the issue you encountered?
devkitPro's GCC may generate wrong alignment directives. This may be a bug in GCC rather than devkitPro.
How can the issue be reproduced?
Environment?
Ubuntu
Additional context?
https://discourse.llvm.org/t/arm-doesnt-align-code-sections-by-4-bytes/5830#!
It seems that gcc ignores alignment directives.
This, together with the Makefile from the helloworld example, modified to include -march=armv5te in the arch section and -S in the CFlags section, generates the following assembly:
Arm code must have .align 4, not 2.
IMO this is probably still a bug, but probably a bug in GCC. The correct thing to do would be to generate .align 4.