Javanaise / mrboom-libretro

Mr.Boom is an 8 player Bomberman clone for RetroArch/Libretro
http://mrboom.mumblecore.org
MIT License
206 stars 61 forks source link

Helper: Workaround gcc 7.5.0 division bug #79

Closed SimpleTease closed 4 years ago

SimpleTease commented 4 years ago

Avoid incorrect negative division optimization. Fixes bot behavior on Gamecube

frranck commented 4 years ago

that sounds very weird, there's no way to disable the optimisation flag? did you see it by viewing the assembly?

frranck commented 4 years ago

is it a known bug in gcc?

frranck commented 4 years ago

maybe add an #ifdef on GCC_VERSION ?

SimpleTease commented 4 years ago

I checked the output logs first, then some disasm. It's some carry flag optimization because of div 16, and it adds +1 to all negative divisions. But so far only on PPC arch it seems?? No seems to know of it.

-7 to -6
..
-1 to -0

I know GCC can do some funny tricks. Snes9x ran into one about tree somethings. I remember that x / -1 can get compiled into neg x on Intels but ARMs do straight div.

I'll look into the disabling flags. Never run into this exact one before. I've had some MSVC memory indexing errors that affected x86 but not x64.

frranck commented 4 years ago

ok thanks!

SimpleTease commented 4 years ago

I thought it was a compiler bug but it's an annoying PowerPC power-of-2 division optimization that rounds towards zero.

srawi     r6, r6, 4
addze     r6, r6

https://devblogs.microsoft.com/oldnewthing/20180810-00/?p=99465

These four shift instructions are special because they always update carry: The carry bit is set if and only if the original value was negative and any bits shifted out were nonzero. This rule for the carry bit allows you to follow the right-shift instruction with an addze to perform a division by a power of two that rounds toward zero. (If you omit the addze, then the right shift performs a division by a power of two that rounds toward minus infinity.)

Maybe that explains why I see some libretro developers use bitshift division everywhere. Because of platforms like this one. So revisiting the code to make it more usable?

https://github.com/Javanaise/mrboom-libretro/blob/8c9dc9b6c18f304f8c7b3514e3e5f5f5524ff645/ai/MrboomHelper.hpp#L118-L122

I'd have to test on Gamecube but can't imagine why it'd break.

int inline getAdderX(int player)
{
#if CELLPIXELSSIZE == 16
   return(GETXPIXELSTOCENTEROFCELL(player) * framesToCrossACell(player) >> 4);
#else
   return(GETXPIXELSTOCENTEROFCELL(player) * framesToCrossACell(player) / CELLPIXELSSIZE);
#endif
}
frranck commented 4 years ago

yeah that would look better, also it's probably called a lot.

SimpleTease commented 4 years ago

Not tested on real hardware. I believe you're right as PPC interpreter works with original math. And negative bitshift optimization does break on all platforms without the carry addition. I'll ask for revert. Sorry about that.