FrenkelS / djdoom

Compiling Doom for DOS with various compilers.
GNU General Public License v2.0

Optimized FixedMul/div #12

Open RamonUnch opened 1 year ago

RamonUnch commented 1 year ago
#if defined(__GNUC__) && defined (__i386__)
fixed_t FixedMul(fixed_t a, fixed_t b)
{
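    /* One-operand imul leaves the 64-bit product in edx:eax;
       shrd then shifts it right by 16 (FRACBITS) into eax. */
    __asm__ (
        "imul %2 \n"
        "shrd $0x10,%%edx,%%eax \n"
    : "=a" (a)           /* OUTPUT: result in eax   */
    : "a" (a), "r" (b)   /* INPUT: a in eax, b in any register */
    : "edx", "cc"        /* CLOBBER */
    );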
    return a;
}
#else
fixed_t FixedMul (fixed_t a, fixed_t b)
{
    return ((int64_t) a * (int64_t) b) >> FRACBITS;
}
#endif
RamonUnch commented 1 year ago
static fixed_t FixedDiv2(fixed_t a, fixed_t b)
{
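    /* Build the 64-bit dividend a << 16 in edx:eax, then divide by b.
       idiv faults if the quotient does not fit in 32 bits, so the
       caller must rule out overflow first. */
    __asm__ (
      "  cltd\n"                        /* sign-extend eax into edx:eax */
      "  shld $16, %%eax, %%edx\n"
      "  shl  $16, %%eax\n"
      "  idiv %2\n"                     /* div by b */
      : "=a" (a)           /* OUTPUT  */
      : "a" (a), "r" (b)   /* INPUT   */
      : "edx", "cc"        /* CLOBBER */
    );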
    return a;
}
fixed_t FixedDiv (fixed_t a, fixed_t b)
{
    return (((abs(a) >> 14) >= abs(b)) ? (((a) ^ (b)) >> 31) ^ MAXINT : FixedDiv2(a, b));
}
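For comparison, here is a minimal portable sketch of the same division, useful for seeing what the assembly computes and why the guard in FixedDiv is needed. The names FixedDiv2_c and FixedDiv_c are hypothetical, and fixed_t is assumed to be a 32-bit 16.16 value as in Doom; this is not code from the repository.

```c
#include <stdint.h>
#include <stdlib.h>
#include <limits.h>

typedef int32_t fixed_t;   /* assumption: 16.16 fixed point */

/* Hypothetical portable counterpart of the assembly FixedDiv2:
   widen to 64 bits, scale by 2^16, divide. Multiplying instead of
   shifting keeps the result well defined for negative a. */
static fixed_t FixedDiv2_c(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * 65536) / b);
}

/* Hypothetical wrapper mirroring FixedDiv: saturate instead of dividing
   when the quotient might not fit in 32 bits. The test is conservative
   and also catches b == 0. Like the original, it relies on abs(), which
   is undefined for INT_MIN. */
fixed_t FixedDiv_c(fixed_t a, fixed_t b)
{
    if ((abs(a) >> 14) >= abs(b))
        return ((a ^ b) < 0) ? INT_MIN : INT_MAX;
    return FixedDiv2_c(a, b);
}
```

With these definitions, dividing 3.0 by 2.0 (0x30000 by 0x20000) yields 0x18000, i.e. 1.5, while dividing by zero saturates to INT_MAX or INT_MIN depending on the sign of a.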
RamonUnch commented 1 year ago

On demo1 (1710 ticks): 1995 realticks before the change, 1979 realticks with the better FixedMul, and 1968 realticks with the improved FixedDiv as well.

FrenkelS commented 1 year ago

The unwritten goal of this project is to see which compiler generates the fastest code. Using assembly does not contribute to that goal.

But since you're not the first one who's suggesting to use assembly, I'm going to change the goal to:

I still want to be able to build only using C, so I'm going to use the flag C_ONLY like in the source code of Quake 2.
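A minimal sketch of how such a flag could gate the assembly version of FixedMul, assuming C_ONLY is tested with a plain preprocessor check (the exact guard used in the project is not shown in this thread). Building with -DC_ONLY, or on any non-GCC or non-i386 target, selects the plain C path.

```c
#include <stdint.h>

typedef int32_t fixed_t;   /* assumption: 16.16 fixed point */
#define FRACBITS 16

#if defined(__GNUC__) && defined(__i386__) && !defined(C_ONLY)
/* Assembly path: only when the compiler supports it and C_ONLY is unset. */
fixed_t FixedMul(fixed_t a, fixed_t b)
{
    __asm__ (
        "imull %2 \n\t"
        "shrdl $16, %%edx, %%eax"
        : "=a" (a)
        : "a" (a), "r" (b)
        : "edx", "cc"
    );
    return a;
}
#else
/* Pure C path: 64-bit product shifted back down to 16.16. */
fixed_t FixedMul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) >> FRACBITS);
}
#endif
```

Both paths should agree: multiplying 3.0 by 0.5 (0x30000 by 0x8000) gives 0x18000, i.e. 1.5.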

RamonUnch commented 1 year ago

Very interesting concept indeed!

RamonUnch commented 1 year ago

Would compiler-specific attributes fall within your scope? We could mark hot and cold functions with gcc's hot and cold attributes. This would help the optimizer group the hot functions together and optimize the cold functions for size, improving cache locality.

We could also use hints such as __builtin_expect(x, y) to help branch ordering with gcc and some other compilers.

The gcc-specific pure and const function attributes might also be of some help.

The inline keyword could also be of some use, especially for older compilers.

All of those could be macros that only get expanded where relevant, e.g. A_HOT, A_COLD, A_PURE, A_CONST, likely()/unlikely(), etc.
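A rough sketch of what such macros might look like. The macro names come from the suggestion above; the version checks and the clamp_byte example function are assumptions for illustration. On compilers that lack the feature, each macro expands to nothing or to the bare expression.

```c
/* hot/cold attributes: assumed available from GCC 4.3 onward. */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#define A_HOT  __attribute__((hot))
#define A_COLD __attribute__((cold))
#else
#define A_HOT
#define A_COLD
#endif

/* pure/const attributes and branch hints: assumed available from GCC 3. */
#if defined(__GNUC__) && (__GNUC__ >= 3)
#define A_PURE      __attribute__((pure))
#define A_CONST     __attribute__((const))
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define A_PURE
#define A_CONST
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

/* Hypothetical example: the result depends only on the argument, so the
   function can be marked const; out-of-range values are hinted as rare. */
A_CONST static int clamp_byte(int v)
{
    if (unlikely(v < 0))
        return 0;
    if (unlikely(v > 255))
        return 255;
    return v;
}
```

Because the fallbacks are empty, code annotated this way still builds unchanged with compilers that know none of these extensions.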

FrenkelS commented 1 year ago

Sure, if other compilers don't mind. ~For example, in i_ibm.c the keyword _interrupt is used for I_KeyboardISR(), but DJGPP doesn't understand this keyword so I've added an empty macro in compiler.h for DJGPP: #define _interrupt.~

For example, in i_ibm.c the variable destview has the __attribute__ ((externally_visible)), but only DJGPP needs this so I've added an empty macro in compiler.h for the other compilers: #define __attribute__(x).