open-watcom / open-watcom-v2

Open Watcom V2.0 - Source code repository, Wiki, Latest Binary build, Archived builds including all installers for download.

Missed optimization in shifting `long long` on word boundary #1218

Open jwt27 opened 9 months ago

jwt27 commented 9 months ago

See the following example, compiled as 16-bit code:

unsigned short shr16(unsigned short x)
{
  return x >> 8;
}

unsigned long shr32(unsigned long x)
{
  return x >> 16;
}

unsigned long long shr64(unsigned long long x)
{
  return x >> 32;
}

Segment: _TEXT PARA USE16 00000015 bytes
0000                            shr16_:
0000  88 E0                             mov             al,ah
0002  30 E4                             xor             ah,ah
0004  C3                                ret
0005  FC                                cld

Routine Size: 6 bytes,    Routine Base: _TEXT + 0000

0006                            shr32_:
0006  89 D0                             mov             ax,dx
0008  31 D2                             xor             dx,dx
000A  C3                                ret
000B  FC                                cld

Routine Size: 6 bytes,    Routine Base: _TEXT + 0006

000C                            shr64_:
000C  56                                push            si
000D  BE 20 00                          mov             si,0x0020
0010  E8 00 00                          call            __U8RS
0013  5E                                pop             si
0014  C3                                ret

Routine Size: 9 bytes,    Routine Base: _TEXT + 000C

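The compiler generates the expected register moves for the 16- and 32-bit shifts, but the 64-bit shift always goes through the slow `__U8RS` library call, even though the shift count is a multiple of the word size.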
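As a rough illustration of why no library call should be needed, here is a minimal C sketch of a right shift by whole 16-bit words that uses only word moves and clears. The type `u64_words` and the helpers `split_u64`, `join_u64`, `shr_words` and `shr64_by_32` are hypothetical names made up for this example. They are not part of Open Watcom and say nothing about how the code generator is actually structured.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit value as four 16-bit words,
   w[0] being the least significant word. */
typedef struct { uint16_t w[4]; } u64_words;

static u64_words split_u64(uint64_t x)
{
    u64_words v;
    for (unsigned i = 0; i < 4; i++)
        v.w[i] = (uint16_t)(x >> (16 * i));
    return v;
}

static uint64_t join_u64(u64_words v)
{
    uint64_t x = 0;
    for (unsigned i = 0; i < 4; i++)
        x |= (uint64_t)v.w[i] << (16 * i);
    return x;
}

/* Logical shift right by `words` whole 16-bit words.
   Each result word is either a copy of a source word or zero,
   so no bit-level shifting is involved. */
static u64_words shr_words(u64_words v, unsigned words)
{
    u64_words r;
    for (unsigned i = 0; i < 4; i++)
        r.w[i] = (i + words < 4) ? v.w[i + words] : 0;
    return r;
}

/* Equivalent of `x >> 32` from the example above: two word moves
   and two clears in this model. */
uint64_t shr64_by_32(uint64_t x)
{
    return join_u64(shr_words(split_u64(x), 2));
}
```

In this model `shr64_by_32(x)` gives the same result as `x >> 32` for any input, which is the same pattern the compiler already emits for `shr16` and `shr32`, just applied to four words instead of two bytes or two words.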