This example codes shows a possible reuse of the SF (sign flag) after a left shift (<<) operation. But Clang seems to always generate a 'test' instruction after the shift; it doesn't reuse the condition code of the 'shl' when possible.
Possible smaller code (I show this tweak specifically for Clang; in the GCC bug report I wrote an even smaller one):
my_isxdigit_2:
movl %edi, %ecx
movl $2113994688, %edx
xorl %eax, %eax # Sets SF to 0 if 'shll' does not affect flags.
shll %cl, %edx
js .L1
retq
.L1:
movl $436207616, %eax
shrl $4, %ecx
shll %cl, %eax
shrl $31, %eax
retq
The FLAGS register might be unmodified if the shift count is zero (i.e. value in %cl register is zero), but optimization is still possible if we set the FLAGS to a known state prior to the shift operation. (Thanks GCC developers for hinting this for me.) For example, xorl %eax, %eax sets CF and SF bits to 0, so the my_isxdigit_2 code can work.
This optimization might require tracking on whether the FLAGS states are known. If this is impossible internally in LLVM, feel free to close this report.
This example codes shows a possible reuse of the SF (sign flag) after a left shift (<<) operation. But Clang seems to always generate a 'test' instruction after the shift; it doesn't reuse the condition code of the 'shl' when possible.
(Note: [I also reported this issue in GCC](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490).)
```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
bool my_isxdigit(unsigned char ch) {
uint32_t mask1 = 0x7E00FFC0;
// Prevent the compiler from transforming the constants.
// Suppose we have to use them as written.
__asm__ ("" : "+r" (mask1));
if (!((mask1 << (ch & 0x1F)) >> 31))
return false;
uint32_t mask2 = 0x1A << 24;
__asm__ ("" : "+r" (mask2));
if (!((mask2 << (ch >> 4)) >> 31))
return false;
return true;
}
```
x86-64 Clang with "-O3" option
(Actually I tested this in Compiler Explorer (godbolt.org))
```x86asm
my_isxdigit:
movl %edi, %ecx
movl $2113994688, %eax
shll %cl, %eax
testl %eax, %eax
js .LBB0_2
xorl %eax, %eax
retq
.LBB0_2:
movl $436207616, %eax
shrl $4, %ecx
shll %cl, %eax
shrl $31, %eax
retq
```
Possible smaller code (I show this tweak specifically for Clang; in the [GCC bug report](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490) I wrote an even smaller one):
```x86asm
my_isxdigit_2:
movl %edi, %ecx
movl $2113994688, %edx
xorl %eax, %eax # Sets SF to 0 if 'shll' does not affect flags.
shll %cl, %edx
js .L1
retq
.L1:
movl $436207616, %eax
shrl $4, %ecx
shll %cl, %eax
shrl $31, %eax
retq
```
The FLAGS register might be unmodified if the shift count is zero (i.e. value in `%cl` register is zero), but optimization is still possible if we set the FLAGS to a known state prior to the shift operation. (Thanks GCC developers for hinting this for me.) For example, `xorl %eax, %eax` sets CF and SF bits to 0, so the `my_isxdigit_2` code can work.
This optimization might require tracking on whether the FLAGS states are known. If this is impossible internally in LLVM, feel free to close this report.
This example codes shows a possible reuse of the SF (sign flag) after a left shift (<<) operation. But Clang seems to always generate a 'test' instruction after the shift; it doesn't reuse the condition code of the 'shl' when possible.
(Note: I also reported this issue in GCC.)
x86-64 Clang with "-O3" option (Actually I tested this in Compiler Explorer (godbolt.org))
Possible smaller code (I show this tweak specifically for Clang; in the GCC bug report I wrote an even smaller one):
The FLAGS register might be unmodified if the shift count is zero (i.e. value in
%cl
register is zero), but optimization is still possible if we set the FLAGS to a known state prior to the shift operation. (Thanks GCC developers for hinting this for me.) For example,xorl %eax, %eax
sets CF and SF bits to 0, so themy_isxdigit_2
code can work.This optimization might require tracking on whether the FLAGS states are known. If this is impossible internally in LLVM, feel free to close this report.