dinuxbg / gnupru

GCC and Binutils port for the TI PRU I/O processor

Missed optimization: Backend fails to zero-extend using instruction operands #9

Closed dinuxbg closed 8 years ago

dinuxbg commented 9 years ago

The following C snippet:

  unsigned int x = (unsigned int)a + (unsigned char)b;

It compiles to the inefficient (but still correct) sequence:

   mov r15, r15.b0
   add r16, r14, r15

The more optimal form would be a single instruction:

   add r16, r14, r15.b0
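
For anyone who wants to reproduce this, here is a minimal sketch of a self-contained test case. The function name `add_zext` and its parameter types are hypothetical (the snippet above does not say how `a` and `b` are declared), so the registers the compiler picks may differ from the listings above.

```c
/* Hypothetical reproducer: add_zext and its signature are illustrative,
 * not taken from the original report. */
unsigned int add_zext(unsigned int a, unsigned int b)
{
    /* The cast truncates b to its low 8 bits; the addition then
     * zero-extends that byte back to unsigned int. Ideally the backend
     * folds this into a byte-field source operand of the add. */
    unsigned int x = (unsigned int)a + (unsigned char)b;
    return x;
}
```

Assuming the toolchain is installed as `pru-gcc`, compiling with something like `pru-gcc -O2 -S` and reading the generated assembly should show whether a separate `mov` is still emitted before the `add`.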

OctalS commented 9 years ago

How many CPU cycles do the first and second cases take?

dinuxbg commented 9 years ago

Two cycles and one cycle, respectively.

All PRU instructions take one cycle to execute, except the load/store instructions.

dinuxbg commented 9 years ago

Looks like AArch64 has the same issue: http://www.slideshare.net/linaroorg/hkg15405-redundant-zerosignextension-elimination-in-gcc . On to reading the aarch64 machine description...