Closed michalsc closed 11 months ago
if you solve this properly you also have to consider cases without regparm.
gcc often transforms this, and since the 680** don't have a single instruction for both steps, you get:
d1:QI=[sp:SI+0xb]
d1:SI=sign_extend(d1:QI)
d0:SI=[sp:SI+0x4]
d0:SI=d0:SI<<d1:SI
use d0:SI
Thus you have to handle cases where there are instructions in between the sign_extend and the shift.
It can be done, just needs time...
yeah, there it is. Now, the question is, do we need this:
d1:SI=sign_extend(d1:QI)
or can we just convince gcc to use d1:QI in any case? Ah, just noticed the gcc rtl way of writing it: can we tell gcc to leave d1:SI, d1:QI or d1:HI intact for shifts? Or can we just tell gcc that m68k supports shifts by QI, HI and SI operands and just declare them all as the very same ASL/ASR and so on?
please test, this http://franke.ms/cex/z/qGfbec looks ok now.
Yeah, on this example it looks great. However, the same should also apply to right shifts and it doesn't yet (unless this was just a proof of concept to see if you are on the right track).
BTW: What I have also noticed is that gcc does not emit LSL/LSR sizes other than long. E.g. the following code
unsigned short lsl_b(unsigned short a, char b)
{
    return a >> b;
}
unsigned short lsl_w(signed short a, short b)
{
    return (unsigned short)(a << b);
}
will be compiled to
__Z5lsl_btc:
and.l #65535,d0
extb.l d1
asr.l d1,d0
rts
__Z5lsl_wss:
ext.l d0
lsl.l d1,d0
rts
where asr.l or lsl.l could simply be replaced with asr.w or lsl.w without actually needing to mask or sign extend the operand in the d0 register.
available in ~36 mins
The used instructions do not contain shifts for shorter modes. Maybe that can be added, but I'm not sure about this...
well, the main problem is, that C/C++ converts everything to int before the shift gets applied.
_2 = (int) a;
_4 = (int) b;
_5 = _2 >> _4;
_6 = (short unsigned int) _5;
return _6;
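To make the promotion point concrete, here is a minimal sketch in plain C (not gcc internals) that mirrors the dump above step by step and compares it with a 16-bit-only shift. The function names are made up for illustration, and the comparison only covers counts 0..15, where both forms are well defined; it assumes int is wider than 16 bits.

```c
#include <stdint.h>

/* What the language requires: promote both operands to int, shift,
 * then truncate. Mirrors the _2/_4/_5/_6 steps of the dump above. */
uint16_t shr_promoted(uint16_t a, int8_t b)
{
    int wide  = (int)a;        /* _2 = (int) a;  */
    int count = (int)b;        /* _4 = (int) b;  */
    int r     = wide >> count; /* _5 = _2 >> _4; */
    return (uint16_t)r;        /* _6 = (short unsigned int) _5; */
}

/* Hypothetical model of a word-sized logical shift: the value never
 * leaves 16 bits and is shifted one position at a time. */
uint16_t shr_word(uint16_t a, unsigned count)
{
    uint16_t r = a;
    while (count--)
        r = (uint16_t)(r >> 1);
    return r;
}

/* Returns 1 if both forms agree for every 16-bit value and every
 * count in 0..15, otherwise 0. */
int promoted_matches_word(void)
{
    for (uint32_t a = 0; a <= 0xFFFFu; ++a)
        for (int b = 0; b < 16; ++b)
            if (shr_promoted((uint16_t)a, (int8_t)b) !=
                shr_word((uint16_t)a, (unsigned)b))
                return 0;
    return 1;
}
```

For in-range counts the widening does not change the result in this unsigned right-shift case, which is why a shorter-mode shift looks like a plausible replacement there. The sketch says nothing about signed operands or counts of 16 and above.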
please test: http://franke.ms/cex/z/4Teznb
Looks great for me! Thanks!
I've also checked ROL/ROR and they seem to behave correctly too.
This is actually not related to your changes to gcc, this is a generic gcc issue when generating m68k code. In case of shifts and rotations where the count is given by a register, gcc always extends the count to a 32-bit operand, generating unnecessary code. The shift/rotate count is always taken modulo 64, so it does not matter if it is specified as an 8, 16 or 32-bit variable.
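The modulo-64 argument can be checked with a small model in C. This is only a sketch: the helper names are invented for illustration, the "effective count" is modelled as the low six bits of the register (as described above), and extb.l / ext.l are modelled as sign extension of the low byte / low word to 32 bits.

```c
#include <stdint.h>

/* Model: the shifter only looks at the count register modulo 64. */
unsigned m68k_effective_count(uint32_t count_reg)
{
    return count_reg & 63u;
}

/* Model of extb.l: sign-extend the low byte to 32 bits. */
uint32_t extb_l(uint32_t reg)
{
    uint32_t low = reg & 0xFFu;
    return (low & 0x80u) ? (low | 0xFFFFFF00u) : low;
}

/* Model of ext.l: sign-extend the low word to 32 bits. */
uint32_t ext_l(uint32_t reg)
{
    uint32_t low = reg & 0xFFFFu;
    return (low & 0x8000u) ? (low | 0xFFFF0000u) : low;
}

/* Returns 1 if neither extension ever changes the effective count,
 * even with arbitrary garbage in the upper word of the register. */
int extension_is_redundant(void)
{
    for (uint32_t low = 0; low <= 0xFFFFu; ++low) {
        uint32_t reg = 0xABCD0000u | low; /* upper bits: arbitrary */
        unsigned want = m68k_effective_count(reg);
        if (m68k_effective_count(extb_l(reg)) != want)
            return 0;
        if (m68k_effective_count(ext_l(reg)) != want)
            return 0;
    }
    return 1;
}
```

Both extensions only rewrite bits 8 and above (or 16 and above), so the six bits the shifter consumes are untouched. On the C side, a negative or too-large shift count is undefined behaviour anyway, so dropping the extension should not change any result the language guarantees.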
What we have now is the following:
compiled with
-Os -fomit-frame-pointer -m68020 -mregparm=2
gives:
Both extb.l d1 and ext.l d1 are unnecessary. Expected result would be:
The same issue applies to all rotate/shift operations, and gcc adds an unnecessary signed/unsigned extension to long in all cases.