Closed GunnarVB closed 8 months ago
You might consider using -fno-tree-ch to avoid the duplication of loop conditions.
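For context: -ftree-ch is GCC's loop header copying pass, which is on by default at -O1 and higher but not at -Os. A rough C-level sketch of what "duplication of loop conditions" means is shown here. The function names are made up for illustration, and the real transformation happens on GCC's internal representation, not on the source.

```c
/* Shape 1: one test at the top of the loop, as written in the source. */
void clear_simple(char *p, unsigned short n)
{
    while (n != 0) {
        *p++ = 0;
        n--;
    }
}

/* Shape 2: roughly what loop header copying produces. The condition is
   duplicated: once as a guard before the loop, once at the bottom. */
void clear_rotated(char *p, unsigned short n)
{
    if (n != 0) {            /* copied loop header (guard) */
        do {
            *p++ = 0;
            n--;
        } while (n != 0);    /* original test, now at the bottom */
    }
}
```

Both shapes clear the same bytes. The second saves a jump per iteration on most targets, but on 68k the guard costs extra instructions that a plain branch to the DBRA avoids.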
jra .L2
.L3:
clr.b (a0)+
.L2:
dbra d0,.L3
The -O2 version:
_memclr:
move.w d0,d1
subq.w #1,d1
tst.w d0
jeq .L1
.L5:
clr.b (a0)+
dbra d1,.L5
Using your proposed extra flag:
_memclr:
dbra d0,.L3
.L6:
rts
.L3:
clr.b (a0)+
dbra d0,.L3
jra .L6
Could the code always be generated the way -Os does it?
move.w d0,d1
subq.w #1,d1
tst.w d0
jeq .L1
Next question: why is there a TST instruction in this code? The TST is not needed, because MOVE already sets the zero flag. Could it be done like this?
move.w d0,d1
jeq .L1
subq.w #1,d1
Hello Bebbo, I hope you are OK.
Maybe this was reported before.
I found that the basic loop construct looks very good when compiled with -Os but is not optimal when compiled with any other -O level.
C Example:
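The source listing is not shown here. A minimal sketch of a loop that would plausibly produce the assembly in this thread is given instead. The signature, the 16-bit count type, and the loop form are assumptions inferred from the `_memclr` label and the word-sized DBRA counter; this is not the original code.

```c
/* Hypothetical reconstruction: name, parameter types, and loop form are
   assumptions. A 16-bit count matches the word-sized DBRA counter. */
void memclr(char *p, unsigned short n)
{
    while (n--)       /* test, then decrement: the pattern DBRA implements */
        *p++ = 0;     /* corresponds to clr.b (a0)+ */
}
```

With -mregparm=2 the pointer would arrive in a0 and the count in d0, which fits the registers used in the listings.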
Compiled with -mregparm=2 -Os:
Good result! 4 instructions total. BRA to the DBRA: this is both short and fast.
The result is not optimal when compiled with -O2, -O3, or -Ofast:
8 instructions total, with a 4-instruction header instead of 1 BRA. This result is not good; the BRA to the DBRA was much better.
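To check that the two loop shapes really are interchangeable, here is a small C model of DBRA, assuming its usual semantics: decrement the low word of the counter and branch unless the result is -1. The helper and function names are made up for this sketch, and the loop body is replaced by an iteration counter.

```c
#include <stdint.h>

/* Model of "dbra Dn,label": decrement the 16-bit counter and report
   whether the branch is taken (counter is not -1, i.e. 0xFFFF, afterwards). */
static int dbra(uint16_t *counter)
{
    *counter = (uint16_t)(*counter - 1);
    return *counter != 0xFFFF;
}

/* -Os shape: branch straight to the DBRA. */
unsigned loop_os(uint16_t d0)
{
    unsigned iterations = 0;
    goto test;                      /* jra .L2 */
body:
    iterations++;                   /* stands in for clr.b (a0)+ */
test:
    if (dbra(&d0)) goto body;       /* dbra d0,.L3 */
    return iterations;
}

/* -O2 shape: four-instruction header, then the loop. */
unsigned loop_o2(uint16_t d0)
{
    unsigned iterations = 0;
    uint16_t d1 = d0;               /* move.w d0,d1 */
    d1 = (uint16_t)(d1 - 1);        /* subq.w #1,d1 */
    if (d0 == 0)                    /* tst.w d0 ; jeq .L1 */
        return iterations;
    do {
        iterations++;               /* stands in for clr.b (a0)+ */
    } while (dbra(&d1));            /* dbra d1,.L5 */
    return iterations;
}
```

Both functions run the body exactly d0 times for every 16-bit count, including 0, so the longer header buys nothing in iteration behaviour.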
Hello Bebbo, do you know a way to enable the BRA to the DBRA in all -O options? This would be very good for code size and for performance on all 68K members!
Many thanks in advance
regards Gunnar