basilhussain / stm8-bootloader-erase-write

Replacement open-source erase/write RAM routines for STM8 ROM bootloader, intended for use with stm8gal.
Apache License 2.0

Use structs for bit flags in status variables #4

Open basilhussain opened 3 years ago

basilhussain commented 3 years ago

To enhance code clarity, it would be a good idea to use bitfield structs for the status flag variables (e.g. global_0x8e). For example, instead of if(global_0x8e & (1 << 4)), write if(global_0x8e.erase_full). This communicates the meaning much better.
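As a rough sketch of the idea, the two styles can be compared side by side. Only erase_full at bit 4 comes from the example above; the type name status_flags_t, the placeholder field names and the helper functions are hypothetical. Bitfield ordering is implementation-defined in C, so this assumes a compiler that allocates bitfields starting from the least significant bit and accepts uint8_t as a bitfield type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: only erase_full (bit 4) is taken from the issue text.
 * Assumes bitfields are allocated from the least significant bit upwards. */
typedef struct {
    uint8_t unknown_0_3 : 4; /* bits 0..3, meaning not established here */
    uint8_t erase_full  : 1; /* bit 4 */
    uint8_t unknown_5_7 : 3; /* bits 5..7, meaning not established here */
} status_flags_t;

/* Mask style: the reader has to know what bit 4 means. */
static int erase_full_by_mask(uint8_t raw)
{
    return (raw & (1 << 4)) != 0;
}

/* Bitfield style: the field name carries the meaning. */
static int erase_full_by_field(uint8_t raw)
{
    status_flags_t flags;

    /* The struct must overlay exactly one byte for the copy to be valid. */
    assert(sizeof flags == sizeof raw);
    memcpy(&flags, &raw, sizeof flags);
    return flags.erase_full;
}
```

In real code the status variable would simply be declared with the struct type, so that the test reads if(global_0x8e.erase_full) with no copying involved; the memcpy here only exists to compare both styles on the same raw byte.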

An attempt was already made in the code to do this, but unfortunately it had the side effect of increasing the compiled code size, because SDCC does not optimise bitfield accesses well. For now, the bitfield variant is disabled by default and is only compiled in when enabled via a #define.

To give an example of the non-optimal code, an if-test on global_0x8e & (1 << 4) is compiled to:

ldw x, #_global_0x8e+0
ld  a, (x)
swap    a
and a, #0x01
jreq    00136$

This can be more compactly written as:

btjf _global_0x8e, #4, 00136$
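To check that the five-instruction sequence and the single btjf branch on the same condition, the logic can be modelled on a host machine. This is only a sketch: the function names are made up for illustration, and the model covers the branch decision alone, not the side effects on registers and flags that the real instructions have.

```c
#include <stdint.h>

/* Models "swap a": exchange the high and low nibbles of the accumulator. */
static uint8_t swap_nibbles(uint8_t a)
{
    return (uint8_t)((a << 4) | (a >> 4));
}

/* Long form: swap, then "and a, #0x01"; jreq jumps when the result is zero. */
static int jump_taken_long_form(uint8_t value)
{
    return (swap_nibbles(value) & 0x01) == 0;
}

/* Short form: btjf jumps when bit 4 of the operand is clear. */
static int jump_taken_btjf(uint8_t value)
{
    return ((value >> 4) & 1) == 0;
}

/* Exhaustively compare both forms over every possible byte value. */
static int sequences_agree(void)
{
    for (unsigned int v = 0; v <= 0xFF; v++) {
        if (jump_taken_long_form((uint8_t)v) != jump_taken_btjf((uint8_t)v)) {
            return 0;
        }
    }
    return 1;
}
```

Since swap moves bit 4 into bit 0, masking with 0x01 afterwards isolates exactly the bit that btjf tests directly.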

This problem can be solved by formulating some custom peephole optimiser definitions for SDCC. For instance, the above example could be handled by a rule such as this:

replace restart {
    ldw x, #(%1 + %3)
    ld  a, (x)
    swap    a
    and a, #0x01
    jreq    %2
} by {
    btjf    %1+%3, #4, %2
} if notUsed('a' 'x' 'n' 'z')

(Note that multiple arguments to notUsed() are an upcoming feature in SDCC 4.1.0, so for use with earlier versions, it will need to be changed to multiple instances of notUsed(), one per argument.)

basilhussain commented 3 years ago

Hmm, a spanner in the works is that SDCC sometimes uses different formatting for the emitted assembly instructions.

Sometimes it is ldw x, #_variable+0, and sometimes it is ldw x, #(_variable + 0). Note the added parentheses and spacing! Under what circumstances each variant is emitted by the compiler is yet to be determined.

It may require a duplicate set of rules to handle both cases. :(

basilhussain commented 3 years ago

Feature request for more homogeneous assembly code formatting filed: https://sourceforge.net/p/sdcc/feature-requests/728/

basilhussain commented 3 years ago

Another example along the same lines, where the code for a conditional operation involving a bitfield could be smaller, is for busy/polling loops. For instance, when a register is being polled for completion of an operation, like so:

while(!(FLASH_IAPSR & (1 << FLASH_IAPSR_EOP)));

This currently compiles to the following assembly code:

00101$:
    ld a, 0x505f
    bcp a, #0x04
    jreq 00101$

This could be replaced with the following, which is 2 bytes smaller (5 bytes instead of 7) and one cycle quicker (2 or 3 cycles instead of 3 or 4, depending on whether the jump is taken):

00101$:
    btjf 0x505f, #2, 00101$

A custom peephole optimiser rule could also be formulated to make this optimisation.
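For reference, the polling loop above can be made self-contained roughly as follows. The register address 0x505f and the bit position 2 are taken from the assembly listings; the helper names and the host-side stand-in variable are assumptions added so that the sketch also builds off-target. The direct single-instruction bit test is only possible when the register is accessed at a fixed absolute address, as in the SDCC branch of the macro.

```c
#include <stdint.h>

#define FLASH_IAPSR_EOP 2 /* bit 2, i.e. mask 0x04, as in the listing above */

#ifdef __SDCC
/* On target: memory-mapped register at the address seen in the assembly. */
#define FLASH_IAPSR (*(volatile uint8_t *)0x505f)
#else
/* Off target (assumption for host builds): a plain variable stands in. */
static volatile uint8_t fake_flash_iapsr;
#define FLASH_IAPSR fake_flash_iapsr
#endif

/* Returns non-zero once the end-of-programming flag is set. */
static int eop_is_set(void)
{
    return (FLASH_IAPSR & (1 << FLASH_IAPSR_EOP)) != 0;
}

/* Busy-wait until the flag is set; same loop as in the issue text. */
static void wait_for_eop(void)
{
    while (!(FLASH_IAPSR & (1 << FLASH_IAPSR_EOP)))
        ;
}
```

On a host build the stand-in variable has to be set before calling wait_for_eop(), otherwise the loop never terminates.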