openwch / arduino_core_ch32

Core library for CH32duino

Consider checking compiler and linker optimization flags - even simple blink sketch produces very long binary #25

Open Gjorgjevikj opened 12 months ago

Gjorgjevikj commented 12 months ago

Consider checking the compiler and linker optimization flags. A simple sketch produces a very long binary!

    Arduino 1.8.19
    Board: CH32V00x
    Board Select: CH32V003F4 EVT
    Optimize: Smallest (-Os default)
    Debug symbols and core logs: None
    C Runtime Library: Newlib Nano (default)
    Upload method: WCH-SWD

    Target: CH32V00x
    Program: a simple blink sketch (defining two pins as outputs, and blinking both of them in the loop using delay)
    Result: Program CH32V003_Blink size: 10.400 bytes (used 63% of a 16.384 byte maximum) (1,82 secs) Minimum Memory Usage: 520 bytes (25% of a 2048 byte maximum)

The same sketch built with the Arduino Core for the CH32V003 produces only 1 KB of code.

maxint-rd commented 11 months ago

Tested this yesterday and must agree. I saw that next to "smallest" one can also select smallest with LTO (link time optimization), which produces a smaller binary. However, that is still more than 50% of the 16K flash, which is not very promising for larger sketches...

Perhaps the hardware abstraction layer can be made smaller or avoided? (Kudos to Alexander Mandera and CNLohr.)

maxint-rd commented 11 months ago

I did some digging and found that even when using the option "Smallest (-Os) with LTO", the binary still contained some core_debug() calls and their associated space-consuming strings. To fix this I modified /packages/WCH/hardware/ch32v/1.0.3/boards.txt (at about line 80):

    CH32V00x_EVT.menu.dbg.none.build.flags.debug=-DNDEBUG

I got the impression that omitting this flag was just a silly mistake by the developer.
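
For readers wondering why one flag matters so much, here is a minimal sketch of the usual pattern, assuming the core's debug helper is guarded by `NDEBUG`. The name `demo_core_debug` is a hypothetical stand-in, not the core's actual source. When `-DNDEBUG` is passed, the body reduces to nothing, so the formatted-print code is no longer referenced from this helper and the optimizer can usually drop the format strings as well.

```cpp
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-in for a core debug helper (an assumption, not the
// real core_debug() implementation). It returns the number of characters
// written so the effect of NDEBUG is observable.
static inline int demo_core_debug(const char *format, ...)
{
#if !defined(NDEBUG)
    va_list args;
    va_start(args, format);
    int written = std::vfprintf(stderr, format, args);  // pulls in printf machinery
    va_end(args);
    return written;
#else
    (void)format;  // NDEBUG build: nothing is formatted or printed
    return 0;
#endif
}
```

Built without `-DNDEBUG`, a call such as `demo_core_debug("pin=%d\n", 5)` prints to stderr; built with it, the same call does nothing.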

This fix made a significant improvement. For a test sketch that uses some serial communication and printf(), the binary size decreased from 11688 bytes (71%) to 9464 bytes (57%). For the standard Arduino Blink example the improvement was even more impressive: from about 9 kB (with LTO) to 3584 bytes (21%). (By the way, RAM usage for globals is 316 bytes (15%).) EDIT: note that not using printf also saves quite some room...

Nice to see some room for quick wins. It's almost getting usable. If only the rest of the core were slightly more compatible with the original Arduino core... (E.g. why is Serial.available() hardcoded to always return -1?)
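
The percentages quoted here can be reproduced with a small helper. This is only a sketch: the function name `used_percent` is made up for illustration, and the truncating integer division is an assumption inferred from the figures in this thread (for example, 9464 of 16384 bytes is 57.8% but is reported as 57%).

```cpp
#include <cassert>
#include <cstdint>

// Whole-number percentage of a memory region in use, truncated toward zero.
// Assumption: truncation (not rounding) matches the IDE's summary line.
std::uint32_t used_percent(std::uint32_t used_bytes, std::uint32_t capacity_bytes)
{
    assert(capacity_bytes > 0);  // guard against division by zero
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(used_bytes) * 100u) / capacity_bytes);
}
```

With the 16384-byte flash and 2048-byte RAM of the CH32V003, `used_percent(11688, 16384)`, `used_percent(9464, 16384)` and `used_percent(3584, 16384)` correspond to the flash figures above, and `used_percent(316, 2048)` to the RAM figure.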

maxint-rd commented 11 months ago

See also pull request #28

maxint-rd commented 10 months ago

I see in this commit that the boards.txt was updated. Thanks! (I closed my pull request accordingly)

maxint-rd commented 5 months ago

Did some more testing with IDE v2.3.2 and core v1.0.4. It seems the LTO option isn't working anymore (even the standard Blink causes the processor to hang). So -Os is smallest for now.

However, if you don't use serial, disabling the UART module disables debug printf and saves some more flash. To disable it, edit the file /variants/CH32V00x/CH32V003/variant_CH32V003.h and comment out line 17:

    //#define UART_MODULE_ENABLED

This makes sure some larger bits of code are not compiled. Note that when not using SPI, disabling SPI_MODULE_ENABLED doesn't seem to make a difference.

Compiling the standard Blink.ino example for the CH32V003 with -Os and UART disabled, the sketch uses 2684 bytes Flash (16%) and 240 bytes (11%) for globals. Both are quite a bit less than in the previous test with LTO and an earlier core in IDE v1.8.19, so disabling UART may be worthwhile for devices that don't use serial (for instance an I2C slave device on the CH32V003 SOP8).

FYI: the BareMinimum example with empty setup() and loop() required only 1080 bytes Flash (6%) and 224 bytes (10%) for globals. Although as an application it's not very useful, it may be interesting to delve a bit deeper... The BareMinimum.ino.map file shows 1008 bytes in .text. Largest entries are:

    system_ch32yyxx.c:SystemInit [144 bytes]
    startup_ch32yyxx.S:handle_reset [132 bytes]
    libgcc:div [126 bytes]
    libc_nano:lib_a-__atexit:__register_exitproc [122 bytes]
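
To put those four entries in proportion, a short sketch can add them up and compare the total with the 1008 bytes of .text. The struct and function names are illustrative only; the byte counts are the ones listed above.

```cpp
#include <cstddef>
#include <cstdint>

// One row from the list of largest .map entries (sizes as quoted above).
struct MapEntry {
    const char *name;
    std::uint32_t bytes;
};

static const MapEntry kLargestEntries[] = {
    {"SystemInit", 144},
    {"handle_reset", 132},
    {"libgcc div", 126},
    {"__register_exitproc", 122},
};

// Sum the sizes of the given entries.
std::uint32_t sum_bytes(const MapEntry *entries, std::size_t count)
{
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += entries[i].bytes;
    }
    return total;
}
```

The four entries add up to 524 bytes, a little over half of the 1008 bytes in .text, so startup, clock setup, division support and exit handling account for much of the empty sketch.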

Compiling the standard Blink.ino with the no-longer-functioning LTO option resulted in 872 bytes (5%) Flash and 152 bytes (7%) globals. The .map file showed that none of the sketch code was included. Perhaps some clue can be found there...