thseiler / embedded

some open source embedded stuff
130 stars 50 forks source link

Too big? #11

Open jtepmach opened 7 years ago

jtepmach commented 7 years ago

Of course thanks so much for your work on this, it's a hugely useful feature.

I'm in the process of adding basic support for SDHC cards, which seem to be all that is available these days. I have a working version based off the code at https://github.com/mharizanov/uIoT which is an old copy of this repo, and, with some code shuffling, it just squeeks in under 2K. Before my changes, that version compiles to 2006 bytes.

Now, I've forked this project, with the intention of making my changes on the latest code, however, before I even start, I can't get it compiled under 2K - I'm getting 2218 bytes. This is all under the same environment etc (OS X), and I can't see that much has changed other than adding 1280 support (which is in an #ifdef).

There also doesn't appear to be any test that the code size matches the allocated boot size, so it's something to constantly watch.

Any tips what might be bloating my compile of the latest version?

thseiler commented 7 years ago

Hi jtheorent,

Thanks for giving 2boots a try! To save executable size, i used the avr-gcc's intermodule optimisation framework (-combine) option. This means that all C files have to be compiled and linked in a single gcc invocation, but this in turn allows the compiler and linker to arrange the code in clever way and use more relative calls/jumps, which are 2 bytes smaller than absolute calls/jumps, saving lots of space.

Newer avr-gcc versions do no longer support the intermodule optimisation framework (-combine), it was replaced by the more generic linktime optimisation framework (-lto), see https://gcc.gnu.org/gcc-4.6/changes.html

However, when I tried to use -lto with avr-gcc 4.8, i always got binaries larger than 2K. I am still uncertain why the lto is unable to optimize as well as -combine did.

For now, I suggest to try using a avr-gcc version < 4.6 , then you should be able to compile 2boots in under 2K.

Regarding size check: good point. However, I didn't see an easy way to use avr-size from the Makefile. Do you have any pointers?

Cheers, Thomas

jtepmach commented 7 years ago

OK, thanks, I'll play around with different gcc versions/flags.

The main difference in the code itself is that @mharizanov re-wrote the file-matching code, it's smaller, but possibly less functional, I'll check it. Also in the 1280 support pull-request @afranchuk changed the address var used by write_flash_page to a global, which seems to add quite a few bytes to the code.

I've cobbled together both sets of changes, and changed the address var back to a function parameter, but typedef'ed it. I'm not sure if it still works for 1280, but it works fine on a 328p.

It's still a work in progress, but my version now compiles to around 1990 bytes, and supports SDHC cards (but possibly only the one I'm testing with!). It's slightly less functional in that the entire filename (including extension) must be specified in EEPROM, and currently probably no longer supports MMC, but I'll add that back, and #ifdef some of the decisions eg whether to match extension.

I'm not sure about checking size either, I'll take a look.

afranchuk commented 7 years ago

My pull request did not change the address variable scope at all. It did possibly change how large it was, but it was already global scope.

jtepmach commented 7 years ago

Ah yes, apologies @afranchuk, it wasn't in your PR, but address was changed from a function parameter to a static/global var at some point (or possibly vice versa), and the global form seems to produce a larger binary.

thseiler commented 7 years ago

Apologies, the global / local inconsistencies seem to be a finger trouble with a git command on my side.

@jtheorent : I think I figured out the correct flags to use for -flto. I pushed a new "newer-gcc-support-experimental" branch with updated CFLAGS/LDFLAGS. This branch also uses local variables for address, length, and eeprom_flag and compiles without warning on my machine down to 1924 bytes for 2boots-arduino-atmega328-16000000L-PD4.hex (avr-gcc V6.2.0 on fedora 25). Warning: I was not yet able to test this on actual hardware, as my ISP proggr broke down. I pushed it in case you want to give it a try before my replacement proggr arrives.

Cheers, Thomas

jtepmach commented 7 years ago

Nice!, thanks @thseiler. I've merged your updates into my latest, and it seems to work, at least on my hardware.

I've made quite a few code-size changes on my fork, eg rewriting send_cmd to use parameters, rewriting gethexnib, the address functions to use paramters (which you have here already) and using @mharizanov version of the file reader (though it's limited to full 8.3 filenames currently (I'l fix that)), in addition to adding very basic (probably flaky) SD card support.

For 2boots-arduino-atmega168-16000000L-PORTD4.hex I now get 1770 bytes, and this is with SDcard support, but no fallback to MMC support yet - I'll fix that. With just MMC support I get 1736 bytes, though not sure if it works as I don't have any MMC cards. This is using gcc 4.9.2 (as shipped with arduino 1.8.1) on os X 10.11. Using gcc 6.2 increases the sizes by about 20 bytes.