esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

binary size huge difference on 2 separate systems #7209

Closed philbowles closed 4 years ago

philbowles commented 4 years ago

Basic Infos

Platform

Settings in IDE

H4_Wemos_d1mini.name=H4 Optimised Wemos D1 Mini
H4_Wemos_d1mini.build.board=ESP8266_WEMOS_D1MINI
H4_Wemos_d1mini.build.variant=d1_mini
H4_Wemos_d1mini.upload.tool=esptool
H4_Wemos_d1mini.upload.maximum_data_size=81920
H4_Wemos_d1mini.upload.wait_for_upload_port=true
H4_Wemos_d1mini.upload.erase_cmd=
H4_Wemos_d1mini.serial.disableDTR=true
H4_Wemos_d1mini.serial.disableRTS=true
H4_Wemos_d1mini.build.mcu=esp8266
H4_Wemos_d1mini.build.core=esp8266
H4_Wemos_d1mini.build.debug_port=
H4_Wemos_d1mini.build.debug_level= -DNDEBUG
H4_Wemos_d1mini.menu.xtal.80=80 MHz
H4_Wemos_d1mini.menu.xtal.80.build.f_cpu=80000000L
H4_Wemos_d1mini.menu.xtal.160=160 MHz
H4_Wemos_d1mini.menu.xtal.160.build.f_cpu=160000000L
H4_Wemos_d1mini.build.vtable_flags=-DVTABLES_IN_FLASH
H4_Wemos_d1mini.build.exception_flags=-fno-exceptions
H4_Wemos_d1mini.build.sslflags=-DBEARSSL_SSL_BASIC
H4_Wemos_d1mini.upload.resetmethod=--before default_reset --after hard_reset
H4_Wemos_d1mini.build.flash_mode=dio
H4_Wemos_d1mini.build.flash_flags=-DFLASHMODE_DIO
H4_Wemos_d1mini.build.flash_freq=40
H4_Wemos_d1mini.build.flash_size=4M
H4_Wemos_d1mini.build.flash_size_bytes=0x400000
H4_Wemos_d1mini.build.flash_ld=eagle.flash.4m1m.ld
H4_Wemos_d1mini.build.spiffs_pagesize=256
H4_Wemos_d1mini.upload.maximum_size=1044464
H4_Wemos_d1mini.build.rfcal_addr=0x3FC000
H4_Wemos_d1mini.build.spiffs_start=0x300000
H4_Wemos_d1mini.build.spiffs_end=0x3FB000
H4_Wemos_d1mini.build.spiffs_blocksize=8192
H4_Wemos_d1mini.build.lwip_include=lwip2/include
H4_Wemos_d1mini.build.lwip_lib=-llwip2-1460
H4_Wemos_d1mini.build.lwip_flags=-DLWIP_OPEN_SRC -DTCP_MSS=1460 -DLWIP_FEATURES=0 -DLWIP_IPV6=0
H4_Wemos_d1mini.menu.wipe.none=Only Sketch
H4_Wemos_d1mini.menu.wipe.none.upload.erase_cmd=
H4_Wemos_d1mini.menu.wipe.sdk=Sketch + WiFi Settings
H4_Wemos_d1mini.menu.wipe.sdk.upload.erase_cmd=erase_region "{build.rfcal_addr}" 0x4000
H4_Wemos_d1mini.menu.wipe.all=All Flash Contents
H4_Wemos_d1mini.menu.wipe.all.upload.erase_cmd=erase_flash
H4_Wemos_d1mini.upload.speed=115200
H4_Wemos_d1mini.build.float=

Problem Description

I have a repo that compiles to about 430kb on pretty much any ESP8266 target. A user downloaded it and his binary is over 120kb larger: well over 500k (and hence non OTA-able!)

For the record, his original target was nodeMCU v1.0 and all build menu settings appeared identical on both our systems.

Both our compilation outputs show exactly the same libraries all at exactly the same versions, i.e. the only thing I can see different is the final executable size. Our Arduino IDE versions are both 1.8.12 both using core 2.6.3...on the face of it, apart from the OS our environments are the same. We have spent the last 4 or 5 hours uninstalling, reinstalling things on his system, and checking that all build settings are indeed 100% identical, yet nothing we do will reduce his binary below 500k+

I'm running Windows10, but tried also on Win7 with 430kb again. He's running Windows10 pro and oddly also gets 500kb+ using PlatformIO. Other than the OS, it appears that everything else is identical.

The boards.txt settings above were used to guarantee identical settings. His problem started on nodeMCU 1.0 and my comparisons were also done on nodeMCU 1.0, i.e. the actual board / settings chosen don't seem to make much difference: I'm always 420-ish, he's always 540-ish.

What are we missing and/or what is wrong with his setup? What can be adding 120kb to his binary? (PS when posting the comparison photos, I just noticed my heap is 41% and his is 52% ... same question)

My system

image

His system

image

earlephilhower commented 4 years ago

There's not enough for us to do anything about here. Check submodule versions match, LWIP versions match, etc.

If you have both ELF files you can do a xtensa-lx106-elf-objdump.exe -t file.elf on each of them. That will give you a list of each function's size which you can then narrow things down with.
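A minimal Python sketch of the "narrow things down" step, for anyone who would rather not do it by hand. It assumes the objdump -t output has been saved to a text file and that each symbol line has the usual ELF layout of address, flags and section, then a tab, then a hex size and the symbol name. The function names parse_symbol_table and largest_symbols are made up for illustration and are not part of any toolchain.

```python
def parse_symbol_table(text):
    """Parse saved `objdump -t` text into (name, section, size_in_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        # Assumed layout: "<addr> <flags> <section>\t<size-hex> <name>"
        if "\t" not in line:
            continue  # headers and blank lines carry no tab
        left, right = line.split("\t", 1)
        left_fields = left.split()
        right_fields = right.split(None, 1)
        if len(left_fields) < 2 or len(right_fields) < 2:
            continue
        try:
            size = int(right_fields[0], 16)  # sizes are printed in hex
        except ValueError:
            continue
        rows.append((right_fields[1].strip(), left_fields[-1], size))
    return rows


def largest_symbols(rows, section=None, top=20):
    """Return the biggest symbols, optionally restricted to one section."""
    picked = [row for row in rows if section is None or row[1] == section]
    return sorted(picked, key=lambda row: row[2], reverse=True)[:top]
```

Running largest_symbols(rows, ".irom0.text") on each build's listing should give two short lists that are easy to compare side by side.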

philbowles commented 4 years ago

"There's not enough for us to do anything about here" is a tad harsh! :) You have at least given me some ideas. Isn't lwip included in the core build? Given that they both use the same core, I can't see how they would differ...but even if they did, 120k?

Where would I find submodule versions? I will see if my user is "up to" doing the objdump on his copy. PS thanks for such a rapid reply.

earlephilhower commented 4 years ago

Sorry! Didn't mean to be harsh, but a screenshot of 2 IDEs isn't really going to help us.

git submodule status will print the submodule revisions

I would also change the IDE to print verbose compile output. That will let you check that the compiler has the same options (i.e. debug, etc.).

philbowles commented 4 years ago

You weren't harsh, I was being tongue-in-cheek....but you do have the complete build options set too - what else could I provide?

Again the IDE is the same, the core is the same, no-one has tweaked anything under the hood - how could the compile settings be different? (I always compile verbose anyway :) )

If I get the user to zip and mail me his exported .bin file (I'm in France, he's in Jordan), would the objdump work on that? And sorry to be dim (Windows user), but where do I run git submodule status?

philbowles commented 4 years ago

...was just hoping a sudden jump of 120kb might trigger someone's memory. At a +25% increase we are not talking compiler optimisation sizes, are we?

earlephilhower commented 4 years ago

No .bin file, just file.elf. The .bin is the binary image and can't be used to get function sizes/etc as it strips all that out. ELF format has everything you need to see functions, sizes, etc.

You've listed the boards.txt settings, but to see what the compiler is really being fed you need to use the File->Preferences->Show Verbose Output->Compilation which will allow you to cut-n-paste the actual C++ compiler command line which takes all settings from boards.txt/platform.txt/IDE.
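Once both command lines have been pasted out of the verbose output, they can be compared mechanically rather than by eye. A rough Python sketch, assuming each command line is available as one string: it tokenises with shlex and diffs the option sets. The -I include paths are ignored because they contain user-specific directories. The helper names flag_set and diff_flags are hypothetical.

```python
import shlex


def flag_set(command_line):
    """Return the set of compiler options, skipping -I include paths."""
    tokens = shlex.split(command_line)
    return {t for t in tokens if t.startswith("-") and not t.startswith("-I")}


def diff_flags(cmd_a, cmd_b):
    """Return (options only in A, options only in B), each sorted."""
    a, b = flag_set(cmd_a), flag_set(cmd_b)
    return sorted(a - b), sorted(b - a)
```

Two empty lists mean the options match, at which point the difference has to come from somewhere other than the compiler flags.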

If you're running w/the IDE installed versions of the core, then git submodule status won't work, unfortunately.

Nothing rings a bell, sorry, with an increase in code size between two installs with nominally the same options and code sources. There's some code increase if you have floats disabled, but I don't think we're talking 120KB worth (~4K IIRC).

philbowles commented 4 years ago

As I thought, so I/we will have to delve into the tmp directory? Also will get / compare full compiler output to at least see which segment is getting fatter to narrow down the search...it's driving me nuts!

Thanks so far - will keep you informed of any discovery / progress. Fingers crossed.

philbowles commented 4 years ago

Progress. I have the elf of the 500k compile. Never having used objdump before, I'm overwhelmed by the options and jargon - is there an ideal minimum set to produce something that might suggest why the irom0.text section is the one with the problem? size = 507500

d-a-v commented 4 years ago

The output of readelf.exe -a file.elf from both builds needs to be compared (./tools/xtensa-lx106-elf/xtensa-lx106-elf/bin/readelf.exe).

philbowles commented 4 years ago

First sign of light: the "fat" binary has over 2x as many section headers - is he pulling in something twice?

My ELF:

ELF Header:
  Magic:                             7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Tensilica Xtensa Processor
  Version:                           0x1
  Entry point address:               0x401000b8
  Start of program headers:          52 (bytes into file)
  Start of section headers:          10957672 (bytes into file)
  Flags:                             0x300
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         5
  Size of section headers:           40 (bytes)
  Number of section headers:         1006
  Section header string table index: 1005

His ELF

ELF Header:
  Magic:                             7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Tensilica Xtensa Processor
  Version:                           0x1
  Entry point address:               0x401000b8
  Start of program headers:          52 (bytes into file)
  Start of section headers:          13831956 (bytes into file)
  Flags:                             0x300
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         5
  Size of section headers:           40 (bytes)
  Number of section headers:         2125
  Section header string table index: 2124
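The section-header count can also be read straight from the file without readelf. A small Python sketch, assuming a 32-bit little-endian ELF as in the two headers above; the field names follow the standard ELF32 header, and the helper names read_elf32_header and section_count are invented here.

```python
import struct
from collections import namedtuple

# ELF32 file header: 16-byte ident plus 13 little-endian fields, 52 bytes in total.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
Elf32Header = namedtuple(
    "Elf32Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def read_elf32_header(data):
    """Decode the first 52 bytes of a 32-bit little-endian ELF file."""
    if len(data) < ELF32_HEADER.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:  # EI_CLASS == ELFCLASS32, EI_DATA == LSB
        raise ValueError("expected a 32-bit little-endian ELF")
    return Elf32Header._make(ELF32_HEADER.unpack_from(data))


def section_count(path):
    """Return the number of section headers recorded in the ELF header."""
    with open(path, "rb") as elf:
        return read_elf32_header(elf.read(ELF32_HEADER.size)).e_shnum
```

A count that differs this much between two builds of the same source points at extra sections being linked in. It does not say which ones, so the symbol tables still have to be compared.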

philbowles commented 4 years ago

By Jove, I think I might have it, Watson! I think it's a locale thing. I did some work in Excel and here is a selection of the things in his binary that are not in mine. They mostly appear to be related to wchar and punct this and currency that - and he's in Jordan, and I bet his OS is set to Arabic script, and I bet that pulls in a ton of extra double-char routines, tables etc...

IF I'm right, then 120k seems to be too big a price to pay, and should be on/off-able in the IDE!

abs
abs.c
blanks$4247
btowc
btowc.c
c++locale.cc
class_type_info.cc
close
codecvt.cc
codecvt_members.cc
collate_members.cc
ctype.cc
ctype_configure_char.cc
ctype_members.cc
dtoa.c
dyncast.cc
eh_aux_runtime.cc
fpi$2657
fpinan$2693
fstat
gdtoa-gethex.c
gdtoa-hexnan.c
gettzinfo.c
ios.cc
ios_locale.cc
ios-inst.cc
iso_year_adjust
iso_year_adjust
istream-inst.cc
iswalnum
iswalnum.c
iswalpha
iswalpha.c
iswblank
iswblank.c
iswcntrl
iswcntrl.c
iswctype
iswctype.c
iswdigit
iswdigit.c
iswgraph
iswgraph.c
iswlower
iswlower.c
iswprint
iswprint.c
iswpunct
iswpunct.c
iswspace
iswspace.c
iswupper
iswupper.c
iswxdigit
iswxdigit.c
L_shift
labs
labs.c
ldpart.c
locale.cc
locale_buf_C$2184
locale_facets.cc
locale_init.cc
locale-inst.cc
match
mbrtowc
mbrtowc.c
mbstowcs
mbstowcs.c
mbtowc_r.c
messages_members.cc
monetary_members.cc
mprec.c
nanf
nano-vfscanf.c
nano-vfscanf_i.c
num_lines$2185
numeric_members.cc
open
ostream-inst.cc
p05$2692
quorem

read
rshift
s_fpclassify.c
sccl.c
sf_nan.c
si_class_type_info.cc
siscanf
sonoff-h4.ino.cpp
sscanf
sscanf.c
sstream-inst.cc
strcat
strcat.c
strcoll
strcoll.c
streambuf-inst.cc
strftime
strftime.c
strtod
strtod.c
strtof
strtoul
strtoul.c
strxfrm
strxfrm.c
sulp
swprintf
swprintf.c
sysclose.c
sysfstat.c
sysopen.c
sysread.c
time_locale_buf
time_members.cc
timelocal.c
tinfo.cc
tinytens
towlower
towlower.c
towupper
towupper.c
tree.cc
tzinfo
tzlock.c
tzvars.c
ungetc
ungetc.c
unwind-dw2.c
vfwprintf.c
vmi_class_type_info.cc
wcscmp
wcscmp.c
wcscoll
wcscoll.c
wcscpy
wcscpy.c
wcsftime
wcsftime.c
wcslcpy
wcslcpy.c
wcslen
wcslen.c
wcstoul
wcstoul.c
wcsxfrm
wcsxfrm.c
wctob
wctob.c
wctype
wctype.c
wcvt
wlocale-inst.cc
wmemchr
wmemchr.c
wmemcpy
wmemcpy.c
wmemmove
wmemmove.c
wmemset
wmemset.c
wstring-inst.cc
zeroes$4248

earlephilhower commented 4 years ago

objdump -t file.elf will give you a list of functions and sizes (in hex) that you can load into Excel and sort relatively easily.
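If Excel is not to hand, the same comparison can be sketched in a few lines of Python. This assumes each listing has already been reduced to a mapping of symbol name to size in bytes. The function names only_in_second and grown are hypothetical.

```python
def only_in_second(sizes_a, sizes_b):
    """Symbols present in B but absent from A, largest first, plus their total size."""
    extra = {name: size for name, size in sizes_b.items() if name not in sizes_a}
    ranked = sorted(extra.items(), key=lambda item: item[1], reverse=True)
    return ranked, sum(extra.values())


def grown(sizes_a, sizes_b):
    """Symbols present in both builds that are larger in B, with the growth in bytes."""
    deltas = [
        (name, sizes_b[name] - sizes_a[name])
        for name in sizes_a
        if name in sizes_b and sizes_b[name] > sizes_a[name]
    ]
    return sorted(deltas, key=lambda item: item[1], reverse=True)
```

The total returned by only_in_second is a quick sanity check: if it comes out near the size gap between the two binaries, the extra symbols account for most of the difference.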

I would not rely too much on the list of source files used in the build. You can have lots of files in the link that are thrown away in the final binary.

I don't think the current locale has any effect on the Arduino binary. If you're using locale calls in your app, and he's changed a compile-time option somewhere from "French" to "Arabic", well that's not the same binary anymore, is it? :)

philbowles commented 4 years ago

I do nothing in the app with locales - being English I automatically assume everyone else speaks it, so make no allowances :). His binary calls wide character functions that I have never used in my life and never heard of thus - as respectfully as possible since you are the expert, - I tend to disagree with: "I don't think the current locale has any effect on the Arduino binary" - and if I'm wrong then something else is doing it! Since I have seen all of his settings, I know he has not changed / tweaked settings in the IDE or build system, so if you are right, why is his binary pulling in wide char functions when NOTHING in any of my own code has anything to do with them?

I compared the verbose compiler output and the flags and option and settings are 100% identical, everything else IDE + build wise is identical, so it is most definitely something outside of the build system causing it - the only other differences between our configurations is the timezone and possibly the locale, and it ain't the timezone!

Thanks for the -t flag thing, I will try that now and report back.

philbowles commented 4 years ago

His binary pulls in c++locale.cc - mine does not. This is one of the reasons I am disagreeing with you: [image] Also, his pulls in all these wide functions that mine does not. What is causing it if not his system locale / character set? [image]

philbowles commented 4 years ago

...and of course "it's not the same binary" - that's the very reason we are here, but it is the same source, the same libraries, same IDE, same core, same compiler, same build settings - and that is the point - my quest is to find what can be causing such a different binary, given those identical inputs.

earlephilhower commented 4 years ago

If it's locale based, which I still find highly unlikely, you should be able to reproduce it on your local install by setting the locale variables to whatever he had set.

Again, without MCVE code or anything more detailed we're just guessing here and not able to reproduce it or debug, really.

earlephilhower commented 4 years ago

You could also get on Stack Overflow and see if this is a known feature of GCC or its toolchain. Locale is an environment variable (read at runtime, not compile time).

philbowles commented 4 years ago

I appreciate your thoughts on this, but the issues are these: it's a large and complex project [image]. But I will get him to compile a "blinky" and compare sizes to find an MCVE that demonstrates the problem.

Secondly, since I have no experience of machines in other languages, I may be using "locale" wrongly - I take your word for its operation at runtime. What I'm seeking is the compile-time environment "variable" that causes this behaviour. Something is "injecting" wide-character and i18n code into the binary that is not part of the project source - irrespective of which of us is right about the exact mechanism.

Finally, I like the Stack Overflow idea so much that I did it last night :). Will keep you posted. Thanks again for your thoughts so far.

philbowles commented 4 years ago

FWIW, here is an example of the compile-time options (same on both our machines, of course)

Compiling library "h4plugins-0.5.4"
"C:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\tools\\xtensa-lx106-elf-gcc\\2.5.0-4-b40a506/bin/xtensa-lx106-elf-g++" -D__ets__ -DICACHE_FLASH -U__STRICT_ANSI__ "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3/tools/sdk/include" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3/tools/sdk/lwip2/include" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3/tools/sdk/libc/xtensa-lx106-elf/include" "-IC:\\Users\\Hamza\\AppData\\Local\\Temp\\arduino_build_647965/core" -c -w -Os -g -mlongcalls -mtext-section-literals -fno-rtti -falign-functions=4 -std=gnu++11 -MMD -ffunction-sections -fdata-sections -fno-exceptions -DBEARSSL_SSL_BASIC -DNONOSDK22x_190703=1 -DF_CPU=80000000L -DLWIP_OPEN_SRC -DTCP_MSS=1460 -DLWIP_FEATURES=0 -DLWIP_IPV6=0 -DNDEBUG -DARDUINO=10812 -DARDUINO_ESP8266_WEMOS_D1MINI -DARDUINO_ARCH_ESP8266 "-DARDUINO_BOARD=\"ESP8266_WEMOS_D1MINI\"" -DFLASHMODE_DIO -DESP8266 "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\cores\\esp8266" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\variants\\d1_mini" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\h4plugins-0.5.4\\src" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\H4-master\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\SoftwareSerial\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\ESP8266WiFi\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\ESP8266mDNS\\src" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\ESPAsyncTCP-master\\src" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\ESPAsyncUDP-master\\src" 
"-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\DNSServer\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\ArduinoOTA" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\ESPAsyncWebServer\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\Hash\\src" "-IC:\\Users\\Hamza\\Documents\\Arduino\\libraries\\async-mqtt-client-master\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\ESP8266httpUpdate\\src" "-IC:\\Users\\Hamza\\AppData\\Local\\Arduino15\\packages\\esp8266\\hardware\\esp8266\\2.6.3\\libraries\\ESP8266HTTPClient\\src" "C:\\Users\\Hamza\\Documents\\Arduino\\libraries\\h4plugins-0.5.4\\src\\H4P_ExternalSqWave.cpp" -o "C:\\Users\\Hamza\\AppData\\Local\\Temp\\arduino_build_647965\\libraries\\h4plugins-0.5.4\\H4P_ExternalSqWave.cpp.o"

philbowles commented 4 years ago

Got the culprit, although still not the mechanism. We started at an empty sketch, up through blinky and on to my repo using many examples...everything has identical binary sizes on our two systems.

Then we pulled in an example using @me-no-dev's AsyncWebServer, both of us using the simple example. Boom! His binary is 120kb larger, so there's your MCVE.

However, it is now obvious that the AsyncWebServer repo is the place for this question, so I shall thank you once again for your patience and insight - as I never would have got to this point without it - and move the question elsewhere.

Somebody got there before me: https://github.com/me-no-dev/ESPAsyncWebServer/issues/613 - it's a std::regex thing.

earlephilhower commented 4 years ago

Thanks, @philbowles ! Good debug. Looks like MCVEs do make it easier to debug. :)

earlephilhower commented 4 years ago

Also, please note that on the 8266 the g++ std::regex library from 4.8 is broken. You'll need to use the GCC 9.3 PR or wait for a 3.0 version to get a fully working one. @d-a-v has a repo with an alpha version of the release, IIRC, which already contains it if you're not using git.