esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1
15.98k stars 13.34k forks source link

Abort called if declaration of static std::function is moved to .cpp (optimized away?) #6032

Closed philbowles closed 5 years ago

philbowles commented 5 years ago

Basic Infos

Platform

Settings in IDE

Problem Description

Also reported @ https://github.com/arduino/Arduino/issues/8835 as I'm sure this is going to become a game of ping-pong and I don't know where the ownership lies...

Symptom is that working code fails (panic + stack dump) if definition of static std::function is moved to .cpp file. Most of the rest is beyond my pay grade but it seems maybe the variable has been "optimised away" when clearly it is in use!

MCVE Sketch

The Code:

Class .h file

#ifndef FUNCSPOOL_H
#define FUNCSPOOL_H

#include<Arduino.h>
#include <string>
#include <functional>

using namespace std;

class funcSpool{     
      function<void(string)>    fail;
  protected:
      static  function<void(string)>   staticFail;
  public:
            funcSpool(function<void(string)> f=staticFail): fail(f){} 
    virtual ~funcSpool(){}             
      void  run(const string s){ fail(s); }     
};
class fsPassthru: public funcSpool{};
#endif // funcspool_H

the external .cpp file

#include "funcSpool.h"
//When uncommented and thus declared in .cpp file, panic results
function<void(string)> funcSpool::staticFail  =[](string s){ Serial.printf("STATIC FUNCTOR %s\n",s.c_str()); };

The main .ino:

#include "funcSpool.h"
//When uncommented and thus declared HERE, all is correct
//function<void(string)> funcSpool::staticFail  =[](string s){ Serial.printf("STATIC FUNCTOR %s\n",s.c_str()); };

funcSpool*  gsp=new fsPassthru;

void setup() {
  Serial.begin(74880);
  string test="always works locally both ways";

  funcSpool*  lsp=new fsPassthru;
  Serial.printf("LOCAL POINTER %08x\n",(void*)lsp); 
  lsp->run(test);

  Serial.printf("GLOBAL POINTER %08x\n",(void*)gsp); 
  gsp->run(test);
  Serial.printf("EVERYTHING OK\n");
}

void loop(){}

Debug Messages - "Correct" run

From the "correct" run where funcSpool::staticFail is declared within the .ino by uncommenting Includes portions of the .map file

LOCAL POINTER 3ffef154
STATIC FUNCTOR always works locally both ways
GLOBAL POINTER 3ffeefc4
STATIC FUNCTOR always works locally both ways
EVERYTHING OK

Map file extracts - "Correct" run

.literal.startup._GLOBAL__sub_I__ZN9funcSpool10staticFailE
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcspool3.ino.cpp.o
 .literal.exit._GLOBAL__sub_D__ZN9funcSpool10staticFailE
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcspool3.ino.cpp.o
 ...
.text.startup._GLOBAL__sub_I__ZN9funcSpool10staticFailE
                0x40202650       0x40 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcspool3.ino.cpp.o
                                 0x5c (size before relaxing)
 .text.exit._GLOBAL__sub_D__ZN9funcSpool10staticFailE
                0x40202690       0x15 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcspool3.ino.cpp.o
                                 0x1d (size before relaxing)
 *fill*         0x402026a5        0x3 
...
 .bss._ZN9funcSpool10staticFailE
                0x3ffee30c       0x10 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcspool3.ino.cpp.o
                0x3ffee30c                _ZN9funcSpool10staticFailE

Debug Messages - failing run

The .ino declaration is commented out and the .cpp declaration is uncommented:

LOCAL POINTER 3ffef144
STATIC FUNCTOR always works locally both ways
GLOBAL POINTER 3ffeefbc

Abort called

>>>stack>>>

ctx: cont
sp: 3ffffdb0 end: 3fffffc0 offset: 01b0
3fffff60:  3ffe85a6 4020a634 3ffeefbc 402025a2  
3fffff70:  3ffee7e8 3ffef120 40202488 3ffee390  
3fffff80:  3fffdad0 3ffee318 3ffee338 4020267f  
3fffff90:  3ffef120 3ffef120 feefeffe feefeffe  
3fffffa0:  feefeffe 00000000 3ffee360 40202cf4  
3fffffb0:  feefeffe feefeffe 3ffe84f4 401009a5  
<<<stack<<<

Decoding 6 results
0x4020a634: operator new(unsigned int) at /workdir/arena/gcc/xtensa-lx106-elf/libstdc++-v3-nox/libsupc++/../../../../../dl/gcc-xtensa/libstdc++-v3/libsupc++/new_op.cc line 52
0x402025a2: funcSpool::run(std::string) at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2\bits/basic_string.h line 274
:  (inlined by) ?? at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2\bits/basic_string.h line 510
:  (inlined by) ?? at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2/functional line 2464
:  (inlined by) funcSpool::run(std::string) at C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch/funcSpool.h line 17
0x40202488: std::_Function_base::_Base_manager ::_M_manager(std::_Any_data&, std::_Function_base::_Base_manager  const&, std::_Manager_operation) at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2/functional line 1931
0x4020267f: setup at C:\Users\phil\Desktop\funcspool 3\funcspool3/funcspool3.ino line 16
0x40202cf4: loop_wrapper() at C:\Users\phil\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.5.0\cores\esp8266/core_esp8266_main.cpp line 122
0x401009a5: cont_wrapper at C:\Users\phil\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.5.0\cores\esp8266/cont.S line 81

Map file extracts - Failing run

I'm no expert, but "Discarded input sections" doesn't sound promising...

Discarded input sections

 .group         0x00000000       0x18 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
 .literal._ZNSt17_Function_handlerIFvSsEN9funcSpoolUlSsE_EE9_M_invokeERKSt9_Any_dataSs
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
 .literal._ZNSt14_Function_base13_Base_managerIN9funcSpoolUlSsE_EE10_M_managerERSt9_Any_dataRKS4_St18_Manager_operation
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
 .literal.startup._GLOBAL__sub_I__ZN9funcSpool10staticFailE
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
 .literal.exit._GLOBAL__sub_D__ZN9funcSpool10staticFailE
                0x00000000        0x0 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
...
 .text.startup._GLOBAL__sub_I__ZN9funcSpool10staticFailE
                0x402024c0       0x37 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
                                 0x3b (size before relaxing)
 *fill*         0x402024f7        0x1 
 .text.exit._GLOBAL__sub_D__ZN9funcSpool10staticFailE
                0x402024f8       0x15 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
                                 0x1d (size before relaxing)
 *fill*         0x4020250d        0x3 
...

 .bss._ZN9funcSpool10staticFailE
                0x3ffee308       0x10 C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch\funcSpool.cpp.o
                0x3ffee308                _ZN9funcSpool10staticFailE
philbowles commented 5 years ago

I just re-ran it with exceptions enabled:

LOCAL POINTER 3ffefb3c
STATIC FUNCTOR always works locally both ways
GLOBAL POINTER 3ffef9b4

Unhandled C++ exception: bad_function_call

>>>stack>>>

ctx: cont
sp: 3ffffd50 end: 3fffffc0 offset: 01b0
3fffff00:  00000000 00000000 00000000 40204ad8  
3fffff10:  40000000 00000000 00000000 4020c250  
3fffff20:  00000000 00000000 00000000 00000000  
3fffff30:  3fffdad0 3ffee424 3ffefba4 4020c291  
3fffff40:  3ffefba4 4020c990 3ffefba4 4020ca07  
3fffff50:  6000001c 3ffee3c8 3ffef9b4 4020d136  
3fffff60:  3ffe85a6 3ffee3c8 3ffef9b4 402025e8  
3fffff70:  3ffee88c 3ffefb18 402024b0 3ffee424  
3fffff80:  3fffdad0 3ffee3c8 3ffef9b4 4020276e  
3fffff90:  3ffefb18 3ffefb18 3ffefb18 feefeffe  
3fffffa0:  feefeffe 00000000 3ffee3f0 40202f00  
3fffffb0:  feefeffe feefeffe 3ffe84f8 401009a5  
<<<stack<<<
Decoding 12 results
0x40204ad8: _Unwind_RaiseException at /workdir/arena/gcc/xtensa-lx106-elf/libgcc/../../../../dl/gcc-xtensa/libgcc/unwind.inc line 83
0x4020c250: __cxxabiv1::__terminate(void (*)()) at /workdir/arena/gcc/xtensa-lx106-elf/libstdc++-v3/libsupc++/../../../../../dl/gcc-xtensa/libstdc++-v3/libsupc++/eh_terminate.cc line 39
0x4020c291: std::terminate() at ?? line ?
0x4020c990: __gxx_exception_cleanup(_Unwind_Reason_Code, _Unwind_Exception*) at /workdir/arena/gcc/xtensa-lx106-elf/libstdc++-v3/libsupc++/../../../../../dl/gcc-xtensa/libstdc++-v3/libsupc++/eh_throw.cc line 33
0x4020ca07: __cxa_throw at ?? line ?
0x4020d136: std::__throw_bad_function_call() at ?? line ?
0x402025e8: funcSpool::run(std::string) at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2\bits/basic_string.h line 274
:  (inlined by) ?? at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2\bits/basic_string.h line 510
:  (inlined by) ?? at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2/functional line 2464
:  (inlined by) funcSpool::run(std::string) at C:\Users\phil\AppData\Local\Temp\arduino_build_318206\sketch/funcSpool.h line 17
0x402024b0: std::_Function_base::_Base_manager ::_M_manager(std::_Any_data&, std::_Function_base::_Base_manager  const&, std::_Manager_operation) at c:\users\phil\appdata\local\arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\2.5.0-3-20ed2b9\xtensa-lx106-elf\include\c++\4.8.2/functional line 1931
0x4020276e: setup at C:\Users\phil\Desktop\funcspool 3\funcspool3/funcspool3.ino line 16 (discriminator 1)
0x40202f00: loop_wrapper() at C:\Users\phil\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.5.0\cores\esp8266/core_esp8266_main.cpp line 122
0x401009a5: cont_wrapper at C:\Users\phil\AppData\Local\Arduino15\packages\esp8266\hardware\esp8266\2.5.0\cores\esp8266/cont.S line 81
earlephilhower commented 5 years ago

Thanks for being the first person to enable C++ exceptions in anger on the 8266. :)

It's also a bit above my pay grade, but my gut says this is not so much a header/cpp issue as a "when is the variable set by the runtime" issue.

The exception thrown is discussed here: https://en.cppreference.com/w/cpp/utility/functional/bad_function_call

If there's something going on where the startup routines which set BSS to 0 and copy RODATA to ram before initting the C core are calling a constructor before there is valid data somewhere, you can get something like this.

earlephilhower commented 5 years ago

Your code fails on Linux x86 G++ as well. I think you're doing something that's plain illegal in C++, but it's also above my pay grade...

earle@server:/tmp$ g++ --version
g++ (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

earle@server:/tmp$ cat a.cpp
#include <stdio.h>

#ifndef FUNCSPOOL_H
#define FUNCSPOOL_H

#include <string>
#include <functional>

using namespace std;

class funcSpool{     
      function<void(string)>    fail;
  protected:
      static  function<void(string)>   staticFail;
  public:
            funcSpool(function<void(string)> f=staticFail): fail(f){} 
    virtual ~funcSpool(){}             
      void  run(const string s){ fail(s); }     
};
class fsPassthru: public funcSpool{};
#endif // funcspool_H

funcSpool*  gsp=new fsPassthru;
function<void(string)> funcSpool::staticFail  =[](string s){printf("STATIC FUNCTOR %s\n",s.c_str()); };

int main() {
  string test="always works locally both ways";

  funcSpool*  lsp=new fsPassthru;
  printf("LOCAL POINTER %08x\n",(void*)lsp);
  lsp->run(test);

  printf("GLOBAL POINTER %08x\n",(void*)gsp);
  gsp->run(test);
  printf("EVERYTHING OK\n");
  return 0;
}
earle@server:/tmp$ g++ -Wall -o a a.cpp
a.cpp: In function ‘int main()’:
a.cpp:31:43: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘void*’ [-Wformat=]
   printf("LOCAL POINTER %08x\n",(void*)lsp);
                                 ~~~~~~~~~~^
a.cpp:34:44: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘void*’ [-Wformat=]
   printf("GLOBAL POINTER %08x\n",(void*)gsp);
                                  ~~~~~~~~~~^
earle@server:/tmp$ ./a
LOCAL POINTER 6d68bed0
STATIC FUNCTOR always works locally both ways
GLOBAL POINTER 6d68be70
terminate called after throwing an instance of 'std::bad_function_call'
  what():  bad_function_call
Aborted
devyte commented 5 years ago

But not above mine 😄 The explanation has to do with the order of initialization of global objects. The case shown by @earlephilhower shows it (luckily):

  1. The class constructors are called first, with no arg
  2. The constructor arg is assigned the default, i.e. the staticFail member
  3. The class member fail is then inited with the constructor arg
  4. The staticFail member is then constructed AFTER all of the above

That means that the class member fail received a function that in the best case has a nullptr as its target and in the worst case has garbage. Then when you call run(), the fail member is invoked, but since it has no target it throws.

This would work:

function<void(string)> funcSpool::staticFail  =[](string s){printf("STATIC FUNCTOR %s\n",s.c_str()); };
funcSpool*  gsp=new fsPassthru;

When using multiple TUs (translation units, as in multiple .cpp files that result in .o files that get linked together), the order of init is guaranteed by the order of the code used within one TU. However, the order in which the TUs are linked together is non-deterministic, so the init of the .ino globals could come first, or the init of the external.cpp globals could come first. Or it could be one order, then you add code and the order changes.

In the case of @philbowles "correct" run, the staticFail member is constructed globally within the .ino, which means that within the .ino TU its construction is guaranteed to happen before the class construction with new, because that is further down. Therefore, it works.

In the case of @philbowles "failing" run, the staticFail member is constructed globally within the .cpp, which means that it is in a different TU from where it is used: the .ino. Therefore, depending on the link order, if the TUs get linked in one order it will work, or in the inverse order it will throw exactly as happened.

Bottom line: YOU SHOULD NEVER EVER HAVE INIT OF ONE GLOBAL DEPEND ON ANOTHER GLOBAL IN A DIFFERENT TU. EVER.

Now, side notes:

  1. For the love of my sanity, and probably most C++ programmers out there, don't import the entire std namespace into the global namespace. Instead, import what you need into the respective class, something like this:
    class funcSpool{
    using string = std::string;
    using functional = std::functional;
    function<void(string)>    fail;
    ...
  2. Your setup() is using new on a ptr declared locally, but then it's never deleted => mem leak
  3. Polymorphism should be avoided on such small platforms like the ESP, because it generates overhead on the scarcest of all resources: heap. Most likely you can use std::function directly, or a template method or template objects. Zero-overhead, or as close as you can get to it, is important here.
philbowles commented 5 years ago

Thanks guys. In response, @devyte 2. I'm aware of the theoretical mem leak. its an MCVE, its not how my real code is wrtten. It happens once, cannot repeat and had an appropriate liftime for its puprpose. i know you are trying to prevent bad habits but you are pushing at an open door on this one...:) My code is already big and complex. It relies absolutely on polymorphism, but again, I understand your point. I spend a lot of time guarding the heap. Finally, how would you best advise "healthy" code mangement of multiple TU? I am trying to keep classes in separate TU. Im not sure this even happens in my real code...i wrote a quick test harness for this new class, and having finalised functionality in a monolithic ino i refactored into its own cpp/h then spotted this. I shall re-examine my initialisation strategy.

devyte commented 5 years ago

You implement the classes in their own files, that is good practice. Don't change that. In this case, the easiest fix would be to create your object with new in setup() instead of as a global call. Not very good practice, though. Your design problem is that the construction of your object assumes that the arg default is already available. Me, I wouldn't use the static member, and I wouldn't use the default arg thing. Instead, I would use two constructor overloads, one with arg and one without arg. One inits the fail member from the arg, the other with your lambda.

matthijskooijman commented 5 years ago

One thing that I believe could also help in some cases (I think, though not here)is to make your staticFail member constexpr. AFAIU, consexpr variables are initialized first (since their value is simply calculated at compiletime).

However, I'm pretty sure std::function does not have (and probably can never have) a constexpr constructor. Also, lambdas cannot but used as part of constexpr initialization until C++17. So this probably does not help this particular case, but if you would replace your std::function with a regular function pointer, I think this could solve your ordering problem (but then you probably loose other functionality...)