platformio / platform-espressif8266

Espressif 8266: development platform for PlatformIO
https://registry.platformio.org/platforms/platformio/espressif8266
Apache License 2.0

Builds made with PIO 4 sometimes show issues with WiFi #166

Open TD-er opened 4 years ago

TD-er commented 4 years ago

As already mentioned in this thread: https://github.com/esp8266/Arduino/issues/6172#issuecomment-513408516 At the moment it is a Russian Roulette whether or not the node is capable of connecting to WiFi.

Changing something totally unrelated (adding some plugins to an ESPeasy build, for example, or changing a debug string) may lead to completely unusable nodes. They sometimes need over 100 crash-reboots to get connected to WiFi. Build another configuration and you may be lucky enough to have it working.

I did overcome my "dislike" of Arduino IDE for these tests and tried a lot of builds in Arduino IDE and they all work just fine.

This has happened before (around March to May 2018), and the solution then appeared to be building everything twice. But now even that doesn't matter, meaning the issue is rather deterministic but really hard to reproduce with a very small test program. I've spent lots and lots of hours over the last week trying to make sense of all this, but the only conclusion I reached is that this is something weird in PlatformIO, and it looks like it may have started (or become worse) with PIO 4.0.

These non working builds show various forms of non-operation:

And some others. This all has the 'smell' of linker issues, maybe combined with data corruption somehow. So this may still be a programming error in my code, or in the Arduino core libraries, which manifests itself through something PIO does differently from the Arduino IDE.

TD-er commented 4 years ago

Hmm, I have been testing a lot and now I do have some relatively small build (still too much to post here) which also refuses to connect to WiFi at the first attempt, but is built with Arduino IDE.

It looks like there is something wrong which does manifest itself more often using PIO than Arduino IDE.

Just to be sure, since it does often result in WiFi connect issues.

ivankravets commented 4 years ago

The major change in PIO Core 4.0 is "build_dir". It is now located in .pio/build. If you use an old dev/platform, see https://github.com/platformio/platform-espressif8266/releases/tag/v1.5.0

pfeerick commented 4 years ago

@TD-er Since esp8266/Arduino/issues/6172 got closed... did this get resolved?

TD-er commented 4 years ago

@pfeerick I think it can be closed now. The main things that may have caused build issues in the past:

umitech commented 4 years ago

@TD-er I have exactly the same problem here. Can you say what you found in your code which was helpful here? I'm opening this issue again because even after updating to pio-esp8266-2.3.2 today, it still exists.

I have had this problem for a while. It goes away after some changes in my code, after changing the version of an Arduino lib, or after changing the pio esp8266 platform version (1.8.0 - 2.2.3), but it returns after one or more changes in another part of the project. I have tried lots of things suggested by the community, with no complete success. Sometimes even adding a simple integer variable brings the problem back, and commenting that line out solves it. It reminds me of some kind of memory conflict or linking problem or something similar. I couldn't find anything in my code after lots of digging.

Using lwIP 1.4 improves the situation and reduces how often the problem occurs, but still doesn't solve it completely.

The project is too big to post here, but for clarification, the main libs I'm using are ArduinoJson, AsyncMqttClient and ESPAsyncTCP.

TD-er commented 4 years ago

We also use ArduinoJSON, but not the Async libraries you mentioned.

One of the big changes I made in my project is moving a lot of .ino stuff into .h/.cpp files. Maybe this makes the build process more predictable in the order in which parts of the code get compiled, which makes it very hard to reproduce a "bad build" in my code.

umitech commented 4 years ago

Here all the code is in .h/.cpp files. We just have a main.ino file which holds an instance of a manager class. So I don't think that's the problem here. I hope we can find the source of this problem and end this nightmare forever.

ascillato commented 4 years ago

Another thing to take into account is not using empty parentheses when writing a function definition like

Byte myfunction()

It is recommended and better for the compiler to follow the standard as:

Byte myfunction(void)

ascillato commented 4 years ago

When compiling Tasmota, I had those issues but after we made that silly change, I never had those wifi issues again.

TD-er commented 4 years ago

Another thing to take into account is not using empty parentheses when writing a function definition like

Byte myfunction()

It is recommended and better for the compiler to follow the standard as:

Byte myfunction(void)

Why is it recommended to do this (in C++)? Do you have a link with more info on this?

umitech commented 4 years ago

As far as I remember from old lessons, that's not a problem in C++. The unspecified argument list is an issue in standard C, not in C++. Besides, if it were a problem, you would have to change not only your own code but also a lot of the libraries and even the Arduino core.
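The C++ side of this claim can be checked at compile time. The following is a minimal sketch, assuming `Byte` is an alias for an 8-bit unsigned integer; the function names are hypothetical and not taken from Tasmota or ESPeasy.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumption: Byte stands in for the type used in the snippets above.
using Byte = std::uint8_t;

// Two definitions that differ only in how the empty parameter list is spelled.
Byte read_with_empty_parens() { return 1; }
Byte read_with_explicit_void(void) { return 1; }

// In C++ both spellings name the same function type, so this holds at compile time.
static_assert(std::is_same<decltype(read_with_empty_parens),
                           decltype(read_with_explicit_void)>::value,
              "() and (void) declare the same function type in C++");

// Small runtime check: both functions are callable without arguments
// and return the same value.
void check_both_spellings() {
    assert(read_with_empty_parens() == read_with_explicit_void());
}
```

If the two spellings produced different function types in C++, the `static_assert` would stop the build. This says nothing about C sources in the same project, which follow C rules.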

ascillato commented 4 years ago

The recommendations are one thing; what the compiler does is another. Please try that in your code + libraries to see if it solves your wifi issues. It is just a search and replace in PIO. Very fast and easy to perform.

TD-er commented 4 years ago

Well, I am not sure how fast this replace will be, since you want the void in the function declaration and not in the function call.

TD-er commented 4 years ago

OK, I've read a bit about this and apparently in C there is a difference between void functionname(); and void functionname(void);. One of the more elaborate explanations I found: StackOverflow - Is it better to use C void arguments “void foo(void)” or not “void foo()”?

So wherever we use something wrapped in extern "C" {...} it could be interpreted differently when void is not explicitly mentioned in the function declaration.
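A note on the extern "C" case: in a C++ translation unit, extern "C" changes the linkage of a declaration, not the meaning of an empty parameter list. The difference only appears when the same header is compiled as C, where (before C23) `()` leaves the parameters unspecified. A sketch of the usual shared-header pattern, with hypothetical function names:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical header content shared by C and C++ sources.
#ifdef __cplusplus
extern "C" {
#endif

int sensor_read();       // C before C23: parameters unspecified; C++: no parameters
int sensor_reset(void);  // C and C++: no parameters

#ifdef __cplusplus
}
#endif

// Stub definitions so the example links on its own.
extern "C" int sensor_read() { return 42; }
extern "C" int sensor_reset(void) { return 0; }

// Seen from C++, both declarations have the same type even inside extern "C".
static_assert(std::is_same<decltype(sensor_read), decltype(sensor_reset)>::value,
              "extern \"C\" does not change what () means in C++");

// Both stubs are called without arguments.
void check_sensor_stubs() {
    assert(sensor_read() == 42);
    assert(sensor_reset() == 0);
}
```

So writing `(void)` in headers that are also included from .c files removes an ambiguity for the C compiler, while for code compiled only as C++ the two forms are equivalent.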

@ascillato What did you change? Only your own code, or also the libraries, or also the ESP Arduino library?

ascillato commented 4 years ago

Well I am not sure how fast this replace will be

What we did:

After 1 second, it is replaced in all code (Tasmota) and its libraries.

After this innocent change, there were no more wifi issues in the code.

Compiling under arduino IDE, this problem is not there. But under platformio it is.

TD-er commented 4 years ago

After 1 second, it is replaced in all code (Tasmota) and its libraries.

I get that, but then you also have explicitly added (void) to all function calls. Does that have any effect?

ascillato commented 4 years ago

No effect (no more or less code), no side effects (like functions not working). Just correct alignment in the compiled binary.

ascillato commented 4 years ago

@umitech

Please, give it a try and tell us if this solves your issue.

TD-er commented 4 years ago

By the way, don't forget to use a file filter, like this: [image]

TD-er commented 4 years ago

I just let Travis do a test build, and it does matter if you use (void) in the function call itself. For example, c_str() really does not like to be called as c_str(void). And that's just a single example.
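The reason a blind replace breaks the build is that `(void)` is only valid in a parameter list, not in an argument list. A small sketch of the distinction; the function name and the hostname string are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Definition: (void) is legal here because these parentheses hold a parameter list.
std::size_t hostname_length(void) {
    const std::string hostname = "espeasy-node";  // hypothetical value

    // Call site: these parentheses hold an argument list, so they must stay empty.
    const char* raw = hostname.c_str();

    // A blind search and replace would produce the following line, which does
    // not compile because void is a type, not an expression:
    // const char* broken = hostname.c_str(void);

    return std::strlen(raw);
}

// The function itself is also called with empty parentheses.
void check_hostname_length() {
    assert(hostname_length() == 12);
}
```

Any search and replace therefore has to be restricted to declarations and definitions, which a plain text pattern like `()` cannot tell apart from calls.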

ascillato commented 4 years ago

For example c_str() does really not like to be called with c_str(void) And that's just a single example.

so, it is not going to be a fast change :/

TD-er commented 4 years ago

Nope, since my code base apparently has over 10'000 changes of () into (void).

Isn't there a compiler flag to let the compiler find all instances in the code where this may be an issue?
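For reference, GCC has a `-Wstrict-prototypes` warning, but it is documented as a C and Objective-C option only, so it can only help with the .c files in a build. Whether and how it can be wired into a given PlatformIO project is not covered here. A sketch of what it would flag, using a hypothetical file and function names:

```cpp
// legacy_io.c (hypothetical file name). The text below is valid C and valid C++.
//
// Compiled as C:   gcc -Wstrict-prototypes -c legacy_io.c
//   GCC warns that the legacy_status declaration "isn't a prototype".
// Compiled as C++: the option does not apply, and there is nothing to flag
//   because () already means "no parameters".
#include <assert.h>

int legacy_status();      /* reported under -Wstrict-prototypes when compiled as C */
int modern_status(void);  /* never reported */

int legacy_status() { return 0; }
int modern_status(void) { return 0; }

/* Both functions behave identically at run time. */
void check_statuses(void) {
    assert(legacy_status() == modern_status());
}
```

Since the warning never fires for C++ sources, it would not list the thousands of `()` occurrences in .cpp and .ino files mentioned above.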

ascillato commented 4 years ago

I don't know. There should be one, since the Arduino IDE can compile it correctly with or without (void).

umitech commented 4 years ago

It's a painful change, as we have lots of internal and external libs with thousands of () in definitions, declarations, and references in our project. But if we really want to do this test, we should change lots of ESP8266 core libs as well. That includes WiFi, Http, FS and so on, since you can find lots of empty declarations there too; otherwise this test will not be useful.

Another thing is that you said that your problem didn't get solved in platformio. Every time I thought this problem was solved by compiling in the Arduino IDE, it came back after some more code changes. But I can confirm that we seem to see this problem less often when we compile with the Arduino IDE (which is itself strange enough).