OLIMEX / ESP32-EVB

ESP32 WiFi / BLE Development board with Ethernet interface, Relays, microSD card
Apache License 2.0

External power supply via barrel connector not stable #57

Closed fredlcore closed 3 months ago

fredlcore commented 5 months ago

I'm maintaining a hardware project (https://www.bsb-lan.de) where probably half of the installations run on an Olimex (either EVB or POE). About a dozen users using the official Olimex power supply have now reported back to me that when they use WiFi on the Olimex, the system becomes unstable which shows in disconnects or slow WiFi connections. The matter immediately changes when powering the Olimex via a stable USB power supply (a laptop's USB for example) and the issues are gone.

I'm not sure whether this is an (external) power supply or an (internal) voltage shifting issue. At least it seems that it's not the ampere rating of the power supply, because similar problems have also occurred with cheap USB power supplies rated at 2-3A. With these, it rather seems that the voltage level is not stable enough.

We will now add a recommendation to only power the Olimex via a reliable USB power supply, but since an external power supply is often a good choice, I wanted to let you know about this problem, so that hopefully you can fix it in the next hardware revision.

DanKoloff commented 5 months ago

Can you clarify which is the "official Olimex power supply"? We have at least 5 different 5V DC adapters, with current ratings ranging from 1A to 4A. Maybe the power jacks got worn out and no longer make good contact; it happens here.

Alternatively, maybe those boards had their power jack circuit damaged somehow and now only the USB circuit works fine.

fredlcore commented 5 months ago

Sorry for not being more specific. Since I first reported this issue several months ago on a more general basis (https://github.com/OLIMEX/ESP32-EVB/issues/54), our project has recommended the SY1005E. But I would have to check whether these users actually have this or another power supply. As for a damaged jack, I think this is rather unlikely, because people usually buy these boards new just for this project. If the jack were broken, the board should not power up at all; the fact that it runs but is unstable suggests to me that the voltage is not being supplied in a stable way. We have also experienced this with cheap USB power supplies, which is why we tell users to check whether the problem persists when they power the board via USB from their laptop. So far, this has always resulted in stable performance.

DanKoloff commented 5 months ago

We've manufactured a lot of ESP32 boards and variants and we've sold a lot of these boards to a lot of customers. Truth be told, this is the first time we hear that the board's WiFi gets disconnected because of a shaky power supply. I think it might be something related to your specific hardware or software implementation, but we lack the details to pinpoint where exactly.

Do all users have the BSB header board attached to them? Can you give us some info about that board? What is expected current draw, which GPIOs it uses, can it affect bootstrap pins, etc?

How do you apply SY1005E to the ESP32-POE (it has no power jack)? Can you also specify exactly how many complaints are for ESP32-POE and how many for ESP32-EVB (or maybe other ESP32 boards like ESP32-GATEWAY or ESP32-POE-ISO) since the designs are quite different.

fredlcore commented 5 months ago

Of course I'm not ruling out that this might be a case where correlation is not causation, and I'm perfectly fine if you decide not to pursue this any further, but if this matter comes up again, I'm of course ready to assist in any way I can. You guys make awesome boards of great quality, which is why I have mainly chosen the EVB for my project and recommend that people buy directly from you because of your outstanding service.

The BSB board itself is not much more than a level shifter using two optocouplers for sending and receiving data. It uses two GPIOs (17 and 36) for serial communication plus 3V3 and GND. Other than that (and a few resistors) there is nothing that draws power. Also, the issues do not occur when using Ethernet, only when WiFi is activated. So we assumed that the problems occurred as power consumption increased, and this was reproducible both with "weak" USB power supplies and with power supplies connected to the barrel connector.

What we seem to have figured out is that it's probably the lack of a stable voltage rather than too few amps that cause the problem, which is why connecting the EVB to the USB port of a laptop always runs stable and without any problems. This got me to the conclusion that it is not software-related or related to any of the hardware we're using, but related to the power supply.

Regarding the POE, there were a few users who reported that using POE resulted in the same kind of problems we experienced with the barrel-jack power supplies and the "weak" USB power supplies. I raised this as a separate issue because it is not related to the barrel-jack, but still shows the same kind of effects. I'm sorry for copy-pasting the bug description from here without making the relevant changes.

All in all, I would say that maybe a dozen users in total have described these issues when using the barrel jack. However, most of these did not necessarily use your SY1005E power supply, which we have recommended since December last year. Now that I've had another user who is using your power supply, and another one who has these problems using POE, my lay-person guess would be that some kind of power conversion on the board could be the issue, one that does not affect the USB port. But this is just me guessing.

For us it's not a problem in the end, since we can just recommend that people use a high-quality USB power supply, and then there is no issue anymore (as seems to be the case for the vast majority of users). I was just thinking that when you do your next board revision, it might be worth taking a new look at these power supply lines.

In any case thank you for your time and keep up the great work.

TsvetanUsunov commented 5 months ago

Is it possible to send us a "non working" setup for analysis, so that we can reproduce this situation in our lab with all the additional components attached?

fredlcore commented 5 months ago

I'll check with the users that are having this problem and will get back to you.

horendus commented 4 months ago

Hi, just thought I would chime in on this one. I have developed a product that runs on the ESP32-EVB board and have also seen instability reported by a few customers when using the barrel jack. I'm using Ethernet as the primary connection method.

My logging shows this as random unexplained reboots that go away when using quality USB connection.

fredlcore commented 4 months ago

Yes, random reboots were also reported to me, going away when switching to USB power.

TsvetanUsunov commented 4 months ago

looks to me like problematic power supply cable

fredlcore commented 4 months ago

I'm sorry, but the users that had reported the instability have not responded to my e-mails asking whether they could provide you with the Olimex devices and power supplies that they are using. If you are convinced that it cannot be a hardware issue, then that's fine with me, even though I don't know on what you are basing your assumption that it's a problematic power supply cable.

@horendus: Could you try it out with a different barrel jack power supply? If the instability can be reproduced by using different power supplies, I think it would make a strong(er) case that the issue is not (entirely) related to problematic power supplies.

TsvetanUsunov commented 4 months ago

I would measure +5V rail on the board on the anode of PWRLED1 or the RELAY coil when powered with USB and barrel power jack and see if there is difference.

fredlcore commented 4 months ago

@horendus: Is this something you could check?

osro commented 3 months ago

I can confirm this issue. I recently purchased an Olimex ESP32 EVB with an included power supply. When using the Olimex power supply, the board connects to the network but remains unresponsive. However, when I switch to a USB power supply, everything functions properly.

fredlcore commented 3 months ago

Thanks for the feedback, @osro, since @TsvetanUsunov asked above whether one of the users of my project could send him a faulty device, maybe the two of you could work something out?

DanKoloff commented 3 months ago

> I can confirm this issue. I recently purchased an Olimex ESP32 EVB with an included power supply. When using the Olimex power supply, the board connects to the network but remains unresponsive. However, when I switch to a USB power supply, everything functions properly.

I can test empirically here with a few ESP32-EVB boards from the shop and Olimex power supplies. If we can replicate the problem here, we can work to resolve it. I have a few questions to be certain we are using the same hardware:

  1. Which Olimex power supply exactly?
  2. Is it the base ESP32-EVB board (not one of the variants)?
  3. What is the hardware revision printed on the board?
  4. If possible also share the code that you use. If it is proprietary, maybe simplify it (but confirm the issue remains with the new code).

osro commented 3 months ago

  1. This is the one that Olimex.com suggested to me: [image]
  2. Yes, ESP32-EVB with BSB header board.
  3. K1
  4. And the software that I'm using is the BSB-LAN by @fredlcore

fredlcore commented 3 months ago

Thanks, @osro!
Installation of my code is easy: Download the latest zip from the master repository, rename two files (BSB_LAN_custom_defs.h.default to BSB_LAN_custom_defs.h as well as BSB_LAN_config.h.default to BSB_LAN_config.h) and enter the WiFi credentials in BSB_LAN_config.h (and set #define LANG to EN if you don't speak German ;) ) . The sketch should compile and a webinterface should come up at the IP address. The adapter that my project provides is not necessary to run the software (except for a few errors it will print on serial monitor).

osro commented 3 months ago

The USB power supply with which it works properly outputs 5V, 2.5A.

fredlcore commented 3 months ago

One thing comes to my mind now that I see the specs of the power supply: We had similar problems with USB-based power supplies which had enough amps, but eventually, the output voltage was slightly below 5V. Is there some kind of voltage regulation taking place that may result in less than 5V reaching the ESP32 when using the barrel jack? Or is the recommended power supply maybe not providing stable 5V in the first place?

DanKoloff commented 3 months ago

Alright I will need some time to test it. Not very familiar with the software.

Well, it can be anything, these are good power supplies but who knows. I have more powerful ones so if the issue is in the adapter it will be easy to figure out with empirical testing by simply using more powerful ones (with bigger wattage). We have this 3A one: https://www.olimex.com/Products/Power/SY1505E/ and I also have multiple variable PSUs. The biggest goal ahead of me is to have same failure as you, once I have the same problem we can find why it happens.

DanKoloff commented 3 months ago

@fredlcore Do I have to set uint8_t network_type = LAN; to WLAN for WIFI? Can I use DNS setting?

fredlcore commented 3 months ago

Yes, please set it to WLAN, and you can configure all network settings as you like, just set useDHCP to off then.

fredlcore commented 3 months ago

As for the more powerful PSUs: We had one user using a powerful PSU (at least based on ratings) via USB, but the network was still shaky. Measuring the output voltage showed slightly below 5V (4.8/4.9V), and this seems to have been the reason for the instability.

DanKoloff commented 3 months ago

What is the expected maximum current draw of the whole system ESP32 board + your shield? Or at least guess or approximation?

@osro Do you have the BSB shield attached when the issue appears, and does the ESP32-EVB on its own experience the issue too?

DanKoloff commented 3 months ago

Alright, I managed to download the software and I see the web page. Everything seems to work fine. Now I have to make it stop. That might be harder without your shield. I think if it is related to power draw, then without your shield the board will not draw enough to hang. But we will see.

fredlcore commented 3 months ago

This is the schematics of the adapter board: bsb_adapter.pdf It might look a bit confusing at first because it includes both connections for the Arduino Due (which needs the EEPROM at the bottom) as well as the ESP32, but for the ESP32/Olimex the only lines used are 08/TX1, 10/RX1 as well as 3V3 and GND. RX1 and TX1 both go to one of the two optocouplers which transmit the impulses of the heating system bus. I haven't calculated the amount of energy that each pin draws, but as you can see from the resistors, it is really negligible compared to the consumption of other kinds of devices.

Again, I don't think it's a power draw issue, but a matter of enough current and/or voltage stability. Why would the problem go away when switching from the barrel connector to USB if it was related to my board? And what about the observation of @horendus, who is not a BSB-LAN user?

DanKoloff commented 3 months ago

Good news - it also hangs here in the same scenario, where I use the PSU on the power jack. The same external PSU as in the picture above (Sunny PSU 5V * 1.2A) works fine if I apply it to the USB via a custom made cable (power jack to micro USB adapter cable). It is likely either something in the differences between the power jack power circuit and the USB power circuit, or there is a slight chance it can be something in the code.

Now I will check with few different power supplies and voltages, do some measurements and then will try with different code.

fredlcore commented 3 months ago

That's great news, thanks for making this effort! Not sure if the code has much to do with it, because why should the code be affected whether the board is powered via USB vis-a-vis the barrel connector, but this is what you can try to reduce the code to a more minimal version:

That should leave you with a skimmed-down loop() function that would only print the main website when you're calling the IP address. It might still do the version check and thus call our external server unless you set that variable to false in the config file.

fredlcore commented 3 months ago

Do I read the schematics of the EVB right in the sense that only power from the barrel connector goes through the 1N5822 Schottky diode? This one has a forward voltage drop of 0.3V to 0.525V according to the specs. That would correlate with the same behavior observed when using a USB power supply of lower quality that does not provide stable 5V output.
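To put rough numbers on that question, here is a minimal sketch of the arithmetic, assuming a supply that really delivers 5.0 V and using the 0.3 V to 0.525 V forward drop quoted above. The helper name railAfterDiode is made up for illustration and does not come from the Olimex schematics or from BSB-LAN.

```cpp
#include <cassert>

// Voltage left on the board side of a series diode.
// supply_v: what the power supply actually delivers (assumed, not measured)
// drop_v:   diode forward voltage, e.g. the 1N5822 figures quoted above
double railAfterDiode(double supply_v, double drop_v) {
    assert(drop_v >= 0.0);  // a forward drop is never negative
    return supply_v - drop_v;
}
```

For an ideal 5.0 V supply this works out to roughly 4.475 V to 4.7 V after the diode, which is below the 4.8/4.9 V measured on the weak USB supplies mentioned earlier in the thread.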

DanKoloff commented 3 months ago

Yeah diode is a difference between power jack and USB and has some voltage drop but I am not sure if that is the root of the problem.

Now I tested with some different software and it doesn't hang. I also pinged the IP address to check for disconnects, but there were none. On the same ESP32-EVB I connected an LED to GPIO4 at the UEXT and used this SimpleWiFiServer code (I just changed the GPIO to GPIO4 since that is where I've attached the LED):

/*
 WiFi Web Server LED Blink

 A simple web server that lets you blink an LED via the web.
 This sketch will print the IP address of your WiFi Shield (once connected)
 to the Serial monitor. From there, you can open that address in a web browser
 to turn on and off the LED on pin 5.

 If the IP address of your shield is yourAddress:
 http://yourAddress/H turns the LED on
 http://yourAddress/L turns it off

 This example is written for a network using WPA2 encryption. For insecure
 WEP or WPA, change the Wifi.begin() call and use Wifi.setMinSecurity() accordingly.

 Circuit:
 * WiFi shield attached
 * LED attached to pin 5

 created for arduino 25 Nov 2012
 by Tom Igoe

 ported for sparkfun esp32
 31.01.2017 by Jan Hendrik Berlin
*/

#include <WiFi.h>

const char *ssid = "Olimex";
const char *password = "Hardware";

NetworkServer server(80);

void setup() {
  Serial.begin(115200);
  pinMode(4, OUTPUT);  // set the LED pin mode

  delay(10);

  // We start by connecting to a WiFi network

  Serial.println();
  Serial.println();
  Serial.print("Connecting to ");
  Serial.println(ssid);

  WiFi.begin(ssid, password);

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }

  Serial.println("");
  Serial.println("WiFi connected.");
  Serial.println("IP address: ");
  Serial.println(WiFi.localIP());

  server.begin();
}

void loop() {
  NetworkClient client = server.accept();  // listen for incoming clients

  if (client) {                     // if you get a client,
    Serial.println("New Client.");  // print a message out the serial port
    String currentLine = "";        // make a String to hold incoming data from the client
    while (client.connected()) {    // loop while the client's connected
      if (client.available()) {     // if there's bytes to read from the client,
        char c = client.read();     // read a byte, then
        Serial.write(c);            // print it out the serial monitor
        if (c == '\n') {            // if the byte is a newline character

          // if the current line is blank, you got two newline characters in a row.
          // that's the end of the client HTTP request, so send a response:
          if (currentLine.length() == 0) {
            // HTTP headers always start with a response code (e.g. HTTP/1.1 200 OK)
            // and a content-type so the client knows what's coming, then a blank line:
            client.println("HTTP/1.1 200 OK");
            client.println("Content-type:text/html");
            client.println();

            // the content of the HTTP response follows the header:
            client.print("Click <a href=\"/H\">here</a> to turn the LED on pin 4 on.<br>");
            client.print("Click <a href=\"/L\">here</a> to turn the LED on pin 4 off.<br>");

            // The HTTP response ends with another blank line:
            client.println();
            // break out of the while loop:
            break;
          } else {  // if you got a newline, then clear currentLine:
            currentLine = "";
          }
        } else if (c != '\r') {  // if you got anything else but a carriage return character,
          currentLine += c;      // add it to the end of the currentLine
        }

        // Check to see if the client request was "GET /H" or "GET /L":
        if (currentLine.endsWith("GET /H")) {
          digitalWrite(4, HIGH);  // GET /H turns the LED on
        }
        if (currentLine.endsWith("GET /L")) {
          digitalWrite(4, LOW);  // GET /L turns the LED off
        }
      }
    }
    // close the connection:
    client.stop();
    Serial.println("Client Disconnected.");
  }
}

I will continue tomorrow.

DanKoloff commented 3 months ago

> Do I read the schematics of the EVB right in the sense that only power from the barrel connector goes through the 1N5822 Schottky diode? This one has a forward voltage drop of 0.3V to 0.525V according to the specs. That would correlate with the same behavior observed when using a USB power supply of lower quality that does not provide stable 5V output.

Thing is, that doesn't matter for the ESP32 operation (and it is either the WiFi or the ESP32 that hangs). The ESP32 chip is not powered by 5V, and the second regulator that makes 3.3V from 5V produces the same output no matter if the input is 5V or 4.75V. The 3.3V output seems the same no matter how you power the board (I've measured the 3.3V at the EXT1 header; it is around 3.26V no matter if I used USB, power jack, or Li-Po battery).

What is more interesting is that your code doesn't hang while operating only on a Li-Po battery either (when there is no 5V anyway). This also means it is unlikely to be related to USB/serial code or DTR/RTS.

Yet, the moment I insert 5V at the power jack it starts spewing "Request timed out". No matter how powerful supply I use. This is very bizarre.

I will test with some more web-server demos from other sources.

fredlcore commented 3 months ago

Ok, that's interesting, thanks for this kind of in-depth testing - if the regulator makes 3.3V from the 5V no matter how much above 3.3V it is, then of course that's not an issue, just wanted to rule out the only difference I could spot in the schematics.

fredlcore commented 3 months ago

Does it also happen if you plug in the barrel PSU while at the same time hooking up the LiPo? If yes, then maybe it could be some interference of some sort and be more related to the way the traces run on the board? But maybe that's too wild a guess...

DanKoloff commented 3 months ago

I think we figured it out.

It is a combination of things that is causing the issue but mainly serial-USB adapter chip CH340T remains partially powered (not properly powered nor unpowered enough) and your code can't catch that scenario and hangs (update - it appears something forces a software reset, so it is probably not a hang but forced software reset). There is something in the way serial messages are handled in your code that causes the issue. Incomplete and improper messages are sent over the serial and this causes the hang. Your code doesn't expect such messages and doesn't handle them properly. Probably you have to discard incomplete messages. Are you waiting for some serial input? If you do this would also explain why other web-server demos with serial code don't hang (more specifically "SimpleWiFiServer" and "HelloServer" from default ESP32 examples) - I believe they don't wait for anything over the serial just print.

Why incomplete or improper messages are sent - because when the board is powered by the power jack there seems to be some reverse powering from GPIO3 and USB-serial adapter CH340T has ~1V on it, which is just enough for it to cause problems. It is much lower when powered from battery so it is not a problem. It is fully powered when powered from USB or 5V/GND pins, so there is no issue.

How can I disable all serial communication in your code to test this theory?

Aside from that is there somewhere in your code GPIO3 internal pull up enabled?

Until you figure it out, my advice is for all users of your software on the ESP32-EVB to avoid using the power jack to power the board. All other power methods are alright. This should affect only the ESP32-EVB, since other Olimex-made ESP32 boards have no power jack.

fredlcore commented 3 months ago

Great news, thanks! The incoming serial message handling takes place in function GetMessage in file src/BSB/bsb.cpp. The function is called in line 4889 (if (bus->GetMessage(msg) || busmsg == true) { // message was syntactically correct), to disable it, just remove the whole block until line 4916 (} // endelse, NOT in monitor mode). In bsb.cpp's GetMessage function, there is a while (serial->available() > 0) { that would obviously be active as long as there is any kind of data coming in. If there is data available, then readByte function is called which does some on-the-fly inverting:

uint8_t BSB::readByte() {
  byte read = serial->read();
  if (bus_type != 2) {
    read = read ^ 0xFF;
  }
  return read;
}
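The inversion in readByte() is a plain bitwise NOT of the received byte. A tiny host-side sketch, with the made-up name invertByte, shows the same operation in isolation; it is only an illustration and not part of bsb.cpp.

```cpp
#include <cstdint>

// Same operation as `read ^ 0xFF` in readByte(): flip every bit of the byte.
constexpr uint8_t invertByte(uint8_t b) {
    return static_cast<uint8_t>(b ^ 0xFF);
}
```

Applying it twice gives back the original byte, so the same helper works for both directions of the conversion.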

Immediately after the byte is read, it is checked for some magic bytes to detect if it's the start of a new message. If this is not the case, GetMessage returns false and there is no further serial processing taking place until the next loop iteration. If a magic byte is detected, then it reads a maximum of 32 bytes. It waits a few milliseconds here and there to test if there is another byte coming, but if not, then the while loop evaluates to false and the function returns false. Even if the invalid message contained a stream of magic bytes, the function would exit with false after 32 bytes and the ESP32 would attend to any URL request at least for a short while. Also, if the message is not correctly transmitted, it should print debug messages on the serial port, such as Length error or CRC error.

As for the GPIOs, I only use 36 and 17 for RX/TX (can be changed in BSB_LAN_config.h), and GPIO 34 as an input pin (when connected to 3V3, the EEPROM gets erased).

In case we don't find the reason for the problem, could something like this be fixed in hardware in one of the next revisions?

fredlcore commented 3 months ago

Ah, the software reset would be very helpful, it should be a watchdog trigger - could you by any chance use something like esp32_exception_decoder to figure out what causes it? I checked the code in GetMessage, but maybe I overlooked something that would cause one of the while loops not to exit before the watchdog kicks in?

DanKoloff commented 3 months ago

I believe it is not watchdog reset. Watchdog reset sends different info over the serial. This seems like software reset:

rst:0xc (SW_CPU_RESET),boot:0x1b (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:1288
load:0x40078000,len:13872
load:0x40080400,len:4
ho 8 tail 4 room 4
load:0x40080404,len:3048
entry 0x40080590
READY
Reading EEPROM
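When collecting many such logs from users, the reset cause could be pulled out of the first boot line automatically. The following is only a host-side sketch under the assumption that the line always has the "rst:0x.. (NAME)" shape seen above; ResetInfo and parseResetLine are hypothetical names, not part of BSB-LAN or ESP-IDF.

```cpp
#include <cstdlib>
#include <string>

struct ResetInfo {
    bool valid;          // false if the line has no usable "rst:" field
    unsigned long code;  // numeric reset code, e.g. 0xc
    std::string name;    // text in the first parentheses, e.g. "SW_CPU_RESET"
};

// Extracts the reset code and name from a boot line such as
// "rst:0xc (SW_CPU_RESET),boot:0x1b (SPI_FAST_FLASH_BOOT)".
ResetInfo parseResetLine(const std::string& line) {
    ResetInfo info{false, 0, ""};
    const std::size_t start = line.find("rst:");
    if (start == std::string::npos) return info;

    // strtoul with base 16 accepts the optional "0x" prefix.
    const char* digits = line.c_str() + start + 4;
    char* end = nullptr;
    info.code = std::strtoul(digits, &end, 16);
    if (end == digits) return info;  // nothing numeric followed "rst:"

    // The name is whatever sits between the first "(" and ")" after "rst:".
    const std::size_t open = line.find('(', start);
    const std::size_t close = line.find(')', start);
    if (open != std::string::npos && close != std::string::npos && open < close) {
        info.name = line.substr(open + 1, close - open - 1);
    }
    info.valid = true;
    return info;
}
```

Fed the first line of the log above, this yields code 0xc and the name SW_CPU_RESET, which makes it easy to tell software resets apart from other reset causes across many reports.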

> In case we don't find the reason for the problem, could something like this be fixed in hardware in one of the next revisions?

No. It would require extra components and partial reroute.

fredlcore commented 3 months ago

Ok, could it be that BSB-LAN cannot connect to the config-defined WiFi? Because it then sets up its own access point "BSB-LAN" and then reboots after 30 minutes to try to reconnect to the defined network. To rule this one out, could you just add a debug message right after line 4665 (void resetBoard() {) to rule out that it comes from this function after the 30 minute timeout?

fredlcore commented 3 months ago

is there something I could/should do with GPIO3?

osro commented 3 months ago

In my experience, the BSB-LAN web interface does not respond at all when powered via the barrel jack. Although my WiFi router assigns it an IP address and indicates that it is online, the BSB-LAN web interface remains inaccessible. So there are no delays; it simply never responds.

Conversely, when using a USB power supply, the BSB-LAN interface is operational within a few seconds after booting.

fredlcore commented 3 months ago

@osro: Could you post a log file (or send it to me via e-mail) that shows that behaviour? If it gets an IP address, it needs to work somehow, i.e. it shouldn't be unresponsive, at least on the serial monitor log.

fredlcore commented 3 months ago

@DanKoloff Could you check please that the device does not hang when you remove lines 4889 until 4916 as mentioned above? If it does not hang then, then we know it's the serial routine. Otherwise the problem must lie somewhere else...

DanKoloff commented 3 months ago

> @DanKoloff Could you check please that the device does not hang when you remove lines 4889 until 4916 as mentioned above? If it does not hang then, then we know it's the serial routine. Otherwise the problem must lie somewhere else...

Sure, testing right now.

DanKoloff commented 3 months ago

> @DanKoloff Could you check please that the device does not hang when you remove lines 4889 until 4916 as mentioned above? If it does not hang then, then we know it's the serial routine. Otherwise the problem must lie somewhere else...

No change, same behavior. Doesn't work well when powered by power jack, if I attach USB starts working fine.

fredlcore commented 3 months ago

Ok, then it can't be related to the serial port. Could you remove the parts from the loop function that I mentioned above https://github.com/OLIMEX/ESP32-EVB/issues/57#issuecomment-2160055643 ?

And does "doesn't work well" mean that the "Ping!" messages still come up every minute?

DanKoloff commented 3 months ago

Testing right now with those bits commented out; it might take a few tries to compile successfully. My machine is not very fast and it takes about 15 minutes to verify and upload.

Doesn't work well means that sometimes for some small period it works, but generally starts to bug especially when you interact with the web-page. As soon as you click one or two menu items in the web-page the connection shows dropped packets and I believe the board performs software resets. After some time it might recover but as soon as you start interacting with the web-page it will die for sure again.

fredlcore commented 3 months ago

You can turn off verification in the Arduino IDE settings. That one is really a pain in the back...

DanKoloff commented 3 months ago

@fredlcore Can you just send me an edited ino file to replace in my project? I have a hard time commenting out those things and compiling successfully, especially the last excerpt that ends in }. Maybe even send a few versions of the ino to test; just keep track of what you changed in each version.

Meanwhile one of our software guys is also trying few things with your code and trying to debug it.

fredlcore commented 3 months ago

Ok, will do...

DanKoloff commented 3 months ago

So we found why this hang happens. We tried a workaround to prove it and now it works without hanging.

In BSB_LAN\src\BSB\bsb.cpp in many locations there is this construction:

    while (serial->available()) {
        ...
        some_var = readByte();
        ...
    }

This becomes an endless loop because the CH340T is not completely unpowered and sends some data. This leads to the stack crashing with:

"Stack smashing protect failure"

So we changed this construction to include timeout check.

    unsigned long timeout = millis();
    while (serial->available() > 0) {
        if (millis() - timeout > 300) {
            break;
        }
        ...
        some_var = readByte();
        ...
    }

This makes the software run without issues and no crashes.

Note that this was done just for debugging purposes (we do not know whether 300 ms is sufficient, or any other specifics). It was done just to test our theory, so you'd have to check what happens after the break and what should happen. For the different cases there should be proper timeouts and timeout handling (what happens when a timeout occurs).
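For experimenting with the timeout behaviour away from the board, the same idea can be sketched in a host-testable form. This is not the code from bsb.cpp: ByteSource, EndlessSource, fakeClock and drainWithTimeout are made-up names, the clock is passed in as a function so the loop can run on a PC, and unlike the workaround above it also caps how many bytes are stored.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal stand-in for the parts of a serial stream used here (hypothetical).
struct ByteSource {
    virtual int available() = 0;  // number of bytes waiting
    virtual int read() = 0;       // next byte, or -1 if none
    virtual ~ByteSource() = default;
};

// Test double: never runs dry, like a half-powered USB-serial chip
// that keeps producing junk bytes.
struct EndlessSource : ByteSource {
    int available() override { return 1; }
    int read() override { return 0x00; }
};

// Fake millisecond clock that advances 100 ms on every call.
uint32_t fakeClock() {
    static uint32_t t = 0;
    t += 100;
    return t;
}

// Reads until the source is empty, the buffer is full, or timeout_ms has
// passed since the call started. Returns the number of bytes stored.
std::size_t drainWithTimeout(ByteSource& src, uint8_t* buf, std::size_t cap,
                             uint32_t (*now_ms)(), uint32_t timeout_ms) {
    const uint32_t start = now_ms();
    std::size_t n = 0;
    while (src.available() > 0 && n < cap) {
        // Unsigned subtraction stays correct when the counter wraps around.
        if (now_ms() - start > timeout_ms) break;
        const int b = src.read();
        if (b < 0) break;
        buf[n++] = static_cast<uint8_t>(b);
    }
    return n;
}
```

Called with an EndlessSource, fakeClock and a 300 ms limit, the loop gives up after a handful of bytes instead of spinning forever. On the board, a small wrapper around millis() could serve as the clock; what the caller does with a timed-out, incomplete message still has to be decided per case.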

I also attach the ugly fixed version (with some other ugly debug messages that we used), hope it helps. Changed cpp to txt since cpp is not allowed by GitHub.

bsb.txt