esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

When using both Wi-Fi and Serial, Serial TX drops characters #9167

Closed Jookia closed 2 months ago

Jookia commented 3 months ago

Basic Infos

Note: There are no instructions for building and using the latest git version with arduino-cli, so I couldn't test against it.

Platform

Settings in IDE

arduino-cli compile --fqbn esp8266:esp8266:generic uart-to-wifi.ino
arduino-cli upload --fqbn esp8266:esp8266:generic --port /dev/ttyUSB0

Problem Description

I'm currently working on a Wi-Fi to UART sketch. During testing I found that Serial to Serial, Wi-Fi to Wi-Fi, and Serial to Wi-Fi all work, but Wi-Fi to Serial drops characters. This only happens under very high load.

MCVE Sketch

#include <ESP8266WiFi.h>
#include <WiFiClient.h>
#include <WiFiServer.h>

const char* ssid = "ssid";
const char* password = "password";
WiFiServer server(23);
WiFiClient client;
char buffer[4096];

void setup() {
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, password);
  if (WiFi.waitForConnectResult() != WL_CONNECTED) {
    delay(5000);
    ESP.restart();
  }
  Serial.begin(115200);
  server.begin();
}

void loop() {
  if (server.hasClient()) {
    WiFiClient newClient = server.available();
    if (client.connected()) {
      newClient.abort();
    } else {
      client.stop();
      client = newClient;
    }
  }
  if (client.available()) {
    int bytesRead = client.readBytes(buffer, min(client.available(), 4096));
    if (bytesRead > 0) {
      Serial.write(buffer, bytesRead);
    }
  }
  yield();
}

I'm using this script in Linux to test:

#!/bin/bash
socat pty,link=ttyV0,wait-slave tcp4:192.168.27.171:23&
sleep 1
stty -F ttyV0 115200 raw -echo
stty -F /dev/ttyUSB0 115200 raw -echo
cat /dev/ttyUSB0 > DMESG_OUT&
cat DMESG > ttyV0
sleep 10
kill %1
kill %2
diff DMESG DMESG_OUT | head
mv DMESG_OUT DMESG_OUT.wifi_to_serial

With this file: DMESG.txt (rename to DMESG)

The failure is intermittent, but it shows up within a few runs.

Debug Messages

Instead of this:

[    0.000000] Linux version 6.9.9-arch1-1 (linux@archlinux) (gcc (GCC) 14.1.1 20240522, GNU ld (GNU Binutils) 2.42.0) #1 SMP PREEMPT_DYNAMIC Fri, 12 Jul 2024 00:06:53 +0000
[    0.000000] Command line: options root=/dev/mapper/main rootflags=defaults,noatime,compress=lzo,subvol=boot_root lsm=landlock,lockdown,yama,apparmor,bpf
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000009cfefff] usable
[    0.000000] BIOS-e820: [mem 0x0000000009cff000-0x0000000009ffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000000a000000-0x000000000a1fffff] usable
[    0.000000] BIOS-e820: [mem 0x000000000a200000-0x000000000a210fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000000a211000-0x000000000affffff] usable
[    0.000000] BIOS-e820: [mem 0x000000000b000000-0x000000000b01ffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000000b020000-0x00000000c3283fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000c3284000-0x00000000c3284fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000c3285000-0x00000000ca2e0fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000ca2e1000-0x00000000ca644fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ca645000-0x00000000ca681fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000ca682000-0x00000000cad4dfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000cad4e000-0x00000000cb9fefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000cb9ff000-0x00000000ccffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000cd000000-0x00000000cfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fd200000-0x00000000fd2fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fd600000-0x00000000fd7fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fea00000-0x00000000fea0ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feb80000-0x00000000fec01fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec10000-0x00000000fec10fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec30000-0x00000000fec30fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed40000-0x00000000fed44fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed80000-0x00000000fed8ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fedc2000-0x00000000fedcffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fedd4000-0x00000000fedd5fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000082f37ffff] usable
[    0.000000] BIOS-e820: [mem 0x000000082f380000-0x000000082fffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] APIC: Static calls initialized
[    0.000000] e820: update [mem 0xbe079018-0xbe087057] usable ==> usable
[    0.000000] e820: update [mem 0xbe05a018-0xbe078057] usable ==> usable
[    0.000000] extended physical RAM map:

I get this:

.9-arch1-1 (linux@archlinux) (gc(GNU Binutils) 2.42.0) #1 SMP PRmand line: options root=/dev/mapime,compress=lzo,subvol=boot_rooarmor,bpf
[    0.000000] BIOS-pr000000] BIOS-e820: [mem 0x000000sable
[    0.000000] BIOS-e820: 00000fffff] reserved
[    0.0000100000-0x0000000009cfefff] usabl 0x0000000009cff000-0x0000000009BIOS-e820: [mem 0x000000000a0000   0.000000] BIOS-e820: [mem 0x0ff] ACPI NVS
[    0.000000] BIOSx000000000affffff] usable
[    00000b000000-0x000000000b01ffff] 0: [mem 0x000000000b020000-0x000000] BIOS-e820: [mem 0x00000000crved
[    0.000000] BIOS-e820: [0ca2e0fff] usable
[    0.000000]000-0x00000000ca644fff] reserved0x00000000ca645000-0x00000000ca600-0x00000000cad4dfff] ACPI NVS
x00000000cad4e000-0x00000000cb9fOS-e820: [mem 0x00000000cb9ff000 0.000000] BIOS-e820: [mem 0x000] reserved
[    0.000000] BIOS-e0000000fbffffff] reserved
[    0000fd200000-0x00000000fd2fffff] 0: [mem 0x00000000fd600000-0x00000000] BIOS-e820: [mem 0x0000000served
[    0.000000] BIOS-e820:000fec01fff] reserved
[    0.000ec10000-0x00000000fec10fff] resemem 0x00000000fec30000-0x00000000] BIOS-e820: [mem 0x00000000feded
[    0.000000] BIOS-e820: [meed44fff] reserved
[    0.000000]000-0x00000000fed8ffff] reserved0x00000000fedc2000-0x00000000fedIOS-e820: [mem 0x00000000fedd400    0.000000] BIOS-e820: [mem 0xfff] reserved
[    0.000000] BIO0x000000082f37ffff] usable
[    00082f380000-0x000000082fffffff]cute Disable) protection: activels initialized
[    0.000000] e887057] usable ==> usable
[    0.a018-0xbe078057] usable ==> usabcal RAM map:

Jookia commented 2 months ago

I did some more testing and it seems like this isn't related to Wi-Fi at all. This example, which loops printing the digits 0-9 followed by a newline, also produces corrupted output:

void setup() {
  Serial.begin(115200);
}

void loop() {
  for(int i = 0; i < 10; ++i)
    Serial.write(0x30 + i);
  Serial.write("\r\n");
  //delay(10); // adding delay 'fixes'
}

Running this in Linux will print broken lines:

stty -F /dev/ttyUSB0 raw 115200
head -100 /dev/ttyUSB0 | grep -v '0123456789'

Jookia commented 2 months ago

Adding a non-scheduling delay to wait for the FIFO to clear seems to fix this problem for the simple sketch:

void setup() {
  Serial.begin(115200);
}

void loop() {
  for(int i = 0; i < 10; ++i) {
    Serial.write(0x30 + i);
    ets_delay_us(100);
  }
  Serial.write("\r\n");
}

Unfortunately, integrating this into the Wi-Fi example doesn't work: I still get data loss. Raising the baud rate to 1.5M gives exactly the same corrupted output, even with a delay of 100us.
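
A rough check of the 100us figure, assuming the default 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte). The helper name `byteTimeUs` is made up for this illustration:

```cpp
// Microseconds needed to shift one byte out of the UART.
// Assumes 10 bits per frame (8N1) unless told otherwise.
constexpr double byteTimeUs(double baud, int bitsPerFrame = 10) {
  return bitsPerFrame * 1e6 / baud;
}
```

Under that assumption, `byteTimeUs(115200)` is about 86.8us and `byteTimeUs(1500000)` is about 6.7us. A 100us pause after each byte is therefore slightly longer than one byte time at 115200 and many byte times at 1.5M.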

mcspr commented 2 months ago

What about Serial.flush() before writing next chunk of data?
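
One possible reading of this suggestion, as a minimal sketch only. `writeChunked` and the chunk size are hypothetical and not part of the ESP8266 core. The sketch assumes only that the port has `flush()` and `write(const char*, size_t)`, as the Arduino `Serial` object does:

```cpp
#include <cstddef>

// Hypothetical helper: writes `len` bytes to `port` in pieces of at most
// `chunk` bytes, calling flush() before each piece so that earlier bytes
// have drained from the transmitter before more are queued.
template <typename Port>
std::size_t writeChunked(Port& port, const char* data, std::size_t len,
                         std::size_t chunk) {
  if (chunk == 0) return 0;  // a zero chunk size would never make progress
  std::size_t sent = 0;
  while (sent < len) {
    std::size_t n = (len - sent < chunk) ? (len - sent) : chunk;
    port.flush();                                  // wait for earlier bytes
    std::size_t written = port.write(data + sent, n);
    if (written == 0) break;                       // port refused data; stop
    sent += written;
  }
  return sent;
}
```

In the MCVE sketch this would replace `Serial.write(buffer, bytesRead);` with something like `writeChunked(Serial, buffer, bytesRead, 64);`, where 64 is an arbitrary value chosen for illustration.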

Jookia commented 2 months ago

I'll give it a look. I took out a logic analyzer and probed the output, and it looks like the ESP8266 is outputting all data correctly, but the CH340 is giving corrupted data over USB and serial. I tried an FTDI cable and had the same results.

Looking at the programmer, there's 6.7 kΩ between TX and GND and 4.7 kΩ between TX and 3.3 V. I'm guessing this is a hardware design fault. Sorry for the noise.
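
For scale, a small calculation under a loud assumption: that the two measured resistances behave as a plain pull-up and pull-down on the TX node, with nothing else driving it. The thread does not confirm this, and the function names are made up for illustration:

```cpp
// Voltage of an undriven node tied to `vcc` through `rUp` and to ground
// through `rDown` (simple resistive divider; an assumption, not a measurement).
constexpr double dividerVolts(double vcc, double rUp, double rDown) {
  return vcc * rDown / (rUp + rDown);
}

// Equivalent resistance seen at that node: rUp in parallel with rDown.
constexpr double parallelOhms(double rUp, double rDown) {
  return rUp * rDown / (rUp + rDown);
}
```

With the values above, `dividerVolts(3.3, 4700, 6700)` comes to roughly 1.94 V and `parallelOhms(4700, 6700)` to roughly 2.76 kΩ. Whether that loading explains the corruption is not established here.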