esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1
15.96k stars 13.34k forks source link

Serial.flush() not flushing correctly at 160Mhz? #2536

Closed themindfactory closed 5 years ago

themindfactory commented 7 years ago

Hardware

Hardware: ESP-12E Core Version: 2.3.0

Description

When I run a sketch at 80Mhz all is OK 160Mhz brings in new issues.

I send out about 3000 bytes of data to serial with:

ptr = s;
while (*ptr) {
   Serial.write(*ptr++);
   // just in case its very long
   yield();
}

then I do with no delay

Serial.flush();
//delay(10);
Serial.swap();

I end up with an odd character on the serial port 0xEA when I expected a 0x0A (I guess we call it the default serial port on pins 1 and 3)

if I add a delay(10) after the flush all is OK.... its like the flush does not get to complete.

RichardS

Settings in IDE

Module: Generic ESP8266 Module Flash Size: 4MB/1MB CPU Frequency: 160Mhz Flash Mode: dio Flash Frequency: 40Mhz Upload Using: SERIAL Reset Method: nodemcu

WereCatf commented 7 years ago

This is something I've seen a whole bunch of people complaining about lately and I tried to look through the sources, but I can't find the problem there. I don't know if I'm just missing it, or if it's an actual H/W-side problem? Just out of curiosity, have you tried with a smaller value to delay(), and if so, what's the smallest delay you can use that still ends in the correct results? Does delay(1) work, for example, or does it then spit wrong data out?

stickbreaker commented 7 years ago

The flush() call should wait until the uart issues the TXComplete interrupt. On the AVR ATMega328p this is the TXCn bit of the UCSRnA register. I looked at the code in the HardwareSerial library, but I was not able to understand/decode how to find the equivalent status bit for the ESP. Is there any documentation available for the ESP UART? Chuck.

themindfactory commented 7 years ago

Very interesting....

So I tried: delay(1) and it worked delayMicroseconds(10) and it worked delayMicroseconds(9) and it worked delayMicroseconds(8) and it failed

That's almost exactly one bit time at 115200! == 8.7uS

RichardS

On Sun, Sep 18, 2016 at 11:54 PM, chuck todd notifications@github.com wrote:

The flush() call should wait until the uart issues the TXComplete interrupt. On the AVR ATMega328p this is the TXCn bit of the UCSRnA register. I looked at the code in the HardwareSerial library, but I was not able to understand/decode how to find the equivalent status bit for the ESP. Is there any documentation available for the ESP UART? Chuck.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/esp8266/Arduino/issues/2536#issuecomment-247907450, or mute the thread https://github.com/notifications/unsubscribe-auth/AINdaXib7uNCLMyv3OR-MC3eoD-01OG9ks5qrgeNgaJpZM4J_8ll .

WereCatf commented 7 years ago

@stickbreaker https://github.com/esp8266/Arduino/blob/master/cores/esp8266/uart.c -- uart_wait_tx_empty does wait until the corresponding register changes status, but apparently when you run the ESP8266 at 160MHz the register changes status too early, leaving one more bit to be transferred.

@themindfactory That's exactly what I wanted to know. A workaround could be construed then that calculates how long the extra delay at the end of flush() should be when you set the baudrate of the serial-bus if the CPU is running at 160MHz and when a call to uart_wait_tx_empty() happens it'll wait that much so as to ensure the buffer is actually empty.

WereCatf commented 7 years ago

This may actually be a rounding-error. On line 238 in uart.c there's this line: USD(uart->uart_nr) = (ESP8266_CLOCK / uart->baud_rate);

Alas, when you run at 160MHz and 115200 baudrate it returns the value 1388, instead of 1389 -- the actual value would be ~1388,888... , but it's rounded down instead of up.

themindfactory commented 7 years ago

Further experimenting at 9600 baud reveals its actually 8 bit times, probably due to the uS timing at the 115200 baud rate the other time functions take to execute can come into play... but 9600 work at 850uS delay and failed at 800uS, one bit time is 104uS so 8 bits is 832uS....

themindfactory commented 7 years ago

The real problem is that is 1 bit time.... we need a x8 in there and probably a x9 for safety :-)

EDIT sorry your 1388 is one bit time.

stickbreaker commented 7 years ago

So, the interrupt you are keying off is equivilent to UDREn - USART Data Register Empty. And you are having to simulate the TXCn - Transmit Complete. That makes sense, just code the delay with at least 9 maybe 10 bit times after UDREn, incase someone uses 8O2 or 9bit. Does this hardware support 9bit MPCMn Multi-Processor Communications Mode? If so, it could be up to 13bits per byte. Start, 9 data, parity, 2 stop. Chuck.

fishersolutions commented 7 years ago

I stuck this problem with Modbus RTU running on RS485 at 9600,n,1. The transceiver get switched off after calling flush(); and the last bit is never sent. This is on a NodeMCU V1.0 running at 80MHz. Adding in a delay fixes the problem.

Is there are more elegant fix some where?

themindfactory commented 7 years ago

Never found one in this case I add delay(1) not elegant but it always works which is the exactly what you need. I found it had one chapter to send not one bit, so that's 10 bits

On Feb 12, 2017 8:20 PM, "David Fisher" notifications@github.com wrote:

I stuck this problem with Modbus RTU running on RS485 at 9600,n,1. The transceiver get switched off after calling flush(); and the last bit is never sent. This is on a NodeMCU V1.0 running at 80MHz. Adding in a delay fixes the problem.

Is there are more elegant fix some where?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/esp8266/Arduino/issues/2536#issuecomment-279273264, or mute the thread https://github.com/notifications/unsubscribe-auth/AINdaYehNQqf3FWnmz_0IGvNSRFnq2zGks5rb6_ugaJpZM4J_8ll .

sopan-sarkar commented 7 years ago

I am having a similar type of problem. I am reading 14bytes of data from a RFID tag. When I use software serial to read the bytes at 80 MHz everything works well, but when I read it at 160 MHz it starts to read values that doesn't corresponds to the tag value. In other words, it reads unwanted values. Moreover, the number of bytes read is only 6, whereas it was supposed to be 14. I tried to add delays but nothing works.

RudyFiero commented 6 years ago

From my past experience with other microcontrollers is that tx_buffer_empty means just that. It does not mean the byte is out of the chip. It has been transferred to the shift register and still needs to be shifted out.

void uart_wait_tx_empty(uart_t* uart)
{
    if(uart == NULL || !uart->tx_enabled) {
        return;
    }
    while(((USS(uart->uart_nr) >> USTXC) & 0xff) > 0) {
        delay(0);
    }
}

define USTXC 16 //TX FIFO COUNT (8bit)

To me this flag is only indicating the state of the buffer. Not the completion of transmission.

devyte commented 6 years ago

My experience is similar to @RudyFiero . @themindfactory @WereCatf from your comments, it seems that a definitive solution was never found. Should a delay(1) be added to the end of flush()?

grafalex82 commented 6 years ago

I stuck this problem with Modbus RTU running on RS485 at 9600,n,1. The transceiver get switched off after calling flush(); and the last bit is never sent. This is on a NodeMCU V1.0 running at 80MHz. Adding in a delay fixes the problem.

Is there are more elegant fix some where? I hit exactly same thing with Modbus RTU.

Speaking about delay() which works around the issue: I doubt delay(1) is a correct delay, as delay must correspond selected bitrate, parity bit, stop bits, etc. But again, this is only work around.

I think the correct way would be to wait until byte is really transferred by polling a flag or waiting for an interrupt. Unfortunately I am not familiar with ESP internals if there is such flag

WereCatf commented 6 years ago

@devyte Adding a delay to flush() seems like the best proposed workaround for now, but then the question becomes, should the delay only take effect when running at 160MHz and should we just use a hard-coded 1ms delay or base the delay on the transmission-speed? It seems that a delay that's enough to transfer 4 bits should be enough, if one were to base it on the transmission-speed.

WereCatf commented 6 years ago

@grafalex82 As I mentioned earlier in the thread, there is an internal register that's supposed to do exactly that, but for some reason (hardware - bug?) it toggles too early when running at 160MHz.

devyte commented 6 years ago

@WereCatf I'd love to have a finer grained delay, but delay(1) is the smallest that yields. Going smaller means delayMicroseconds, which doesn't yield. If we're going with a delay, then I think it's the best workaround. What about that rounding error mentioned?

WereCatf commented 6 years ago

@devyte I don't remember anymore, the discussion happened like a year ago and I just can't recall the details now. I could take another stab at it with fresh eyes, but the final solution would probably still end up being a delay in flush(), so I don't know if it's worth the effort.

vdeconinck commented 6 years ago

@devbyte Re: going smaller than delay(1): Can't we call both delayMicroseconds() and yield()? Just a thought...

igrr commented 6 years ago

Why would you need to yield in flush()? Unlike read, write, and available, it's hard to imagine flush() being called in a tight loop for a significant amount to time. I think delayMicroseconds should be perfectly fine here.

WereCatf commented 6 years ago

I noticed that neither swap() in HardwareSerial.cpp or uart_swap() in uart.c called by swap() don't wait for transmission to end before swapping the pins used; should we add a call to flush() there? Personally, I think that would be prudent.

WereCatf commented 6 years ago

After a very brief test, adding delayMicroseconds((int)(1000000.0 / uart_get_baudrate(_uart) * 11 + 1)); in flush() seems to fix the issue and works across a multitude of different baudrates.

devyte commented 6 years ago

@WereCatf interesting. Also, I came across #2502 . I have to wonder about the difference of the proposed solution?

WereCatf commented 6 years ago

What difference? He's waiting for 12 bits to be transferred, I am waiting for 11, and we calculate it a bit differently, but otherwise it's the same deal.

WereCatf commented 6 years ago

Oh, one thing: I checked and this bug happens when running at 80MHz, too. I don't know what OP did differently, but I am getting the exact same results at 80MHz and 160MHz.

d-a-v commented 5 years ago

@themindfactory @WereCatf This issue is stilled opened, now assigned to me, and the relevant PR is merged and seems valid, waiting for 11 bits @ serial speed +1uS (I would have used 12 (1start+8max+parity+2stop) Should we close ?

WereCatf commented 5 years ago

I have no objections, the fix I posted has worked fine for me and I haven't seen anyone complaining about it, either.

On 12. lokakuuta 2018 14.56.58 david gauchard notifications@github.com wrote:

@themindfactory @WereCatf This issue is stilled opened, now assigned to me, and the relevant PR is merged and seems valid, waiting for 11 bits @ serial speed +1uS (I would have used 12 (1start+8max+parity+2stop) Should we close ?— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

d-a-v commented 5 years ago

Thanks, closing as fixed by #3714