sparkfun / Arduino_Apollo3

Arduino core to support the Apollo3 microcontroller from Ambiq Micro
83 stars 37 forks source link

Pulling Serial1 RX low causes Artemis to hang with v2.1 of the core #423

Closed PaulZC closed 2 years ago

PaulZC commented 3 years ago

Hi gang,

Another interesting Apollo3 v2.1 quirk:

Pulling the Serial1 RX pin low causes the Artemis to hang with v2.1 of the core...

Steps to reproduce:

Redboard Artemis ATP Run the following sketch Use a jumper wire to pull the RX1 pin to GND With v2.1.0 and v2.1.1 of the core, the Artemis hangs permanently - even when the jumper wire is removed With v1.2.1, the code keeps on running

I haven't yet figured out why this happens, or exactly where the code hangs. I'm hoping @paulvha may have some insight on this (thank you in advance - as always).

I know that holding a serial RX pin low is 'illegal' (manifesting as a continuous start bit?) but folks need to know about this just in case the thing they have attached to the RX pin is (e.g.) powered down and pulls the pin low.

I'm kicking myself, because I actually noticed this a few weeks ago when I was upgrading OpenLog Artemis. I noticed that Serial1 RX would hang after deep sleep if nothing was connected to the RX pin. I eventually figured out that the (weak?) pull-up on the RX pin is removed after deep sleep. If the pin then floats low, the code hangs. I worked around this at the time by manually re-enabling the pull-up on the RX pin after deep sleep. I guess this might explain some of the behavior seen in #411?

Best wishes, Paul (ZC)

// RedBoard Artemis ATP - Serial1 Rx test
//
// Let the code run for a few seconds,
// then use a jumper wire to pull Serial1 RX (RX1) low
//
// With v2.1.0 & v2.1.1 of Apollo3, pulling RX1 low causes the Artemis to hang...
// With v1.2.1, the Artemis keeps on running...

void setup() {
  Serial.begin(115200);
  Serial1.begin(115200);
}

void loop() {
  Serial.print(F("Serial1.available is "));
  Serial.println(Serial1.available());
  delay(500);
}
PaulZC commented 3 years ago

Running the following sketch shows that holding the RX1 pin low for ~9.5 bit periods causes the hang. I.e. half way through the stop bit. (1 start bit, eight data bits, no parity, 1 stop bit). So I guess there must be some subtle difference in the way the UART is configured in v2.1 compared with v1.2?

// RedBoard Artemis ATP - Serial1 Rx test
//
// Use a jumper wire to link RX1 to pin 14

int delayBy = 900;

void setup() {
  pinMode(14, OUTPUT);
  digitalWrite(14, HIGH);

  Serial.begin(115200);

  Serial1.begin(9600);
}

void loop() {
  delayBy++;

  digitalWrite(14, LOW);
  delayMicroseconds(delayBy);
  digitalWrite(14, HIGH);

  delay(50);

  Serial.print(F("delayBy is "));
  Serial.print(delayBy);
  Serial.print(F("  Serial1.available is "));
  Serial.println(Serial1.available());
  Serial.flush();
}

At 9600 baud, the code hangs when delayBy reaches 983 microseconds = 9.4 bits:

image

At 4800 baud and starting with delayBy = 1900, the code hangs when delayBy reaches 1981 microseconds = 9.5 bits:

image

paulvha commented 3 years ago

Looked at this issue before, it is described + solution in this issue : https://github.com/sparkfun/Arduino_Apollo3/issues/349

regards, Paulvha

PaulZC commented 3 years ago

Thank you Paul (@paulvha ),

I think this issue might be the cause of problems we've been seeing on the Artemis Global Tracker. I upgraded the firmware to use v2.1.0 of the core recently and some customers have reported the new code hanging. We have the Iridium modem connected to Serial1. Power for the modem is switched by a separate circuit controlled by a GPIO pin. I think the modem may be pulling the RX pin low briefly as it powers up causing the code to hang. I suspect that adding a pull-up resistor in this case may not work but I will give it a try!

Thanks again, Paul

PaulZC commented 3 years ago

Some progress... Disabling the RX pin seems to avoid the issue.

This sketch keeps running on v2.1.0:

// RedBoard Artemis ATP - Serial1 Rx test
//
// Use a jumper wire to link RX1 to pin 14

void setup() {
  pinMode(14, OUTPUT);
  digitalWrite(14, HIGH);

  Serial.begin(115200);
}

void loop() {
  configureSerial1TxRx(D24, AM_HAL_PIN_24_UART1TX, D25, AM_HAL_PIN_25_UART1RX); // Select the standard ATP pins for UART1 TX and RX
  Serial1.begin(115200);

  delay(100);

  Serial.print(F("Serial1.available is "));
  Serial.println(Serial1.available());
  Serial.flush();

  am_hal_gpio_pinconfig(PinName(D25), g_AM_HAL_GPIO_DISABLE); // Disable the RX pin

  delay(100);

  digitalWrite(14, LOW);

  delay(100);

  digitalWrite(14, HIGH);

  delay(100);
}

void configureSerial1TxRx(PinName txPin, uint8_t txPinFuncSel, PinName rxPin, uint8_t rxPinFuncSel) // Configure pins for UART1 TX and RX
{
  am_hal_gpio_pincfg_t pinConfigTx = g_AM_BSP_GPIO_COM_UART_TX;
  pinConfigTx.uFuncSel = txPinFuncSel;
  pin_config(txPin, pinConfigTx);

  am_hal_gpio_pincfg_t pinConfigRx = g_AM_BSP_GPIO_COM_UART_RX;
  pinConfigRx.uFuncSel = rxPinFuncSel;
  pinConfigRx.ePullup = AM_HAL_GPIO_PIN_PULLUP_WEAK; // Put a weak pull-up on the Rx pin
  pin_config(rxPin, pinConfigRx);
}
PaulZC commented 3 years ago

I think we're in business!

This sketch:

// RedBoard Artemis ATP - Serial1 Rx test
//
// Use a jumper wire to link RX1 to pin 14

void setup() {
  pinMode(14, OUTPUT);
  digitalWrite(14, HIGH);

  Serial.begin(115200);
  Serial1.begin(100);
}

void loop() {
  configureSerial1TxRx(D24, AM_HAL_PIN_24_UART1TX, D25, AM_HAL_PIN_25_UART1RX); // Select the standard ATP pins for UART1 TX and RX

  delay(100);

  // Just for fun, bit-bang the ASCII character "0" (0x30, LSB-first = 0b00001100) on pin 14
  digitalWrite(14, LOW); delay(10); // Start bit. Bit period at 100 baud is 10 milliseconds
  digitalWrite(14, LOW); delay(40); // Four bits
  digitalWrite(14, HIGH); delay(20); // Two bits
  digitalWrite(14, LOW); delay(20); // Two bits
  digitalWrite(14, HIGH); delay(10); // Stop bit. Bit period at 100 baud is 10 milliseconds

  delay(100);

  Serial.print(F("Serial1.available is "));
  Serial.print(Serial1.available());
  Serial.print(F("   Serial1.read returned \""));
  Serial.write(Serial1.read());
  Serial.println(F("\""));
  Serial.flush();

  am_hal_gpio_pinconfig(PinName(D25), g_AM_HAL_GPIO_DISABLE); // Disable the RX pin

  delay(100);

  digitalWrite(14, LOW);

  delay(2000); // Wait >> 9.5 bit periods

  digitalWrite(14, HIGH);

  delay(100);
}

void configureSerial1TxRx(PinName txPin, uint8_t txPinFuncSel, PinName rxPin, uint8_t rxPinFuncSel) // Configure pins for UART1 TX and RX
{
  am_hal_gpio_pincfg_t pinConfigTx = g_AM_BSP_GPIO_COM_UART_TX;
  pinConfigTx.uFuncSel = txPinFuncSel;
  pin_config(txPin, pinConfigTx);

  am_hal_gpio_pincfg_t pinConfigRx = g_AM_BSP_GPIO_COM_UART_RX;
  pinConfigRx.uFuncSel = rxPinFuncSel;
  pinConfigRx.ePullup = AM_HAL_GPIO_PIN_PULLUP_WEAK; // Put a weak pull-up on the Rx pin
  pin_config(rxPin, pinConfigRx);
}

produces:

image

PaulZC commented 3 years ago

Could this be the explanation?

In am_hal_uart.c buffer_configure both the RX and RX_TMOUT interrupts are enabled:

https://github.com/sparkfun/mbed-os-ambiq-apollo3/blob/90230bb74485d774d5dba5550297d2804724fb7e/targets/TARGET_Ambiq_Micro/TARGET_Apollo3/sdk/mcu/apollo3/hal/am_hal_uart.c#L592-L593

But in serial_api.c serial_irq_set only the RX interrupt is enabled:

https://github.com/sparkfun/mbed-os-ambiq-apollo3/blob/90230bb74485d774d5dba5550297d2804724fb7e/targets/TARGET_Ambiq_Micro/TARGET_Apollo3/device/serial_api.c#L229

If pulling the RX pin low causes an RX_TMOUT, does this mean the UART will hang because the interrupt is not enabled and not being serviced?

I suspect we need to change line 229:

MBED_ASSERT(am_hal_uart_interrupt_enable(obj->serial.uart_control->handle, AM_HAL_UART_INT_RX) == AM_HAL_STATUS_SUCCESS);

to:

MBED_ASSERT(am_hal_uart_interrupt_enable(obj->serial.uart_control->handle, AM_HAL_UART_INT_RX | AM_HAL_UART_INT_RX_TMOUT) == AM_HAL_STATUS_SUCCESS);

And in line 242, change:

MBED_ASSERT(am_hal_uart_interrupt_disable(obj->serial.uart_control->handle, AM_HAL_UART_INT_RX) == AM_HAL_STATUS_SUCCESS);

to:

MBED_ASSERT(am_hal_uart_interrupt_disable(obj->serial.uart_control->handle, AM_HAL_UART_INT_RX | AM_HAL_UART_INT_RX_TMOUT) == AM_HAL_STATUS_SUCCESS);

Thoughts please...

Wenn0101 commented 3 years ago

@PaulZC I'll be back working on Monday, I'll take a look first thing!

paulvha commented 3 years ago

Same here...

PaulZC commented 3 years ago

Thank you both! No rush... Very best wishes, Paul

Wenn0101 commented 3 years ago

Alright. Here is what I found so far.

It looks like when the RX line is going low we are hanging in the UART::rxISR.

    char c;
    while(UnbufferedSerial::readable()) {
        UnbufferedSerial::read(&c, 1);
        _rxbuf.store_char(c);
    }

More specifically the UnbufferedSerial::readable() is returning true (erroneously), and we are getting stuck in UnbufferedSerial::read(&c, 1);. The read function is a blocking function that does not exit until we have read atleast 1 byte, since the line is just being held low there is no data to read and we just sit there.

Digging into the mbed driver the read is utilizing the serial_getc implementation. which is hanging in this loop:

 do {
    am_hal_uart_transfer(obj->serial.uart_control->handle, &am_hal_uart_xfer_read_single);
  } while (bytes_read == 0);

We don't have much control over this behavior since we can see above:

 * # Defined behavior:
 * ...
 * serial_getc is a blocking call (waits for the character).

The problem really seems to be that the call before this ( UnbufferedSerial::readable() which thru mbed will arrive at our driver implementation of serial_readable) is returning true. This code is:

int serial_readable(serial_t *obj) {
  MBED_ASSERT(obj->serial.uart_control != NULL);
  return !(UARTn(obj->serial.uart_control->inst)->FR_b.RXFE);
}

So RXFE (according to the datasheet "This bit holds the receive FIFO empty indicator") is changing to false. indicating that the FIFO of the hardware serial is not empty... then we are attempting to read the FIFO, finding it empty and waiting for it to not be empty...

I have confirmed this is the problem by commenting out the body of the UART::rxISR and then running this sketch.

void setup() {
  Serial.begin(115200);
  Serial1.begin(115200);
}

void loop() {
  Serial.print("Uart 0 RXFE ");
  Serial.println(UARTn(0)->FR_b.RXFE);
  Serial.print("Uart 1 RXFE ");
  Serial.println(UARTn(1)->FR_b.RXFE);
  delay(500);
}

When I jump RX to ground the RXFE register swaps to 0.

So I have found what I believe to be the root cause and now I'm working on figuring out a solution.

PaulZC commented 3 years ago

Reading between the lines here, I think the AM_HAL_UART_INT_RX_TMOUT interrupt must enabled because the ISR is being called after the RX pin has been pulled low for ~9.5 bit periods (see above). When we hold the pin low beyond the 9th bit period (1 start bit, 8 data bits) we go into an 'illegal' state, the RX line should be high (1 stop bit) but it isn't and I'm guessing an RX_TMOUT interrupt is generated.

Again, reading between the lines, I'm guessing that v1.2 was OK with the RX pin being pulled low because the ISR checked the AM_HAL_UART_INT_RX interrupt flag was set before performing the am_hal_uart_transfer. If the AM_HAL_UART_INT_RX interrupt flag was not set (but presumably AM_HAL_UART_INT_RX_TMOUT was?) then all was well as the interrupt was still being cleared and serviced, and am_hal_uart_transfer was not called:

https://github.com/sparkfun/Arduino_Apollo3/blob/87a33ce12049ff10c6d0d7e6ab9038e4d874b36e/cores/arduino/ard_sup/uart/ap3_uart.cpp#L553

The code never got as far as calling uart_fifo_read where the RXFE flag was checked:

https://github.com/sparkfun/Arduino_Apollo3/blob/87a33ce12049ff10c6d0d7e6ab9038e4d874b36e/cores/arduino/am_sdk_ap3/mcu/apollo3/hal/am_hal_uart.c#L632

Can we change serial_readable so it checks the RX_TMOUT flag and/or the RX flag before checking RXFE?

As a nasty thought experiment, what would happen if a valid serial byte was received (with a proper stop bit) followed immediately by the RX pin being pulled low for 10+ bits? Maybe, if that happens, whatever is in the FIFO should be discarded?

Thanks for digging into this...!

paulvha commented 3 years ago

I have extended your sketch to include reading the DR-status bits. Turns out an error has occurred ( parity, framing, overrun or break) when connecting a wire to GND.

  Serial.print("Uart 1 DR- FEDATA ");
  Serial.println(UARTn(1)->DR_b.FEDATA);
  Serial.print("Uart 1 DR- PEDATA ");
  Serial.println(UARTn(1)->DR_b.PEDATA);
  Serial.print("Uart 1 DR- BEDATA ");
  Serial.println(UARTn(1)->DR_b.BEDATA);
  Serial.print("Uart 1 DR- OEDATA ");
  Serial.println(UARTn(1)->DR_b.OEDATA);

I had included the serial.print() in UART: rxISR and in case of an error when we try to read the character it will "hang".

The reason is that thru MBED, Serial_getc(in Serial_API.c), am_hal_uart_transfer (in am_hal_uart.c), read_timeout(in am_hal_uart.c), read_nonblocking (in am_hal_uart.c) we end up in uart_fifo_read ((in am_hal_uart.c). Here it performs a check on the error bits and if detected it will return an error AND will NOT increase the number of bytes read. The returned error code is neglected in Serial_getc it will only check that data has been received :

  do {
    am_hal_uart_transfer(obj->serial.uart_control->handle, &am_hal_uart_xfer_read_single);
  } while (bytes_read == 0);

Now you got your "hang".

Unfortunately, the error code can NOT be passed back thru MBED, only the read character is passed back. Passing back a 0x00 or 0xFF can not be done as that might be valid data.

Potential solution Hence we should extend the while loop in UART:rxISR to something like

    while(UnbufferedSerial::readable()) {

        if (UARTn(ui32Module)->DR_b.FEDATA | UARTn(ui32Module)->DR_b.PEDATA |
            UARTn(ui32Module)->DR_b.BEDATA | UARTn(ui32Module)->DR_b.OEDATA)
        {
            // ignore the character in error
            UARTn(ui32Module)->DR);
        }
        else {
            UnbufferedSerial::read(&c, 1);
            _rxbuf.store_char(c);
        }
    }

We need to define the UART module in HardwareSerial.h, setting default as 1 as the UART1 might be mapped to different pins: uint32_t ui32Module = 1;

We need to add a line to determine that UART module 0 is used:

UART::UART( void ) :
    UART(STDIO_UART_TX, STDIO_UART_RX)
{
  ui32Module = 0
}

regards, Paul

PaulZC commented 3 years ago

Thank you Paul - I really appreciate your thorough detective work (as always), Paul

Wenn0101 commented 3 years ago

Arg. So I end up missing characters when I try to make this solution work. That is, this fixes the problem... but it seems to introduce problems in normal communications. I think this is because reading this register makes it so that it is busy and the data doesn't end up populating it... I can't figure out exactly what it is, but this new problem happens when you try to read Uart->DR_b before doing a read, even if there is no error in the error bits. This happens regardless of baud rates.

This makes me think that a solution like this can't work, but maybe I am missing something obvious. I am tempted to just have the getc function in the serial driver return 0 (or FF or 55 or something) when it hits an error like this. As you pointed out this seems like a pretty bad solution since this could be misinterpreted as valid data, but it sure beats just hanging in this function in the middle of an ISR....

do {
    uint32_t status = am_hal_uart_transfer(obj->serial.uart_control->handle, &am_hal_uart_xfer_read_single);
    if(status != AM_HAL_STATUS_SUCCESS)
    {
      return 0;
    }
  } while (bytes_read == 0);

  return (int)rx_c;

The other solution would be to just ditch the use of UnbufferedSerial::read in the ISR. This way I could implement something closer to the v1.2 core which does not sit in a while loop waiting for a character, but rather just attempts a read once. I was leaning towards this solution for a while, but the transfer would require that I have and store the handle for the UART in this class, which feels like I would be cutting thru multiple layers of abstraction in a rather gross way.

Thoughts?

Wenn0101 commented 3 years ago

I forgot to mention above, if you want to reproduce the problems I speak of. Try example 4_serial included with the core.

paulvha commented 3 years ago

I have tested and apparently, when reading part of the data register (the status bits), the character is also assumed to be read and removed from the FIFO. Instead of reading the status from the DR, we can also read the status from the STATUS register.

if (UARTn(ui32Module)->RSR_b.FESTAT | UARTn(ui32Module)->RSR_b.PESTAT |
            UARTn(ui32Module)->RSR_b.BESTAT | UARTn(ui32Module)->RSR_b.OESTAT)

I have checked this also and now the correct data is passed on with the code in UART::rxISR

    char c;
    while(UnbufferedSerial::readable()) {

        if (UARTn(ui32Module)->RSR_b.FESTAT | UARTn(ui32Module)->RSR_b.PESTAT |
            UARTn(ui32Module)->RSR_b.BESTAT | UARTn(ui32Module)->RSR_b.OESTAT)
        {
            // ignore the character in error. 
            UARTn(ui32Module)->DR;
        }
        else {
            UnbufferedSerial::read(&c, 1);
            _rxbuf.store_char(c);
        }
    }

regards, Paul

Wenn0101 commented 3 years ago

Ahh, good catch. I got too focused on it being a timing issue! I am testing this out now.

I am also considering moving this check into the serial_readable function in the mbed driver. I have not tested it yet, but I imagine that the BufferedSerial in mbed will have this exact same problem, as the ISR follows the same logic.

something along the lines of

int serial_readable(serial_t *obj) {
  MBED_ASSERT(obj->serial.uart_control != NULL);

    if (UARTn(obj->serial.uart_control->inst)->RSR_b.FESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.PESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.BESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.OESTAT)
    {
        // ignore the character in error. 
        UARTn(obj->serial.uart_control->inst)->DR;
        return false;
    }

  return !(UARTn(obj->serial.uart_control->inst)->FR_b.RXFE);
}
paulvha commented 3 years ago

Could be the right place. I would make a small change and first check there is a character in the FIFO before checking the status. something like :

int serial_readable(serial_t *obj) {
  MBED_ASSERT(obj->serial.uart_control != NULL);

  if !(UARTn(obj->serial.uart_control->inst)->FR_b.RXFE) {

    if (UARTn(obj->serial.uart_control->inst)->RSR_b.FESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.PESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.BESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.OESTAT)
    {
        // ignore the character in error. 
        UARTn(obj->serial.uart_control->inst)->DR;
        return false;
    }
  }

  return !(UARTn(obj->serial.uart_control->inst)->FR_b.RXFE);
}

Regards, Paul

paulvha commented 3 years ago

Thinking a bit more about adding in Serial_readable(), the proposed setup could still cause an issue if there are 2 characters after each other in the FIFO in error. The first one is caught, but the second is not. So maybe change to :

int serial_readable(serial_t *obj) {
  MBED_ASSERT(obj->serial.uart_control != NULL);

  // as long as character available in FIFO
  while (!(UARTn(obj->serial.uart_control->inst)->FR_b.RXFE) {

    // if in error
    if (UARTn(obj->serial.uart_control->inst)->RSR_b.FESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.PESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.BESTAT |
        UARTn(obj->serial.uart_control->inst)->RSR_b.OESTAT)
    {
        // ignore the character in error. 
        UARTn(obj->serial.uart_control->inst)->DR;
    }
    else
    {
    return true;
    }
  }

  // no valid character in FIFO available
  return false;
}

regards, Paul

Wenn0101 commented 3 years ago

Alright, this change is in the v2.2.0 pre-release.

@PaulZC Are there any other things that you were hoping I would include in the v2.2.0 release for the global tracker?

PaulZC commented 3 years ago

Hi @Wenn0101 ,

Excellent stuff - thank you! Undoing the 'fix' for #400 was important - thank you for including that. I have work-arounds for #416, #412 and #411. No worries there.

407 is more applicable to OpenLog Artemis, less so for Global Tracker.

410 is just a quirk - it's just something folks need to be aware of if they are using millis and / or micros. Using the RTC to keep track of time when using deep sleep is the way to go.

I think we're all good. I will give 2.2 a try as soon as I can.

Many thanks, Paul

Wenn0101 commented 2 years ago

Closing. This is fixed in 2.2.0

Wenn0101 commented 2 years ago

Re-opening issue. Received internal bug report of a negative interaction between Servo and Serial.

After a servo write, Serial.available is no longer reporting received characters. Confirmed the issue with this sketch.

Confirmed that this sketch worked in v2.1.1 and not in v2.2.0. Suspect it will end up being related to the changes made here, will move to its own issue if I determine it is not.

//Tested on Artemis ATP
//Arduino 1.8.19, Artemis core v2.1.1 and v2.2.0
#include <Servo.h>

Servo myServo;

int pos = 1;
int dir = 1;

void setup()
{
  Serial.begin(9600);
  Serial.println("SparkFun Servo Example");

  myServo.attach(28);
}

void loop()
{
  if(pos == 0 || pos == 180) dir = dir*-1;
  pos += dir;  

  myServo.write(pos);
  delay(5);

  //Write back serial, fails in 2.2.0
  //Serial.println(Serial.available());
  while(Serial.available()){
    Serial.write(Serial.read());
  }

  delay(5);
}
paulvha commented 2 years ago

Weird... looked into this.

The OESTAT flag is set (overflow) and stays set. It does not seem to clear when reading the DR register.

reproduce I enter 2 bytes: 0x62 0xa ( B and Newline). Sometimes it goes well for 1 or 2 times, but most of the time it fails.

In the sketch loop(), before Serial.available(), I check for the flag register values. Once OESTAT becomes 1, I read the DR and print it. Once OESTAT is 1 in every next loop it keeps indicating the FIFO is not empty, the OESTAT stays 1 and with every read it keeps showing the 0x62.

To clear it I had added writing UARTn(0)->RSR_b.OESTAT = 1; At least NOW that clears the error and it that stops repeating (SO NO MATTER WHAT WE NEED TO ADD CLEAR THE ERROR CONDITION BY WRITING A 1)

BUT that does not solve the issue, because with nearly every keyboard input it generates an OESTAT error. Something else is going on. Using the standard sketch as you provided, and in serial_readable() I removed checking on OESTAT. if you enter a small word (asdfg ) and just do the small word a couple of times, it often happens that characters are lost.

I then tried the same standard sketch on V2.1.1. Get exactly the same errors when you enter a small word it often happens that characters are lost (On V2.1.1 the same as on V2.2.0).

Something going on with PWM setting. Maybe one of the system timers is set differently. Need more time to look at that.

regards, Paul

Wenn0101 commented 2 years ago

I am also seeing the missed characters in v2.1.1. I tried v1.2.3 and I still see missed characters, and OESTAT is set.

I think I am going to try to isolate this issue in a baremetal AmiqSuiteSDK project, to see if I get similar errors in a simpler example.

paulvha commented 2 years ago

The root cause is the following section in :

the file : apollo3/2.1.1/cores/arduino/sdk/core-implement/CommonAnalog.cpp, function : ap3_err_t ap3_pwm_output(ap3_gpio_pad_t pad, uint32_t th, uint32_t fw, uint32_t clk) Around line 572 :

    // if timer is running wait for timer value to roll over (will indicate that at least one pulse has been emitted)
    AM_CRITICAL_BEGIN // critical section when reading / writing config registers
    if ((segment == AM_HAL_CTIMER_TIMERA && *((uint32_t *)CTIMERADDRn(CTIMER, timer, CTRL0)) & (CTIMER_CTRL0_TMRA0EN_Msk)) ||
        (segment == AM_HAL_CTIMER_TIMERB && *((uint32_t *)CTIMERADDRn(CTIMER, timer, CTRL0)) & (CTIMER_CTRL0_TMRB0EN_Msk)))
    {
        uint32_t current = 0;
        uint32_t last = 0;
        do
        {
            last = current;
            current = am_hal_ctimer_read(timer, segment);
        } while (current >= last);
    }
    AM_CRITICAL_END // end critical section

If you comment out the lines : AM_CRITICAL_BEGIN and AM_CRITICAL_END. I am not losing any characters anymore and also the OESTAT error is not happening anymore.

I can understand that we would like a clean-cut over to the next PWM output, but the chance of any sketch/program updating the CTIMER while we read/wait for the cutover is extremely low. I would not expect this to be a critical section as we only read and would remove the 2 lines. Next to that, as mentioned earlier, we need to clear the OESTAT condition with a write(else it keeps repeating).

regards, Paul

paulvha commented 2 years ago

I wondered yesterday WHY commenting out AM_CRITICAL_BEGIN / AM_CRITICAL_END would make that difference. I guessed there is a problem with the FIFO handling. Looking into this for sure the FIFO is not enabled due to calls in serial_api.c, functions serial_init() and serail_format().

MBED_ASSERT(am_hal_uart_configure_fifo(obj->serial.uart_control->handle, &(obj->serial.uart_control->cfg), false) == AM_HAL_STATUS_SUCCESS);

In am_hal_usrt_configure(), the last 'false" is used as 'bEnableFIFO' to enable or disable the FIFO ((bEnableFIFO) ? AM_HAL_UART_FIFO_ENABLE : AM_HAL_UART_FIFO_DISABLE));

The same 'bEnableFIFO' is used to enable/disable TX and RX interrupts but then there need to be RX/TX buffers defined, which are not set in serial_init().

Changing the 'false' to 'true' in both locations the characters are not lost anymore, BUT the reading of the FIFO is only starting after 34 characters. So the interrupt seems to be only raised when the FIFO is full and not when one character is in the FIFO. Looking into that now.

regards, Paul

paulvha commented 2 years ago

The FIFO handling in the UART is still a mystery for me. From the remarks at the top of serial_api.c :

I have been reading the different UART registers as part of the loop. Clearly, in the FR register, the TXFE is set to zero as soon as I sent a byte (or actually 3 bytes: byte + \r + \n). BUT it does not generate an IRQ. Nothing in the EIS or MIS is changing. It seems to waits for the FIFO to become to the threshold before it raises an interrupt and then bytes from the FIFO are read. I have also changed in serial_init() setting the

obj->serial.uart_control->cfg.ui32FifoLevels = AM_HAL_UART_RX_FIFO_7_8; to obj->serial.uart_control->cfg.ui32FifoLevels = AM_HAL_UART_RX_FIFO_1_4;

Or any other value. The IRQ triggers earlier, but never after 2 or 3 bytes even with AM_HAL_UART_RX_FIFO_1_2 (which is setting IFLS to zero). Once there are enough bytes in the FIFO, it will read some of them, but not all although the FR register indicates there is the FIFO is not empty!

I have not figured out this FIFO. Maybe I am not alone, looking at V1.2.3 I see in aps_uart_cpp, function _begin(), line 350 : UARTn(_instance)->LCRH_b.FEN = 0; // Disable that pesky FIFO

I can read all the data from the FIFO with the following line in the sketch :

    if (UARTn(0)->FR_b.RXFE != 1){
      printf("%x \n", UARTn(0)->DR);
    }

Still (very) puzzled, maybe you have better ideas.

regards, Paul

Wenn0101 commented 2 years ago

Hey Paul,

Thanks for your input here. I wrote this driver and I am trying to remember everything that I was thinking at the time. I remember that I had significant problems getting the IRQ to trigger when I wanted it to. I committed the only solution that I found that would raise the interrupt when MBED was expected it. I think it is very possible that there is a better solution, but I think you are hitting the same confusion I did. I had difficulty using the FIFO and getting the interrupt to fire, I remember simple "echo" serial examples that would only echo characters back after 4 or 8 characters were submitted. This resulted in me thinking like the FIFO was only useful in situations where you were relying on polling the FIFO rather than utilizing interrupts. I cant help but feel like it has to be possible to configure the UART module to behave the way that we want, but that "// Disable that pesky FIFO" comment does make it seem like others before us have tried and failed.

To summarize this new issue, it seems like there are 3 pieces to this problem:

  1. OESTAT (Overrun Error STATus indicator) is not properly cleared.
  2. ap3_pwm_output contains a critical section that takes too much time, squashing time dependent interrupts.
  3. Serial is configured to not use the FIFO, making it very time dependent.

It sounds like I can quickly fix item 1, and item 2 may be a simple fix (although I want to make sure I understand why this is here, I think it might have something to do with "inverted PWM signals" that I remember being an issue, but that I didn't work on). But item 3 is still a bit of a problem that may not have a direct solution.

-Kyle

paulvha commented 2 years ago

Glad I am not the only one who does not understand the UART- FIFO. Looking at Ambiq SDK code the ' AM_HAL_UART_RX_FIFO_1_2' seems that the IRQ happens when the FIFO is half-full. I can't find a good description for the other settings. We could add a check (as part of available()) the RXFE status, if not 1 call the routine "serial_available" to read all the characters.. but it is only a work-around (and maybe not the best)..

As it relates to the ap_pwm_output. We should not remove the check, but in my mind it does not need to be AM_CRITICAL_BEGIN as am_hal_ctimer_read() is only reading. Not updating or writting. The worsed that would happen is that it misses a servo-cycle

regards, Paul

paulvha commented 2 years ago

Unfortunately even with the FIFO enabled.. it will not make it 100% either, but much better.

Change 1. I have add in ap3_pwm_output() setting a pulse , going HIGH right after AM_CRITICAL_BEGIN and going LOW just before :

AM_CRITICAL_BEGIN // critical section when reading / writing config registers
digitalWrite(27,HIGH);

    #here the check/waiting on the CTIMER to timeout

digitalWrite(27,LOW);
AM_CRITICAL_END // end critical section

The checking & waiting takes ~15ms each time.

Change 2. I have also added setting a pulse in HardwareSerial.cpp, function rxISR() to show every time a character is read and stored in the buffer:

while(UnbufferedSerial::readable()) {
    digitalWrite(23,HIGH);
    UnbufferedSerial::read(&c, 1);
    _rxbuf.store_char(c);
    digitalWrite(23,LOW);
}

Change 3. I have also added in serial_api.c, function serial_readable() a check on OESTAT. Not only to reset it, but also included a debug pulse :

    // ignore the character in error.
    UARTn(obj->serial.uart_control->inst)->DR;

    // clear pending OESTAT
    if (UARTn(obj->serial.uart_control->inst)->RSR_b.OESTAT) {
        am_hal_gpio_state_write(22, AM_HAL_GPIO_OUTPUT_SET);
        UARTn(obj->serial.uart_control->inst)->RSR_b.OESTAT = 1;
        am_hal_gpio_state_write(22, AM_HAL_GPIO_OUTPUT_CLEAR);
    }

Change 4. I have enabled setting the FIFO by changing in both serial_init() and serial.format() the settting to true:

MBED_ASSERT(am_hal_uart_configure_fifo(obj->serial.uart_control->handle, &(obj->serial.uart_control->cfg), true) == AM_HAL_STATUS_SUCCESS);

In the meantime I have figured out that the 5 different levels to trigger an IRQ, create the following tresholds: AM_HAL_UART_RX_FIFO_1_8 needs 4 characters (32 / 8) AM_HAL_UART_RX_FIFO_1_4 needs 8 characters (32 / 4) AM_HAL_UART_RX_FIFO_1_2 needs 16 characters (32 / 2) AM_HAL_UART_RX_FIFO_3_4 needs 24 characters (32 / 4 3) AM_HAL_UART_RX_FIFO_7_8 needs 28 characters (32 / 8 7)

Situation A The FIFO trigger is set for AM_HAL_UART_RX_FIFO_1_8. In ap3_pwm_ouput, AM_CRITICAL_BEGIN & AM_CRITICAL_END as is, NOT commented out. In the serial monitor of the ArduinoIDE I enter 40 characters, press Only the first 32 are shown. The rest is lost. If I enter only 5 characters it is not read until enough characters are entered for the FIFO IRQ to trigger.

Looking at the picture I see that the reading in rxISR() starts AFTER the AM_CRITICAL_END. With the first reading OESTAT is detected, which I reset (see change 3 above). Then it reads only the rest of data from the FIFO into the buffer. Critical_enabled

Situation B

The FIFO trigger is set for AM_HAL_UART_RX_FIFO_1_8. In ap3_pwm_ouput, AM_CRITICAL_BEGIN & AM_CRITICAL_END commented out. In the serial monitor of the ArduinoIDE I enter 40 characters, press The 40 characters are shown. If I enter only 5 characters it is not read until enough characters are entered for the FIFO IRQ to trigger.

Looking at the picture I see that the reading in rxISR() starts during the time the ap3_pwm_output() is waiting (the pulse is HIGH). NO OESTAT error. situationB

I have run both situations A and B multiple times. Sometimes I have been lucky that the rxISR() was handled while NOT in the ap3_pwm_output() waiting. Then it does not matter whether or not that waiting code is run as AM_CRITICAL.

Change 5. Next to the early mentioned changes, I have then added polling in HardwareSerial.cpp, available():

   if ( ! _rxbuf.available()){
        if (! _RXcheck)  rxISR();
    }

    return _rxbuf.available();

The RXcheck is volatile bool (I defined in HardwareSerial.h) which is set/cleared in rxISR() to prevent that IRQ & polling trigger double :

if ( _RXcheck) return;

_RXcheck = true;

while(UnbufferedSerial::readable()) {
    digitalWrite(23,HIGH);
    UnbufferedSerial::read(&c, 1);
    _rxbuf.store_char(c);
    digitalWrite(23,LOW);
}

_RXcheck = false;

Situation C ( with polling)

Same as B, only change 5 added The FIFO trigger is set for AM_HAL_UART_RX_FIFO_1_8. In ap3_pwm_ouput, AM_CRITICAL_BEGIN & AM_CRITICAL_END commented out. In the serial monitor of the ArduinoIDE I enter 40 characters, press The 40 characters are shown. BUT if I now enter even a single it is detected and shown as well.!!

Looking at the picture I see that the reading in rxISR() starts (and ends) during the time the ap3_pwm_output() is waiting (the pulse is HIGH). NO OESTAT error. You can also see the single pulse on the UART line, which is the result of polling. situationC

Conclusion.

Even with enabling the FIFO and keeping the AM_CRITICAL_BEGIN & AM_CRITICAL_END in ap3_pwm_output(), we are not save if the keyboard input is more than 32 characters. If a lower number is provided we have an excellent chance of NOT getting an OESTAT.

Still we need to add polling as the FIFO will only trigger IRQ above 4 characters, as set AM_HAL_UART_RX_FIFO_1_8.

Regards, Paul

Wenn0101 commented 2 years ago

Fix is in dev and scheduled for release 2.2.1

PaulZC commented 2 years ago

Thanks @Wenn0101

Just adding a link to @paulvha 's UART::end() fix:

https://github.com/sparkfun/OpenLog_Artemis/issues/117#issuecomment-1085881142