lowRISC / opentitan

OpenTitan: Open source silicon root of trust
https://www.opentitan.org
Apache License 2.0
2.59k stars 782 forks source link

[chip-test] chip_sw_uart_system_loopback #19867

Closed crteja closed 9 months ago

crteja commented 1 year ago

Test point name

chip_sw_uart_system_loopback

Host side component

Rust

OpenTitanTool infrastructure implemented

Yes

Silicon Validation (SiVal)

Yes

Emulation Targets

Contact person

@rswarbrick

Checklist

Please fill out this checklist as items are completed. Link to PRs and issues as appropriate.

estimate 2

jwnrt commented 10 months ago

Here's a minimal-ish repro of a bug with the system loopback that we found during testing:

uart_loopback.c (expand) ```c // Copyright lowRISC contributors. // Licensed under the Apache License, Version 2.0, see LICENSE for details. // SPDX-License-Identifier: Apache-2.0 #include "sw/device/lib/arch/device.h" #include "sw/device/lib/base/mmio.h" #include "sw/device/lib/dif/dif_uart.h" #include "sw/device/lib/runtime/hart.h" #include "sw/device/lib/testing/test_framework/check.h" #include "sw/device/lib/testing/test_framework/ottf_main.h" #include "hw/top_earlgrey/sw/autogen/top_earlgrey.h" #include "uart_regs.h" OTTF_DEFINE_TEST_CONFIG(); bool test_main(void) { dif_uart_t uart; CHECK_DIF_OK(dif_uart_init( mmio_region_from_addr(TOP_EARLGREY_UART1_BASE_ADDR), &uart)); CHECK_DIF_OK(dif_uart_configure( &uart, (dif_uart_config_t){ .baudrate = (uint32_t)kUartBaudrate, .clk_freq_hz = (uint32_t)kClockFreqPeripheralHz, .parity_enable = kDifToggleDisabled, .parity = kDifUartParityEven, .tx_enable = kDifToggleEnabled, .rx_enable = kDifToggleEnabled, })); // Check the `RXLVL` before the loopback is enabled. size_t available = 0; CHECK_DIF_OK(dif_uart_rx_bytes_available(&uart, &available)); LOG_INFO("%d bytes available before setting loopback", available); // Set the `SLPBK` (System loopback) bit to enabled. uint32_t reg = mmio_region_read32(uart.base_addr, UART_CTRL_REG_OFFSET); reg = bitfield_bit32_write(reg, UART_CTRL_SLPBK_BIT, true); mmio_region_write32(uart.base_addr, UART_CTRL_REG_OFFSET, reg); // Alternatively: // CHECK_DIF_OK( // dif_uart_loopback_set(&uart, kDifUartLoopbackSystem, kDifToggleEnabled)); // Check the `RXLVL` immediately after the loopback is enabled. CHECK_DIF_OK(dif_uart_rx_bytes_available(&uart, &available)); LOG_INFO("%d bytes available immediately after setting loopback", available); // Delay by some amount and check the `RXLVL` again. busy_spin_micros(10); CHECK_DIF_OK(dif_uart_rx_bytes_available(&uart, &available)); LOG_INFO("%d bytes available some delay after setting loopback", available); uint8_t byte; CHECK_DIF_OK(dif_uart_byte_receive_polled(&uart, &byte)); LOG_INFO("byte in fifo: %02x", byte); return true; } ```

Running this in the fpga_cw310_sival_rom_ext execution environment reliably triggers the bug. It does not trigger as often (if at all) with the test_rom or with UART0.

Here's the output:

I00001 ottf_main.c:154] Running sw/device/tests/uart_smoketest.c
I00002 uart_smoketest.c:34] 0 bytes available before setting loopback
I00003 uart_smoketest.c:40] 0 bytes available immediately after setting loopback
I00004 uart_smoketest.c:44] 1 bytes available some delay after setting loopback
I00005 uart_smoketest.c:48] byte in fifo: f0
I00006 ottf_main.c:109] Finished sw/device/tests/uart_smoketest.c
I00007 status.c:28] PASS!

The code is initializing UART1 and printing the RXLVL register field which starts at 0. The system loopback is then enabled, and the RXLVL is printed again as still 0. After some delay, there is suddenly a byte in the FIFO. In this run it was f0 but we've seen 80 and ff too.

The bug can also be triggered by modifying the normal smoketest to use UART1 and running it in the fpga_cw310_sival_rom_ext environment. That test is meant to loop through:

'S', 'm', 'o', 'k', 'e', ' ', 't', 'e', 's', 't', '!'

but instead we see:

'M', 0xff, 'm', 'o', 'k', 'e', ' ', 't', 'e', 's', 't', '!'

It's as if the S is shifted across by a few bits to make an M and then we get some corrupted 0xff byte afterwards.

rswarbrick commented 10 months ago

I had to look stuff up in an ascii table (and other people might need to do so too!), and here are the translations:

  S = 0101 0011
  M = 0100 1101

So the change might be a shift by two places (if we're going from MSB, start 'S' two bits late and you get 'M')

Obviously, this doesn't explain the 0xff byte that follows.

rswarbrick commented 10 months ago

I wondered whether we could reproduce this with the block-level DV. My idea was that it might be that the extra junk that you are seeing gets pushed off the end of a FIFO before any checks happen. But I was wrong!

I bodged the existing uart_loopback test so that it ran a system loopback, then ran it and looked at the waves. Everything looks sensible, I think. Which is a bit of a pity! I was hoping to find a root cause!

@jwnrt: Would it make sense to try to run the minimal test as a system test in simulation? I'm hoping we might get some sensible wave files that we could delve through to figure out what has happened.

jwnrt commented 9 months ago

After speaking with @alees24 I have a better idea of what the issue is.

It's probably bits coming in from the RX line out to pinmux, so when the system loopback is enabled that could be a start bit (0) and generate a phantom byte. The solution is probably just to document that RX should be disabled when enabling the loopback. We may also be able to modify the pinmux_init function that the ROM_EXT (and ROM?) runs which currently only configures UART0 so that it muxes the RX lines of UART1, UART2, and UART3 to the constant 1 pad. That would mean even if bits enter from the RX line, they won't be start bits.

Bits entering from the line out to pinmux also explains why we see this with UART1, UART2, and UART3 but not UART0 (which is pinmuxed to the external serial by ROM_EXT) and not in SIM_DV (which pinmuxes all four UARTs in most tests).

rswarbrick commented 9 months ago

Oh! That makes sense. Thanks, @alees24, for getting to the bottom of things.

alees24 commented 9 months ago

When enabling 'system loopback' (RX FIFO input driven from TX FIFO output) within the UART, the RX path needs to be in a known state; either set it to constant 1 using pinmux or disable the RX side, and - pedantically - wait for > 10 bit intervals at the current receive baud rate, then flush out the read path eg. empty FIFO, clear error conditions etc.

Similar concerns affect the enabling of 'line loopback' (echoing of line RX on the TX output line) since the TX side could conceivably be in the process of transmitting, although the additional complication in this case is that there could also be data arriving from the RX line at the point of switching, and that's much harder to accommodate.

(Be aware that in chip sim only UART0's pins actually have a DPI model attached within the test bench, although the uart_agents ought to be operable, providing that there is no conflict on the relevant Multiplexed I/O pins.)