RIOT-OS / RobotFW-tests

Includes tests for RIOT based on the Robot Framework
GNU Lesser General Public License v2.1
4 stars 13 forks source link

tests: problems with ESP32 and tests/periph_uart #54

Open gschorcht opened 4 years ago

gschorcht commented 4 years ago

When I was testing PR #12955 as DUT with a nucleo-f103rb, I encountered a problem with tests/periph_uart. While all tests are working with ESP32 and the current RIOT master, the following tests fail with PR #12955 for the ESP32.

------------------------------------------------------------------------------
Extended Short Echo Should Succeed :: Verify echo of short string ... | FAIL |
[ t111 ] does not contain match for pattern 'u222'.
------------------------------------------------------------------------------
Extended Long Echo Should Succeed :: Verify echo of long string to... | FAIL |
[ t796676701934836782036789096074901985358332 ] does not contain match for pattern 'u8:7787812:4594789314789:1:7185:12:96469443'.
------------------------------------------------------------------------------

With the logic analyzer, no difference can be observed from the ESP32 side. The following snapshots show the reaction of ESP32 to the uart_write 1 flushcommand.

Figure 1: with current master screenshot_master Figure 2: with PR #12955 screenshot_deduplication

The signals in these snapshots are

Channel Signal name direction
2 UART0_RX from host to ESP32
3 UART0_TX from ESP32 to host
4 UART1_RX/DUT_TX from PHiLIP-n to ESP32
5 UART1_TX/DUT_RX from ESP32 toPHiLIP-n
6 DUT_RST from PHiLIP-n to ESP32

That is channel 3 and 5 are the signals generated by the ESP. The signal in channel 5 is the reaction of the ESP32 on the command uart_write 1 flush. In both cases, the test string flush is sent from ESP32 to PHiLIP-n. But in second case, PHiLIP-n doesn't answer with the incremented string.

Therefore, the uart_write 1 flush command is repeated after 2 seconds without any reaction from PHiLIP-n before PHiLIP-n restarts after a further second.

The only visible difference is that the ESP32 sends the test string on UART1 one character earlier relative to the output on UART0.

The complete captures from the logic analyzer: master.logicdata.zip deduplication.logicdata.zip

The Extended Short Echo test starts about 4 seconds after the first DUT_RST pulse in channel 6.

gschorcht commented 4 years ago

I can observe exactly the same behaviour when the uart_poweron() and uart_poweroff() cycle at https://github.com/RIOT-OS/RobotFW-tests/blob/9d418ddaa7474574057ba30c5bf72930007f2fba/tests/periph_uart/main.c#L151 isn't involved and the output on UART0 only contains the reaction on the uart_write 1 flush command.

Figure 1: with current master screenshot_master_wo_sleep

Figure 2: with PR #12955 screenshot_deduplication_wo_sleep

PHiLIP-n doesn't react on the test string.

gschorcht commented 4 years ago

@MrKevinWeiss Do you have any idea what the problem might be?

MrKevinWeiss commented 4 years ago

Did you try rerunning the tests a few times?

While all tests are working with ESP32 and the current RIOT master, the following tests fail with PR #12955 for the ESP32.

Does this mean all RF tests pass in master and not with #12955?

My initial thought is that the mode is not getting set in PHiLIP, which would be a PHiLIP communication problem but if the PR is reproducibly causing the failure then that doesn't make that much sense.

We do see that the extended mode (ie read a byte and add one to it) is not correct. I was thinking about removing the test actually since I only wanted to check if echo works and if not echo works (which is accomplished with the read register mode).

...Super cool that you are using this!

gschorcht commented 4 years ago

Did you try rerunning the tests a few times?

Yes.

While all tests are working with ESP32 and the current RIOT master, the following tests fail with PR #12955 for the ESP32.

Does this mean all RF tests pass in master and not with #12955?

Exactly.

...Super cool that you are using this!

It's a very cool tool. I really like having tools like that. I knew of its existence until @yegorich pointed it out to me in https://github.com/RIOT-OS/RIOT/pull/11418 and I first tried the Nightlies tab on Murdock.

The only question for me is what the consequences of HIL tests are. What are the penalties? At the moment it doesn't matter obviously when HIL tests fail for a platform. There seem to be a lot of HIL test that fail for different platforms.

However, my personal goal is that all HIL tests pass for ESP32. That is why I am currently testing so meticulously.

MrKevinWeiss commented 4 years ago

What are the penalties?

Currently, there are none aside from fixing bugs if we notice something in the nightly or if we are using it to test a PR.

There seem to be a lot of HIL test that fail for different platforms.

My idea (which sometimes differs from other) is to more make it a list of what is supported or not. There are some tests that may just not pass or work (for example some i2c tests on the stm32 boards). As well as adding tests to expose improper behavior without the requirement to fix everything (so we can have something like a todo list). I would like the output of a test to update a database of features supported and current benchmarks

I am hoping to simplify the management of a lot of this by adding some sort of regression that can also be used for benchmarks. So we only care if tests that were passing start failing.

Currently, it still seems like the majority of failures are from infrastructure issues (though it has been reduced quite a bit) or, possibly from the PHiLIP implementation. The worst is toggling because a bit was missed (this makes regression testing hard).