Closed japaric closed 4 months ago
I didn't have a chance to say this before we ran out of time in the meeting, but I'm very interested in this topic, specifically HIL testing. So far I haven't looked into the details, and I have some other stuff to get out of the way before I can work on this, but I'm definitely interested in anything that's going on in this space.
For bobbin-cli test I implemented a simple text-based protocol similar to TAP. From the README:
[start] Running tests for frdm-k64f
[pass] 0
[pass] 1
[pass] 2
[pass] 3
[pass] 4
[done] All tests passed
bobbin test recognizes [start], [pass] and [done] tags, exiting with return code 0. It also recognizes [fail], [exception], and [panic] tags, which will cause it to exit with return codes 1, 2 or 3. All other output is ignored.
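As a rough illustration of the tag handling described above, here is a minimal host-side sketch. It is not bobbin-cli's actual code: the `Outcome` type and the `scan` function are hypothetical names, stopping at the first failing tag is an assumption, and so is treating output that ends without a `[done]` line as a failure with return code 1.

```rust
/// Result of scanning the output of one test run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Passed,
    Failed,
    Exception,
    Panicked,
    /// Assumption: the output ended without a `[done]` line.
    Incomplete,
}

impl Outcome {
    /// Codes 0 to 3 follow the description above; mapping `Incomplete`
    /// to 1 is an assumption of this sketch.
    fn exit_code(self) -> i32 {
        match self {
            Outcome::Passed => 0,
            Outcome::Failed | Outcome::Incomplete => 1,
            Outcome::Exception => 2,
            Outcome::Panicked => 3,
        }
    }
}

/// Scans lines in order and stops at the first decisive tag.
/// `[start]`, `[pass]` and any untagged output are ignored.
fn scan<'a>(lines: impl IntoIterator<Item = &'a str>) -> Outcome {
    for line in lines {
        let line = line.trim_start();
        if line.starts_with("[fail]") {
            return Outcome::Failed;
        }
        if line.starts_with("[exception]") {
            return Outcome::Exception;
        }
        if line.starts_with("[panic]") {
            return Outcome::Panicked;
        }
        if line.starts_with("[done]") {
            return Outcome::Passed;
        }
    }
    Outcome::Incomplete
}
```

A runner built on this could call `std::process::exit(scan(output.lines()).exit_code())` once the target stops sending output.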
The test runner will exit with return code 1 if there is a delay of more than 5 seconds between lines or 15 seconds to complete the entire test. In the future these timeouts will be configurable.
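The two timeouts could be enforced along these lines. This is only a sketch under assumptions: the `Watch` type and `watch` function are hypothetical, and it assumes a separate thread reads lines from the serial port (or debug probe) and forwards them over a channel.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
enum Watch {
    /// A `[done]` line arrived in time.
    Done,
    /// No line arrived within the per-line limit.
    LineTimeout,
    /// The whole run exceeded the overall limit.
    TotalTimeout,
    /// The sender went away before `[done]` (assumption: reported separately).
    Closed,
}

/// Waits for lines until `[done]` arrives or one of the two limits is hit.
fn watch(rx: &Receiver<String>, per_line: Duration, total: Duration) -> Watch {
    let deadline = Instant::now() + total;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            return Watch::TotalTimeout;
        }
        // Never wait past the overall deadline.
        let wait = per_line.min(remaining);
        match rx.recv_timeout(wait) {
            Ok(line) if line.trim_start().starts_with("[done]") => return Watch::Done,
            Ok(_) => {} // any other line resets the per-line timer
            Err(RecvTimeoutError::Timeout) => {
                return if wait < per_line {
                    Watch::TotalTimeout
                } else {
                    Watch::LineTimeout
                };
            }
            Err(RecvTimeoutError::Disconnected) => return Watch::Closed,
        }
    }
}
```

With the limits quoted above this would be called as `watch(&rx, Duration::from_secs(5), Duration::from_secs(15))`, with either timeout mapped to return code 1.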
This has worked pretty well for me. It's simple to implement, you can capture panics and exceptions by having your handlers print a [panic] or [exception] line after any diagnostics, and it's easy to add additional output that the test runner will ignore just by using println!.
I chose the prefix tags specifically because they are easy to scan for visually and to grep for in logs. One can build a basic distributed CI system simply by piping the test output into syslog which is then dumped into a log aggregator.
I am also interested in assisting with this topic. I don't have anything to offer (except the blog post linked), but I would be happy to advise and assist once there is a direction to run with.
I'm working on adding support for the Red Pitaya logic analyzer to sigrok/PulseView (I completed the UIO driver this morning; the goal is to implement a TCP server over the weekend). I'm considering building an iCE40-based test board that hooks up the logic analyzer channels for the unit test and loads the unit tests via FTDI. Does this sound generally useful? Does this already exist?
One thing I have done some research on in the past is unit testing on the target. Opinions differ on that one, because most of the time the effort outweighs the benefits, but often that's the only argument against it. So I think it would be a good idea to keep that in mind while working on testing tools. With the help of traits, macros and maybe also compiler plug-ins it should be much easier to provide an easy-to-use and generic test framework, which could be used on host and/or target without too much additional work.
At @japaric's suggestion, I'd like to bring up the use of https://github.com/labgrid-project/labgrid for automated board testing. It tries to control the board externally and can manage the external resources that some automated tests of the DUT require (e.g. pressing buttons, power control, ...). Obviously, it may require specific hardware for some types of tests, but it seems worth a look for use or inspiration.
Lately, I've been exploring running Cortex-M programs in QEMU using two different approaches.
Using QEMU user emulation (qemu-arm). This doesn't fully emulate a Cortex-M core; instead it works more like a translation layer that gives the Cortex-M program access to the host kernel. Because it doesn't emulate a Cortex-M core you can't run cortex-m-rt programs, use instructions like WFI or access registers like BASEPRI (all these operations crash QEMU). However, you can execute pure (no I/O) Cortex-M machine code and use the host stdin, stdout and stderr. For more details see qemu-arm-rt.
Using QEMU system emulation (qemu-system-arm). With this approach QEMU fully emulates a Cortex-M core. You can run cortex-m-rt programs in this mode, instructions like WFI work (or at least don't crash QEMU) and registers like BASEPRI are properly emulated. Because the emulated program doesn't have access to the host it seems harder to script this for testing purposes. However, you can hook GDB to the emulated core and work your way up from there using Python, but I haven't really explored this angle. For more details see lm3s6965evb.
It may be possible to leverage these approaches for testing Cortex-M programs w/o hardware but this will need more work and I won't have time to explore this further any time soon. So I'm sharing the info here hoping that someone else will continue to explore this area.
qemu-system-arm has good support for semihosting: with it you can open, close, read, and write files on your host system. You can even terminate a QEMU session from the inside (which is great for automating tests). I added a call for that to cortex-m-semihosting some time ago: https://docs.rs/cortex-m-semihosting/0.3.0/cortex_m_semihosting/debug/index.html
So it shouldn't be too hard to use qemu-system-arm for testing purposes. The only thing missing is support for FPU instructions, so you can only test M3 targets, but not M4.
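On the host side, driving such a run could look roughly like the sketch below. The helper names `qemu_args` and `run_in_qemu` are hypothetical, the ELF path is whatever your build produces, and it assumes the test program ends by calling the exit function from the `debug` module linked above, so that QEMU's own exit status reflects the test result.

```rust
use std::io;
use std::process::Command;

/// Builds the argument list for a semihosting-enabled Cortex-M3 run on the
/// lm3s6965evb machine. `elf` is the path to the test binary.
fn qemu_args(elf: &str) -> Vec<String> {
    [
        "-cpu", "cortex-m3",
        "-machine", "lm3s6965evb",
        "-nographic",
        "-semihosting-config", "enable=on,target=native",
        "-kernel", elf,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Runs the binary under QEMU and reports whether QEMU exited successfully.
/// Assumption: the guest terminates QEMU itself through semihosting.
fn run_in_qemu(elf: &str) -> io::Result<bool> {
    let status = Command::new("qemu-system-arm")
        .args(qemu_args(elf))
        .status()?;
    Ok(status.success())
}
```

Since semihosting output goes to the host console, the same host program could also capture stdout and scan it for pass/fail lines instead of relying on the exit status alone.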
I have also come across qemu-system-gnuarmeclipse, a tool built on top of QEMU. It also supports Cortex-M4 targets.
freertos_rs by @hashmismatch uses qemu-system-gnuarmeclipse to run unit tests on a Cortex-M4 without FPU.
Closing this old issue as part of today's issue triage. I think there's still lots of interest in different ways of doing embedded testing, but the WG isn't in a place to set down any standards.
There have been some interesting developments in testing, such as https://github.com/probe-rs/embedded-test as part of probe-rs, probe-run/knurling-rs's work on running on-target tests, and various adventures in HIL testing as part of CI, such as in Embassy, in the cortex-m crate, and in TockOS.
There are quite a few ways to do testing in embedded development. This blog post by @jamesmunns does a great job at describing the different approaches.
Right now Rust only provides testing support for std targets via the built-in #[test] attribute and it's quite inflexible at supporting any other kind of testing framework. I tried and ended up creating utest, a no_std test runner that runs unit tests sequentially but it's a hack: the test runner breaks down if any unit test panics so you can't use the standard assert! macros; instead you have to use the macros that utest provides.

All this is going to change soon with the arrival of Custom Test Frameworks (CTF) -- see eRFC rust-lang/rfcs#2318. With CTF you'll be able to annotate test functions with custom attributes and perform a proc macro style transformation of the annotated items to generate a test runner program.
For example, a CTF could provide the following user interface:
And the whole crate procedural macro would generate the following test runner:
This opens the door to HIL test frameworks and more featureful no_std testing frameworks in general.
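To make the idea more concrete, here is a hand-written sketch of what a generated runner could boil down to. It is not taken from the eRFC and does not use any CTF API: the `TestFn` type, the `TESTS` table and `run_all` are hypothetical, and the table that a proc macro would collect is written out by hand. Tests return a `Result` instead of panicking, so one failing test does not take the runner down.

```rust
// On a no_std target this would be `core::fmt::Write`.
use std::fmt::Write;

/// A test reports failure by returning `Err` rather than panicking.
type TestFn = fn() -> Result<(), &'static str>;

fn adds() -> Result<(), &'static str> {
    if 1 + 1 == 2 { Ok(()) } else { Err("1 + 1 != 2") }
}

fn wraps() -> Result<(), &'static str> {
    if 255u8.wrapping_add(1) == 0 { Ok(()) } else { Err("u8 did not wrap") }
}

/// The table a framework would generate from the annotated functions.
static TESTS: &[(&str, TestFn)] = &[
    ("adds", adds as TestFn),
    ("wraps", wraps as TestFn),
];

/// Runs every test in order, writes one line per test to `out`
/// (a UART, semihosting or a `String`) and returns the number of failures.
fn run_all(tests: &[(&str, TestFn)], out: &mut impl Write) -> usize {
    let mut failed = 0;
    for (name, test) in tests {
        match test() {
            Ok(()) => {
                let _ = writeln!(out, "[pass] {}", name);
            }
            Err(msg) => {
                failed += 1;
                let _ = writeln!(out, "[fail] {}: {}", name, msg);
            }
        }
    }
    failed
}
```

On the host, `run_all(TESTS, &mut String::new())` exercises the same code path that a target build would drive through its serial or semihosting writer.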
“Native” Host Testing
One route to testing embedded-hal drivers is to mock an implementation of the HAL (embedded-hal) for x86_64 (or whatever the architecture of the build machine happens to be) and then test the drivers against the mock implementation on the build machine.

I don't know if there's any sort of standard or best practice when it comes to mocking embedded I/O interfaces but, the other day, @idubrov showed a text-based mock implementation of digital I/O that they used to test an LCD interface.
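A text-based mock of digital I/O might look something like the sketch below. This is not @idubrov's implementation: the `OutputPin` trait here is a simplified local stand-in for the embedded-hal trait (it does not return errors), and `MockPin` and the `Lcd` driver are hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Simplified stand-in for an embedded-hal style output pin (assumption).
trait OutputPin {
    fn set_high(&mut self);
    fn set_low(&mut self);
}

/// Shared log so that transitions on different pins keep their order.
type Log = Rc<RefCell<Vec<String>>>;

/// Mock pin that records every transition as a line of text.
struct MockPin {
    name: &'static str,
    log: Log,
}

impl OutputPin for MockPin {
    fn set_high(&mut self) {
        self.log.borrow_mut().push(format!("{} high", self.name));
    }
    fn set_low(&mut self) {
        self.log.borrow_mut().push(format!("{} low", self.name));
    }
}

/// Hypothetical driver under test: generic over the pin trait, so the same
/// code runs against real pins on the target and mock pins on the host.
struct Lcd<RS, EN> {
    rs: RS,
    en: EN,
}

impl<RS: OutputPin, EN: OutputPin> Lcd<RS, EN> {
    /// Selects the command register, then pulses the enable line.
    fn command(&mut self) {
        self.rs.set_low();
        self.en.set_high();
        self.en.set_low();
    }
}
```

A host test then builds an `Lcd` from two `MockPin`s that share one log, calls `command()`, and compares the log against the expected sequence `rs low`, `en high`, `en low`.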
What other alternatives to mocking exist here?
HIL Testing
My understanding is that HIL testing involves (at least) two programs. One that contains the test suite and will run on the target device, and one that runs on some host machine that communicates with the target device and reports the results of running the test suite.
The target program can be generated using CTF from the crate under test while the host program can probably be just some Cargo subcommand / external tool. There's an interesting use case here where the target program may not fit in the target device memory if it contains all the unit tests; in this scenario it would be necessary to split the test suite in several target programs. It's unclear how to handle this with CTF; perhaps one should simply write the test suite as several files in the examples directory -- each file is a crate that will translate into an independent target program.

Are there standards / best practices that we could use here? Should we use the Test Anything Protocol (TAP) for the device - host communication?
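For reference, producing TAP on the device side is not much code. The sketch below formats a list of results as a TAP plan line followed by one `ok` / `not ok` line per test; the `to_tap` function and the test names in the usage note are hypothetical, and a real target would write to its UART instead of building a `String`.

```rust
/// Formats test results as TAP: a plan line `1..N`, then one line per test.
fn to_tap(results: &[(&str, bool)]) -> String {
    let mut out = format!("1..{}\n", results.len());
    for (i, (name, passed)) in results.iter().enumerate() {
        let status = if *passed { "ok" } else { "not ok" };
        // TAP test numbers start at 1.
        out.push_str(&format!("{} {} - {}\n", status, i + 1, name));
    }
    out
}
```

For example, `to_tap(&[("gpio_toggle", true), ("uart_echo", false)])` yields the plan `1..2` followed by `ok 1 - gpio_toggle` and `not ok 2 - uart_echo`, which existing TAP consumers on the host can parse.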
cc @nagisa @thejpster @jcsoo you probably all have thoughts on this topic