Closed japaric closed 5 years ago
I think this relates to #47.
@jcsoo and I discussed this in the last meeting, I think it comes down to a balance of what is the cost to develop and maintain, and what value it could bring.
Using the terms from my blog post, I would make the following general assessment:
Note: For all of this discussion, I am only considering the Rust language, and not necessarily considering the crates and frameworks (embedded-hal, rtfm) unless otherwise specified. I think discussing those topics is fit for a post of their own in the future.
Can a set of basic Embedded Rust applications still compile on every stable/beta/nightly version of the Rust Compiler?
I think the effort of development here is decently low.
Hopefully very low. I would imagine as certain regressions are identified, test cases could be added as necessary.
This has value to help illustrate Rust's commitment to the embedded sphere of development. This would help catch problems that would be experienced immediately by developers when upgrading versions of rustc, and would catch some problems we have experienced in the past with regards to regressions and changing unstable features.
We should almost definitely do this, in my opinion.
I don't think this has immediate value. Most *-host testing is verifying that application or driver code is sane, independent of the target. This might be different when discussing the embedded-hal and driver crate ecosystem.
I imagine this would be used to continuously build sample binaries for selected embedded targets, and verify that they still function correctly when flashed.
It would be necessary to do the following:
The items listed above are decently sized, but there are many examples of how to do this out there. I know projects like https://github.com/RIOT-OS/ already maintain these kinds of systems.
A big "cost" here is that this approach requires physical hardware (both the embedded systems, and some kind of test rack to drive and monitor the tests). This is an addition to the Rust Language's testing infrastructure, which is entirely cloud-based at the moment as far as I know.
I expect the maintenance effort here could be prohibitive. A person will need to be physically located where the hardware is to replace any broken components (boards, power supplies, test rack failures) and to troubleshoot any intermittent or persistent failures.
The more advanced the testing is (e.g. with logic analyzers, mocked hardware, etc), the more likely the chance of hardware failure becomes, and the harder this testing is to maintain.
This infrastructure could be decentralized and provided by donors/volunteers willing to host hardware that is relevant to them, however this would increase the development effort to support a decentralized testing system, and would rely on timely responses by specific volunteers for consistent results.
The value here is also very high. We could say with strong confidence that Rust will not break for your platform if it is one of the tested ones. This could tie in to an LTS style guarantee. I have already heard talks about third party companies willing to sell "maintenance guarantees" to companies that use Rust and need a "guaranteed LTS" experience.
This infrastructure would be super neat outside of the Rust ecosystem too; I'd love to have shared / distributed hardware resources for driver and hardware testing (hardware CI is something I've been dreaming of for a while now).
One concept we've been discussing recently is dev kits or peripherals paired with an RPi that manages tests and provides standard peripherals to test against / use in testing. Then there'd be a runner on the device which advertises capabilities to and queries for jobs from a queue service, spins up containers with appropriate peripherals mapped to the container space and runs the jobs, then fires the outputs back to the queue service. Jobs would be a travis-ci-like config paired with pre-compiled assets (to be language independent / reduce needless load on hwci components), and we could have a repo for runner and hardware configurations to make it easy to repeat and understand physical setups. GitHub/GitLab/whatever integrations would then interact with the queue service for scheduling etc. It's another thing I'd be interested in working on, but, there'd need to be some kind of team / commitment to supporting it (and tbqh I wouldn't choose to do it in rust).
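As a rough sketch of the runner/queue design described above (every name here is invented for illustration; this is not part of any existing tool), a runner would advertise its capabilities and only accept jobs whose requirements it can satisfy:

```rust
// Hypothetical job/capability model for the runner + queue design.
// All names here are invented for illustration.

#[derive(Debug, Clone, PartialEq)]
struct Capabilities {
    target: String,           // e.g. "thumbv7em-none-eabihf"
    peripherals: Vec<String>, // peripherals mapped into the container, e.g. "spi"
}

#[derive(Debug)]
struct Job {
    artifact_url: String, // pre-compiled binary, keeps hwci language-independent
    required: Capabilities, // what the runner must advertise to take this job
}

// A runner polls the queue and only accepts jobs whose requirements it meets.
fn can_run(advertised: &Capabilities, job: &Job) -> bool {
    advertised.target == job.required.target
        && job
            .required
            .peripherals
            .iter()
            .all(|p| advertised.peripherals.contains(p))
}

fn main() {
    let runner = Capabilities {
        target: "thumbv7em-none-eabihf".into(),
        peripherals: vec!["spi".into(), "i2c".into()],
    };
    let job = Job {
        artifact_url: "https://example.invalid/firmware.bin".into(),
        required: Capabilities {
            target: "thumbv7em-none-eabihf".into(),
            peripherals: vec!["spi".into()],
        },
    };
    assert!(can_run(&runner, &job));
    println!("runner accepts job");
}
```

The point of keeping jobs as pre-compiled assets plus a matching rule like this is that the queue service never needs to know anything about the language or build system that produced the firmware.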
For now, gitlab-ci is probably worth a look, the runner does everything above already and it can be configured to do something like this. It's probably not exactly what we're looking for to work outside just one project, but could be enough for now to mirror things to a gitlab organisation and run tests with tagged workers via that.
Hey @ryankurte, I'm actually familiar with GitLab CI, as I use it at my current work.
I know groups like RIOT-OS have tools that do this (See https://github.com/RIOT-OS/murdock-scripts), and I have personally developed non-distributed systems in the past that do what you describe: Integrate some kind of host with some kind of embedded client (either a dev board, unit under test, etc), usually with some additional hardware (sometimes another dev board, or logic analyzer, FT2232H, etc) to read state. Unfortunately none of the tools I have developed in the past have been open source, so I don't have much to share other than my experiences.
It's a solvable problem, but for Rust specifically, the biggest issue is "who stores and maintains the hardware?", especially if the CI tests are important enough to prevent releases when regressions occur. The second biggest issue would be "how do we integrate any testing we develop into the main RustLang CI process?"
Please do stay in touch! Just because Hardware in the Loop testing isn't something we necessarily want to tackle today doesn't mean we won't in the future! If we close this issue with just CI (no hardware) testing, I'll make sure we open up a follow-on issue to keep this on the record.
Also paging @jcsoo since he has expertise in this area.
What about CI systems developed specifically for Linux distributions? Fedora, NixOS, and Guix have all built their own CIs specifically for testing the distribution on multiple hardware platforms. Maybe they are too distribution / Linux specific to be adapted to generic / embedded targets, I don't know.
If you just want to plug in dev boards and load stuff onto them as-is, you need to have someone willing to set up and host all of the host systems, CI infrastructure, and physical hardware, not to mention setting up an isolated network and VPN for others to access it. This is all running in someone's office or lab, so it's not a case of running scripts in AWS. Someone will also have to debug these systems as well as the hardware if things go wrong, so they will need a fairly broad base of experience.
Once you get past basic testing of the handful of peripherals on these dev boards (some of which may have no on-board peripherals at all), you need to build actual systems on top of these boards, and these systems need to be reliable and reproducible which means that you don't want breadboarded prototypes. You also need external systems to generate inputs and measure outputs, especially if you are working on a network stack (USB, CAN, Ethernet, Serial, Bluetooth).
So, running an embedded CI lab is enough work and expense that I don't think anyone will do it on a pure volunteer basis. Certainly individuals or groups that have strong incentives to get things tested will put in the effort, but I'm not sure that this WG or even the Rust language organization is quite that motivated.
We might be better off with a more distributed approach. Dev boards are generally not that expensive, and setting up a Raspberry PI is not too difficult for individuals. I think it would be very useful to have a set of common dev boards from a variety of vendors that we can use as "reference" boards, as well as a set of peripherals of various types that can easily be connected to these boards for testing.
It's not important that every developer has every board, and this isn't meant to restrict the environments where Rust embedded development will happen; the intent is to make it a bit more likely that developers overlap in what they build and test on. Bug reports that can be reproduced by at least one other person are a lot more likely to be useful.
So for the just-CI option as mentioned in your post @jamesmunns, are we really just talking about the architecture level, like msp430 or cortex-m4?
Because in that case, could we create a generic project with some basic internal tests (can we add stuff, do we have working atomics, etc., whatever else is important) and compile and run them for each architecture in QEMU without a whole lot of work?
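For illustration, the kind of "basic internal tests" meant here could be plain assertion functions, written so that the same logic compiles both as a normal host binary and (from a no_std entry point) for each QEMU-emulated target. This is just a sketch; the crate layout and names are hypothetical:

```rust
// Sketch of target-independent smoke tests. On the host these run as a normal
// binary; for an embedded target the same functions would be called from a
// no_std entry point and the result reported via semihosting.
use std::sync::atomic::{AtomicUsize, Ordering};

// "can we add stuff"
fn arithmetic_works() -> bool {
    let xs = [1u32, 2, 3, 4];
    xs.iter().sum::<u32>() == 10
}

// "do we have working atomics" -- note that e.g. thumbv6m-none-eabi has no
// atomic read-modify-write instructions, so fetch_add would not compile there,
// which is exactly the kind of per-target difference such tests would surface.
fn atomics_work() -> bool {
    let n = AtomicUsize::new(0);
    n.fetch_add(3, Ordering::SeqCst);
    n.load(Ordering::SeqCst) == 3
}

fn main() {
    assert!(arithmetic_works());
    assert!(atomics_work());
    println!("all smoke tests passed");
}
```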
Several years ago we embarked on a project at Mozilla to use PandaBoards (a TI ARM development board) for Android and Firefox OS testing. The end result of that was mozpool.
We had at one point several hundred PandaBoards installed in custom-built rackmount chassis in a datacenter and were running tests on them in Firefox CI. There's a bunch of extra complexity in mozpool because we didn't want to have a 1:1 host machine to PandaBoard ratio, so we figured out how to get them to PXE boot into a minimal Linux environment from which they could flash an Android or Firefox OS image. For most microcontrollers I suspect it'd be simpler to just have the device connected via USB to the host machine.
I don't know that anything in mozpool is directly usable for this effort, but the code there was used successfully in production CI, so if nothing else there may be some useful design lessons to be learned.
That effort was abandoned partially because we abandoned Firefox OS and partially because testing Android against a dev board doesn't provide much value in reality (because actual Android users are using vastly different hardware). We now mostly run Android tests in emulators, but also we run a subset of tests on real phone hardware using a project called autophone. I'm pretty sure the mozpool phones mostly live at a remote employee's house, and there's definitely some care and feeding required to keep things running smoothly.
I should note that mozpool was overdesigned and overcomplex for what we ended up needing for Android CI. It also proved to be pretty cumbersome due to the failure rate of the boards in question, and the implemented solutions frequently required humans to touch and reflash boards by hand (even though not all flashing needed to be done by humans).
For some further reading:
Triage:
The extending Rust reach participants mentored by @jamesmunns are working on this. Among their goals they have:
In their next meeting they'll discuss which of these goals are must-haves for the edition, which ones are stretch goals, and a potential timeline.
Adding binary size regression tests. The binary size of a program, compiled with opt-level=s, is tracked over time (per PR) and the CI fails if the binary size of the program regresses.
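As a sketch of the mechanism (the threshold and names here are made up), the gate itself is just a comparison of the new build's size against the last recorded one:

```rust
// Sketch of a binary-size regression gate. CI would record the size of the
// last successful opt-level=s build and fail the job when the new build grows
// beyond some tolerance. The tolerance value is arbitrary here.
fn size_regressed(baseline: u64, current: u64, tolerance: u64) -> bool {
    current > baseline.saturating_add(tolerance)
}

fn main() {
    let baseline = 4096; // bytes of .text in the previous successful build
    assert!(!size_regressed(baseline, 4100, 16)); // small growth: within tolerance
    assert!(size_regressed(baseline, 5000, 16)); // clear regression: fail the PR
    println!("size gate behaves as expected");
}
```

A small tolerance avoids failing PRs on byte-level noise from codegen while still catching real regressions.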
FYI, we wrote a little Rust tool for use in Firefox CI for tracking binary size more precisely by section size, you might find it useful: https://github.com/luser/rust-size .
@luser I find cargo-bloat a bit more useful for that because it points out in which function the change happened.
@therealprof I think these tools have different use cases. cargo-bloat's output is for human consumption, but this tool is for automated scripts (I don't think scripts care about which exact function changed).
@pftbest If you want to flag regressions, it's typically very useful to point out where they happened. I'm using cargo-bloat manually to track regressions in the code generation / libcore over different rustc versions for my MCU crates, e.g. https://github.com/therealprof/microbit/blob/master/tools/capture_example_bloat.sh
Now, with a little bit of hacking and by storing the previous successful build result, this could be automated, and it could even produce the assembly output of the previous vs. regressed build for manual inspection... That's what I would expect to see, but of course YMMV.
Answering https://github.com/rust-lang-nursery/embedded-wg/issues/129#issuecomment-407087928 here: cc @jamesmunns
We should definitely have a link test of a cortex-m-rt program. The linker script used by cortex-m-rt has assertions that check the validity of the memory layout of the program. This reduces the need for inspecting the produced binary.
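For readers unfamiliar with the mechanism: GNU ld linker scripts support `ASSERT` expressions, which is the kind of check meant here. An illustrative fragment in that style (not the actual contents of cortex-m-rt's link.x):

```ld
/* Illustrative linker-script assertions, in the style of cortex-m-rt's
   link.x (not its actual contents). Linking fails with the given message
   if the condition does not hold. */
ASSERT(ORIGIN(FLASH) % 4 == 0, "the FLASH region must be 4-byte aligned");
ASSERT(SIZEOF(.vector_table) >= 0x40, "the vector table is too small");
```

Because these checks run at link time, a plain `cargo build` of a test program is enough to exercise them; no binary inspection or hardware is needed.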
We should also test a few variations of the cortex-m-rt program. For example, linking the program to libm.a should not produce "duplicate symbol" errors. We should test that both linking to a #[panic_implementation] provider crate and defining the #[panic_implementation] in the top / leaf crate itself work. And... I can't think of any other variation at the moment :-).
Lately, I've been exploring running Cortex-M programs in QEMU using two different approaches -- I have shared my findings in https://github.com/rust-lang-nursery/embedded-wg/issues/47#issuecomment-408550563. The IRR folks may be interested in continuing to explore QEMU for testing Cortex-M programs.
CC @nerdyvaishali and @sekineh. The comment from @japaric, as well as the discussion from the last month or so may be interesting for you.
These would correlate with our tracking issue https://github.com/jamesmunns/irr-embedded-2018/issues/3.
@jamesmunns we haven't had time to check on this during the meetings. Any news on this front?
Also, @pftbest mentioned that one can use semihosting from within QEMU to interact with the host (use stdout, open / read / write files) in https://github.com/rust-embedded/wg/issues/47#issuecomment-410375474. Haven't tried myself but sounds like it could be used to write some tests.
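As a sketch of what such a semihosting-based test could look like (untested here; assumes the cortex-m-rt, cortex-m-semihosting, and panic-halt crates, and follows the pattern used elsewhere in the ecosystem for running Cortex-M binaries under QEMU):

```rust
// Sketch: a Cortex-M test binary that reports back to the QEMU host via
// semihosting. It could be run with something like:
//   qemu-system-arm -cpu cortex-m3 -machine lm3s6965evb -nographic \
//     -semihosting-config enable=on,target=native -kernel test.elf
#![no_std]
#![no_main]

use cortex_m_rt::entry;
use cortex_m_semihosting::{debug, hprintln};
use panic_halt as _;

#[entry]
fn main() -> ! {
    // This appears on the QEMU host's stdout, so CI can grep for it
    hprintln!("smoke test running").unwrap();

    // Report success to the host and terminate QEMU so CI gets an exit status
    debug::exit(debug::EXIT_SUCCESS);
    loop {}
}
```

The useful property for CI is that `debug::exit` terminates the emulator with a status the host can observe, so a plain shell pipeline can tell pass from fail without any hardware attached.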
@japaric at the moment @sekineh has https://github.com/rust-lang/rust/pull/53190 open, which adds compilation of the cortex-m crate on the four major thumb targets. We are hoping to get that merged this week, and we will begin looking at the linking suggestions you made in https://github.com/rust-embedded/wg/issues/52#issuecomment-408550942.
The last piece landed in rust-lang/rust#53996
Triage
2018-07-02
Last year, compilation of the core crate broke for MSP430 / thumbv6m-none-eabi twice due to changes to the libcore source code. That problem could have been avoided if the core crate were compiled for thumbv6m-none-eabi / msp430 as part of the rust-lang/rust test suite. Last year there was also quite a bit of breakage in the embedded ecosystem due to the compiler work on incremental compilation and parallel codegen.
We should try to get the rust-lang/rust test suite to include tests for embedded targets. At the very least, core should be compiled for embedded targets.
TODO