Consider simulation testing for math on 16-bit targets

workingjubilee commented 1 year ago

We have a few tier 3 targets that are 16-bit, AVR and MSP430. These have numerous issues but a number of them have been resolved recently. It's almost like you can really compile for them! Unfortunately these targets have almost no regression testing, as they are bare-metal, cross-compile-only targets. If we are serious about supporting them, it would be prudent to run tests. Fortunately, AVR has a sophisticated simulation testing ecosystem which several users report as reliable-enough for "the program builds and functions, and maybe even as-intended". However, as these are tier 3 targets, it would be somewhat of a policy change to add testing for much more than "you can technically build code for them ever".

On the other hand, the simulation testing in this case would probably be for bare minimums like "arithmetic and bitwise operators work". Embedded users are often willing to work around the inadequacy of std and the language, but it's hard to expect them to build around us if they don't have access to even basic computation primitives.

It may be surprising these targets need regression testing for such minimums, but nonetheless it's hard to expect them to be elevated past even tier 3 if we don't allow regression testing for those basics. On the other hand, if they break, very few compiler programmers will know how to fix them, and the policy of not running testing for tier 3 targets is there for a reason. So, this may need an MCP or something like that? I don't know. I'm just raising it as an issue here to, for now, track issues that either currently or would enjoy this. Or maybe this will just wait until we elevate support for these targets past tier 3.

https://github.com/rust-lang/rust/issues/77131
https://github.com/rust-lang/rust/issues/82242
https://github.com/rust-lang/rust/issues/82380
An apparent regression introduced by virtually no regression testing for AVR:
- https://github.com/rust-lang/rust/issues/102278
- https://github.com/rust-lang/rust/issues/76930

the8472 commented 1 year ago

A while ago there was some disussion about adding hello world or libs-only tests for tier2 targets where qemu is available. It wouldn't provide full test coverage but at least serve as a smoke test. #101703 was a first step in that direction.

Maybe for tier-3 they'd have to be allow-fail tests.

Rahix commented 1 year ago

Just a thought about this: Even just knowing that regressions have happened would be nice. I am in favor of not blocking anything on these targets but I think giving people a heads up that something breaks AVR and friends would be a great start.

Mark-Simulacrum commented 1 year ago

nonetheless it's hard to expect them to be elevated past even tier 3 if we don't allow regression testing for those basics

I am a little confused by this statement. Part of the tier policy is explicitly that raising to a higher tier is what "allows" putting additional burden on CI and maintainers, justified by greater commitment from the target maintainers and stricter requirements on the target itself.

In general, the policy we have today explicitly indicates that target maintainers (which don't exist for every target) are encouraged to run build and tests regularly. Basically, anything our CI doesn't cover should be covered externally, and fixes submitted as needed.

I personally am willing to interpret the tier 2 guarantees around testing in a way that includes tests which can be run without architecture-specific tooling. For example, the asm! macro has a number of tests for tier 2 and tier 3 targets that I believe are even no_core, explicitly to allow easily running those on our tier 1 platforms.

workingjubilee commented 1 year ago

I guess part of the reason I opened this issue is because I am

pretty sure the will is there to make such greater commitments in the embedded community, though it might take a little extra time to shake out who is drawing the short straw
pretty sure most targets don't regress on 2 << 2 == 8 or such, even w/o testing
pretty doubtful external tests are enough here, as most programmers are strongly biased towards assuming 64-bit (and with those aware other bitnesses exist, thinking mostly about 32-bit), so I don't know if running a nightly CI externally would be sufficient to keep up with problems introduced to 16-bit
pretty sure there is diagnostic utility in at least not having to do bisection for every single test fail externally

In other words I am questioning things a bit in the vein of "do the existing policies make sense for all targets?" As you noted, we have what might be considered exceptions to the rule. But they're exceptions that might make sense, and if they do, it is likely because they are testing fairly fundamental compiler functionality in these cases.

cr1901 commented 1 year ago

FWIW, mspdebug has an MSP430 simulator. I have batted around using that for testing my own code. But I would have no idea what integration into cargo test or compiletest would look like.

Patryk27 commented 1 year ago

Self-advertisement: I've written a tool that does that! -- https://github.com/Patryk27/avr-tester/.

It's a wrapper over simavr (which, well, simulates AVRs) and it can be easily used to test whether various stuff works as intended, in fully black-box tests - in fact, one of the first tests I've written there is an expression evaluator which allowed me to make sure that at least the math stuff works as intended when I was sending patches to compiler-builtins and whatnot.

The AVR program is: https://github.com/Patryk27/avr-tester/blob/3ea7049c813a4c12a0278f26112bb199fe07f69b/avr-tester-tests/xx-eval/src/main.rs

... and the Rust test: https://github.com/Patryk27/avr-tester/blob/3ea7049c813a4c12a0278f26112bb199fe07f69b/avr-tester/tests/tests/xx/eval.rs

... and what basically happens there is that the Rust test generates random expression trees (like (123 * 7) - 32), sends them to a simulated AVR, the simulated firmware then computes the result and sends back to the "host", which then compares its own evaluation result with the one retrieved from AVR.

Maybe it would come handy here?

cr1901 commented 1 year ago

Your avr-tester project is cool, but I was looking to try to use the same test function bodies on a std and no_std platform, and have "some bridge program that knows what to do when cargo test is called for a target without libstd." I.e. pass --target to cargo test".

For msp430, "knows what to do" would be "wrap the test function into a main ELF binary", and send that freshly-compiled ELF binary off to mspdebug's simulator`."

avr-tester seems to be more "simulate peripherals in the context of cargo test running on a target with std (i.e. don't pass --target to cargo test").

cr1901 commented 1 year ago

Apropos of a crate I've been writing, I suspect I found an arithmetic bug on msp430. When called, the following function shift_test panics in both release and debug mode for target=msp430-none-elf, but passes for my host x86_64-pc-windows-gnu (and presumably others with a multiple-bit shift insn):

fn do_shift(val: &u64, amt: &u8) -> u64 {
    *val << *amt
}

fn shift_test() {
    let my_u64: u64 = 1;
    let my_shift: u8 = 0;

    if black_box(do_shift(black_box(&my_u64), black_box(&my_shift))) != 1 {
        panic!()
    }
}

I've been writing a tool to make msp430 debugging easier, and have been using mspdebugs simulator effectively to create a test case:

PS C:\msys64\home\William\Projects\MCVE\msp430-shl> cargo run --release --target=msp430-none-elf -Zbuild-std=core
    Finished release [optimized + debuginfo] target(s) in 0.15s
     Running `msprun sim gdb target\msp430-none-elf\release\msp430-shl`
C:\msys64\opt\toolchains\bin\msp430-elf-gdb.exe: warning: Couldn't determine a path for the index cache directory.
Reading symbols from target\msp430-none-elf\release\msp430-shl...
Remote debugging using localhost:2000
0x0000ffff in __RESET_VECTOR ()
Erasing...
Loading section .text, size 0x20c lma 0xc000
Loading section .rodata, size 0x34 lma 0xc20c
Loading section .vector_table, size 0x20 lma 0xffe0
Start address 0x0000c000, load size 608
Transfer rate: 21 KB/sec, 202 bytes/write.
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
rust_begin_unwind (_info=0x3ce) at src/lib.rs:21
21              msp430::asm::barrier();
(gdb) bt
#0  rust_begin_unwind (_info=0x3ce) at src/lib.rs:21
#1  0x0000c116 in core::panicking::panic_fmt (fmt=...) at src/panicking.rs:64
#2  0x0000c146 in core::panicking::panic (expr=...) at src/panicking.rs:114
#3  0x0000c0d6 in msp430_shl::shift_test () at src/main.rs:44
#4  msp430_shl::k707o76630f9enn0::k707o76630f9enn0 () at src/main.rs:28
#5  main () at src/main.rs:19

Perhaps a driver program/library could be useful for integrating into the #[test] framework. Is there any precedent for running cargo test for targets using a simulator?

On the other hand, the simulation testing in this case would probably be for bare minimums like "arithmetic and bitwise operators work".

Erm, well... they mostly work on MSP430 :P. I lost many hours to this bug. It caused a deserialization routine to fail when the result didn't fit into a u16 thanks to a bit in the high 6 bytes being set incorrectly by the "shift-by-0"...

rust-lang / rust

Consider simulation testing for math on 16-bit targets #107612