mmurdoch / arduinounit

ArduinoUnit is a unit testing framework for Arduino libraries
MIT License
394 stars 51 forks source link

Flash string merging gets messed up #90

Closed matthijskooijman closed 1 year ago

matthijskooijman commented 3 years ago

TL;DR: While writing this issue, I found the cause of the issue, see the end of this post. I'll submit a PR next.

I just came back to a project using arduinounit, uploading a test suite I made earlier and then saw that the serial output was confusing: All messages about assertions said "passed" even though they seem failed.

The default basic.ino example gives me this output (on a custom ATmega2560 board, with both the 1.8.2 and 1.8.3 AVR core and arduinounit master):

Test correct passed
Assertion passed: (x=1) != (1=1), file /home/matthijs/docs/MKIT/3Devo/AtTinyBootloader/BootloaderTest/Test/Test.ino, line 12.
Test incorrect failed
Test summary: 1 passed, 1 failed, 0 skipped, out of 2 test(s).

Adding some debug output shows that the test state variable is actually correct, and the right string is being printed, so I suspect that the string merging is somehow messing up, as done in https://github.com/mmurdoch/arduinounit/blob/b687ea4e797654c8270c751c9fb22352028a6a15/src/ArduinoUnitUtility/Flash.h#L11

This is confirmed by redefining ARDUINO_UNIT_PSTR to just use avr-libc standard (non-merging) PSTR or removing the ARDUINO_UNIT_STRING macro from the print, i.e.:

-      Test::out->print(ok ? ARDUINO_UNIT_STRING("passed") : ARDUINO_UNIT_STRING("failed"));
+      Test::out->print(ok ? "passed" : "failed");

With either of those changes the output becomes correct again:

Test correct passed
Assertion failed: (x=1) != (1=1), file /home/matthijs/docs/MKIT/3Devo/AtTinyBootloader/BootloaderTest/Test/Test.ino, line 12.
Test incorrect failed
Test summary: 1 passed, 1 failed, 0 skipped, out of 2 test(s).

Looking at the preprocessed output (i.e. adding -E to the gcc commandline), I see this is being generated:

      Test::out->print(ok
                       ? (reinterpret_cast<const __FlashStringHelper *>((__extension__({ 
# 518 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h" 3
                        const char * 
# 518 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h"
                        ptr__COUNTER__; asm volatile ( ".pushsection .progmem.mergeable-strings, \"SM\", @progbits, 1" "\n\t" "0: .string " "\"passed\"" "\n\t" ".popsection" "\n\t" ); asm volatile ( "ldi %A0, lo8(0b)" "\n\t" "ldi %B0, hi8(0b)" "\n\t" : "=d" (ptr__COUNTER__) ); ptr__COUNTER__ ; })))) :
                       (reinterpret_cast<const __FlashStringHelper *>((__extension__({ 
# 519 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h" 3
                      const char * 
# 519 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h"
                      ptr__COUNTER__; asm volatile ( ".pushsection .progmem.mergeable-strings, \"SM\", @progbits, 1" "\n\t" "0: .string " "\"failed\"" "\n\t" ".popsection" "\n\t" ); asm volatile ( "ldi %A0, lo8(0b)" "\n\t" "ldi %B0, hi8(0b)" "\n\t" : "=d" (ptr__COUNTER__) ); ptr__COUNTER__ ; })))));
      Test::out->print((reinterpret_cast<const __FlashStringHelper *>((__extension__({ 
# 520 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h" 3
                      const char * 
# 520 "/home/matthijs/docs/Electronics/Arduino/Sketches/libraries/arduinounit/src/ArduinoUnit.h"
                      ptr__COUNTER__; asm volatile ( ".pushsection .progmem.mergeable-strings, \"SM\", @progbits, 1" "\n\t" "0: .string " "\": \"" "\n\t" ".popsection" "\n\t" ); asm volatile ( "ldi %A0, lo8(0b)" "\n\t" "ldi %B0, hi8(0b)" "\n\t" : "=d" (ptr__COUNTER__) ); ptr__COUNTER__ ; })))));

So it actually seems that __COUNTER__ is not being replaced. This also explains why not all strings are messed up, just the ones within the same function since ptr__COUNTER__ probably ends up as a local variable in functions.

I suspect that some additional machinery is needed to make sure that __COUNTER__ is actually expanded. Normally, I think, macros within macros are not expanded, except when they contain a macro argument (at least when concatenation is involved), this is why concatenation macros are always use multiple levels). It seems that __COUNTER__ needs the same treatment: https://stackoverflow.com/a/27611869/740048

I do wonder why this worked before (I also saw the problem with an older version of arduinounit which I'm pretty sure worked properly before, but maybe I rebased without checking...). Maybe older versions used different scopes for each __extension__ block or so?

However, I tried fixing this with additional macros and then got more errors because each invocation of __COUNTER__ actually gives a separate number, so the definition and references to the variable will be different and thus break.

It seems that the __COUNTER__ approach was introduced in 60840b85adbedb40bce08b3fce836c8d2186c6cf to pretty much solve exactly this issue. However, for the test status (rather than assertion status) is actually fixed by no longer using ARDUINO_UNIT_STRING making me believe that maybe the __COUNTER__ approach did not work and another approach was taken to fix the test status.

So, I added some additiona macros to expand __COUNTER__ only once, so all parts refer the same ptr variable, and the problem still exists. So it turns out that the ptr variable was actualy local to the extension block and was resolved properly, but somehow the string reference that's loaded into the variable is not.

I see that the string address uses a fixed label 0: to be loaded from. This looks like it should work, since numerical labels do not need to be unique and the 0b reference means "get the nearest label 0, going backwards", so that should get the right value. But somehow it ends up using the wrong label, but I'm not sure.

However, since I just got the __COUNTER__ stuff working, we can just use that to make the labels unique. I tried that and it seems to work, so maybe that's a good fix? I'll submit a PR in a minute.

matthijskooijman commented 3 years ago

Looking at this a bit more, I suspect that the problem might be caused by gcc reordering asm blocks, which it is apparantly allowed to do: https://gcc.gnu.org/legacy-ml/gcc-help/2017-10/msg00061.html

That thread also has some suggestions to prevent reordering by adding memory barriers (which I'm afraid would kill performance by forcing reloads of all memory across them), or by adding dependencies (which I think would keep the asm blocks in order, but not prevent inserting another one in between, which is probably the problem here), so just using unique labels seems to be a better approach.

I created a PR with this fix here: #91.

wmacevoy commented 1 year ago

Ok I think the accepted PR fixed this, yes?

matthijskooijman commented 1 year ago

It has been a while, but yeah, I think so, I just forget to reference this issue in the commit. Closing.