Closed x37v closed 7 years ago
I have seen this before though haven't nailed down the exact cause.
From what I have seen:
thumbv6m-none-eabi
targetOut of curiosity, what happens when you compile without --release
but with LTO enabled (e.g. xargo rustc -- -C lto
)? That should produce a single object file so linking the object file should be straightforward. I expect that LLVM might error in that scenario.
cc @therealprof, who may know more about this problem
Hm, I thought I opened a bug report for this before or at least mentioned it somewhere...
The problem is not debug related, I can easily reproduce it with --release
builds, too. It happens when binary code grows so large that the 11 bit offsets available to the branch instruction are not sufficient to reach the jump target anymore.
I've no idea how (and where) this could be addressed but monomorphisation, heavy inlining, and LTO (as well as lack of optimisation in debug builds) are the source of the issue here because they all lead to few but huge functions.
Looking into the instruction set it seems that Cortex-M should happily support the larger version of the branch as well. Maybe it would suffice to tell that to the linker somehow...
@therealprof a bit disheartening that you get this problem with --release
as well as I hope to use this for something that I'll share with others.. though, glad to see that there is hope!
@japaric xargo rustc -- -C lto
gave the same error:
error: linking with arm-none-eabi-ld
failed: exit code: 1
|
= note: "arm-none-eabi-ld" "-L" "/home/alex/.xargo/lib/rustlib/thumbv6m-none-eabi/lib" "/home/alex/projects/modular/threshpan/target/thumbv6m-none-eabi/debug/deps/stm32f0308_disco_rust-60ecd4ad81e058b7.0.o" "-o" "/home/alex/projects/modular/threshpan/target/thumbv6m-none-eabi/debug/deps/stm32f0308_disco_rust-60ecd4ad81e058b7" "--gc-sections" "-L" "/home/alex/projects/modular/threshpan/target/thumbv6m-none-eabi/debug/deps" "-L" "/home/alex/projects/modular/threshpan/target/debug/deps" "-L" "/home/alex/projects/modular/threshpan/target/thumbv6m-none-eabi/debug/build/cortex-m-rt-a3623460a99781ee/out" "-L" "/home/alex/.xargo/lib/rustlib/thumbv6m-none-eabi/lib" "-Bstatic" "/home/alex/.xargo/lib/rustlib/thumbv6m-none-eabi/lib/libcompiler_builtins-ad42e860445b13d0.rlib" "-Tlink.x" "-Bdynamic"
= note: /home/alex/projects/modular/threshpan/target/thumbv6m-none-eabi/debug/deps/stm32f0308_disco_rust-60ecd4ad81e058b7.0.o: In function WWDG': stm32f0308_disco_rust.cgu-0.rs:(.text+0x0): relocation truncated to fit: R_ARM_THM_JUMP11 against
DEFAULT_HANDLER'
BTW, if you want me to move this over to cortex-m-rt I can do that.. I suspected that I might be posting to the wrong specific location in the correct ecosystem..
@japaric I figured out what the problem is... svd2rust
Seemingly the default CPU model for the armv6-m architecture is broken. I tried various options including using the -mcpu=cortex-m0 and -mcpu=cortex-m3 options on the generated assembly and the latter automatically changes branches where the target doesn't fit into the available 2 bytes into the 4 bytes form of the branch while the default model and -mcpu=cortex-m0 (which actually might be the default model) keeps it as-is causing the linker to barf.
However, if I explicitly change the short branch in the code emitted by svd2rust
into the long form, it'll happily compile and link the binaries, cf.:
diff --git a/src/svd.rs b/src/svd.rs
index 149a3ed..8bfc3f2 100644
--- a/src/svd.rs
+++ b/src/svd.rs
@@ -11,7 +11,7 @@ pub mod interrupt {
"
.thumb_func
DH_TRAMPOLINE:
- b DEFAULT_HANDLER
+ bl DEFAULT_HANDLER
"
);
#[cfg(feature = "rt")]
@x37v Can you with the above change on your code?
@therealprof yes, that did solve my problem! THANKS SO MUCH!
I was actually also able to build without --release
once i set the optimization level to 1.. tried that in the mean time, but this works unoptimized!
This can't be right, using branch with link will clobber the link register, so the interrupt handlers will fail to return. To make it work with bl
, we need to add a proper function.
Also, the documentation here says that b.w
instruction can be used in thumb 2 mode to jump to ±16MB. But LLVM says it's only available on thumbv7 for some reason.
@pftbest Whoops, you're right. I picked the wrong mnemonic, b.w is what I wanted to say. And you're also right that it can't be used due to the compiler/assembler complaining. :(
@pftbest Okay, I checked around and there's nothing that would the assembly accept that function, BUT: why do we do that manual jump in the first place? DH_TRAMPOLINE doesn't do anything useful so we might as well just get rid of that...
diff --git a/src/svd.rs b/src/svd.rs
index 149a3ed..df566cf 100644
--- a/src/svd.rs
+++ b/src/svd.rs
@@ -8,15 +8,7 @@ pub mod interrupt {
use bare_metal::Nr;
#[cfg(feature = "rt")]
global_asm!(
- "
- .thumb_func
- DH_TRAMPOLINE:
- b DEFAULT_HANDLER
- "
- );
- #[cfg(feature = "rt")]
- global_asm!(
- "\n.weak WWDG\nWWDG = DH_TRAMPOLINE\n.weak PVD\nPVD = DH_TRAMPOLINE\n.weak RTC\nRTC = DH_TRAMPOLINE\n.weak FLASH\nFLASH = DH_TRAMPOLINE\n.weak RCC_CRS\nRCC_CRS = DH_TRAMPOLINE\n.weak EXTI0_1\nEXTI0_1 = DH_TRAMPOLINE\n.weak EXTI2_3\nEXTI2_3 = DH_TRAMPOLINE\n.weak EXTI4_15\nEXTI4_15 = DH_TRAMPOLINE\n.weak TSC\nTSC = DH_TRAMPOLINE\n.weak DMA_CH1\nDMA_CH1 = DH_TRAMPOLINE\n.weak DMA_CH2_3\nDMA_CH2_3 = DH_TRAMPOLINE\n.weak DMA_CH4_5_6_7\nDMA_CH4_5_6_7 = DH_TRAMPOLINE\n.weak ADC_COMP\nADC_COMP = DH_TRAMPOLINE\n.weak TIM1_BRK_UP_TRG_COM\nTIM1_BRK_UP_TRG_COM = DH_TRAMPOLINE\n.weak TIM1_CC\nTIM1_CC = DH_TRAMPOLINE\n.weak TIM2\nTIM2 = DH_TRAMPOLINE\n.weak TIM3\nTIM3 = DH_TRAMPOLINE\n.weak TIM14\nTIM14 = DH_TRAMPOLINE\n.weak TIM16\nTIM16 = DH_TRAMPOLINE\n.weak TIM17\nTIM17 = DH_TRAMPOLINE\n.weak I2C1\nI2C1 = DH_TRAMPOLINE\n.weak SPI1\nSPI1 = DH_TRAMPOLINE\n.weak SPI2\nSPI2 = DH_TRAMPOLINE\n.weak USART1\nUSART1 = DH_TRAMPOLINE\n.weak USART2\nUSART2 = DH_TRAMPOLINE\n.weak CEC_CAN\nCEC_CAN = DH_TRAMPOLINE\n.weak USB\nUSB = DH_TRAMPOLINE"
+ "\n.weak WWDG\nWWDG = DEFAULT_HANDLER\n.weak PVD\nPVD = DEFAULT_HANDLER\n.weak RTC\nRTC = DEFAULT_HANDLER\n.weak FLASH\nFLASH = DEFAULT_HANDLER\n.weak RCC_CRS\nRCC_CRS = DEFAULT_HANDLER\n.weak EXTI0_1\nEXTI0_1 = DEFAULT_HANDLER\n.weak EXTI2_3\nEXTI2_3 = DEFAULT_HANDLER\n.weak EXTI4_15\nEXTI4_15 = DEFAULT_HANDLER\n.weak TSC\nTSC = DEFAULT_HANDLER\n.weak DMA_CH1\nDMA_CH1 = DEFAULT_HANDLER\n.weak DMA_CH2_3\nDMA_CH2_3 = DEFAULT_HANDLER\n.weak DMA_CH4_5_6_7\nDMA_CH4_5_6_7 = DEFAULT_HANDLER\n.weak ADC_COMP\nADC_COMP = DEFAULT_HANDLER\n.weak TIM1_BRK_UP_TRG_COM\nTIM1_BRK_UP_TRG_COM = DEFAULT_HANDLER\n.weak TIM1_CC\nTIM1_CC = DEFAULT_HANDLER\n.weak TIM2\nTIM2 = DEFAULT_HANDLER\n.weak TIM3\nTIM3 = DEFAULT_HANDLER\n.weak TIM14\nTIM14 = DEFAULT_HANDLER\n.weak TIM16\nTIM16 = DEFAULT_HANDLER\n.weak TIM17\nTIM17 = DEFAULT_HANDLER\n.weak I2C1\nI2C1 = DEFAULT_HANDLER\n.weak SPI1\nSPI1 = DEFAULT_HANDLER\n.weak SPI2\nSPI2 = DEFAULT_HANDLER\n.weak USART1\nUSART1 = DEFAULT_HANDLER\n.weak USART2\nUSART2 = DEFAULT_HANDLER\n.weak CEC_CAN\nCEC_CAN = DEFAULT_HANDLER\n.weak USB\nUSB = DEFAULT_HANDLER"
);
#[cfg(feature = "rt")]
extern "C" {
That removes one unnecessary indirection from code that actually does compile and fixes this particular problem since we're jumping to the correct function right from our exception/interrupt table where we don't have any address limitations...
It'll likely not fix the problem I had before wrt. functions becoming too big to be jumped to (which I've addressed in the code) and should be addressed by the compiler, but this seems like a win-win to.
NB: I have no hardware here so I can't very it but it sure looks good to me.
No, we can't remove this trampoline, because it will silently break non-lto builds. Weak references can only point to symbols defined in the same object file, but default handler is defined in another crate, so it will end up in different object file. This bug was reported here: https://github.com/japaric/cortex-m-rtfm/issues/39
I think the only working solution here is to make DH_TRAMPOLINE
a proper rust function. This will make executable slightly bigger, unfortunately, but it shouldn't affect the performance, because default_handler is only used for error handling.
It may affect a stack trace when debugging, not sure if it counts as a breaking change. I don't have a board atm so I can't test it.
Hm, non-lto builds... Those still exist? ;)
I'll have to look a bit closer at this in a non-lto context. I'm still not exactly sure why the trampoline needs to exist at all, my preference would be to fix the visibility of the symbols. As I said before this will most likely not fix the compiler issue at hand (refusing wo accept the b.w for armv6m) so it's very likely that we will run into the same problem sooner or later again... Not sure how to properly report this though.
interesting, @pftbest, bl DEFAULT_HANDLER
does build for me, debugging is now more full featured with dev builds and I am able to get ADC interrupts at least.. maybe I'm confused about where the discussion has gone.
@therealprof and @pftbest I could try to get an stm32f0 based discovery board to you if you want some hardware to test on.. Though I can also run tests on my hardware if you'd like.. about to be gone for a long weekend later tonight though.
@pftbest is right that BL
clobbers the link register so technically we can not return to wherever the link register was legitimately set. However I'm not sure that this is relevant because we're talking about the default handler here which usually just halts execution by firing off an breakpoint instruction.
@x37v No need, I have a ton of STM32 stuff here and I especially like the F0 series which is probably why @japaric notified me in the first place. ;)
@x37v, the issue here is that processor relies on EXC_RETURN
value being present in LR
register to return from the interrupt handler, but bl
instruction will erase it, so it will never return.
This does not break the provided default_handler, since it goes into infinite loop and never returns, but the user may override it using default_handler!
macros, and try to return from it.
@pftbest I'm still a bit confused... my understanding is that interrupt handlers get executed after an interrupt arrives and execution jumps out of your main loop [in the rtfm case a loop waiting for interrupts] execute some code and then jump back. Are you saying that the default handler, before being overridden, normally goes into an infinite loop and never returns to the main loop?.. or is this simply an effect of the bl
instruction?
@x37v The default handler is only used if the system fires an exception or an interrupt and you haven't provided your own exception or interrupt handler. You can override the default_handler, too if you want to do anything specific in this case however the default implementation is more or less the only sane implementation one can have in this situation: Set a breakpoint and do nothing more.
@therealprof AHH, that makes sense. So, beyond the potential override, is it problematic as is, with no way to return?
@x37v At that point the MCU is pretty much in a dead end, so other than saying goodbye I don't think there's much you can do to re-enter the program in orderly fashion other than a reset... Even if you have the link register; who say's it points to a place where you can actually reenter?
@therealprof, why is MCU in a dead end?
Nothing serious would happen if we just return from some unhandled GPIO interrupt. There is a way to get the interrupt number that is currently being serviced, so a reasonable implementation may check that we are not in hard fault or some other bad state, and otherwise just log a spurrious interrupt and return.
@pftbest Why would you enable an interrupt you're not willing to handle? And if your willing to handle it, why not have a specific handler for that? Using the default handler has a number of drawbacks; sure with enough effort you might be able to figure out why ended in there but all the exceptions you're not willing to deal with also end up in there, i.e. the really bad stuff from which a useful recovery is typically not possible.
There's a reason that in 99.99% of all cases the default handler is used to
or any combination thereof.
@pftbest You're right. The easiest way to make that work seems to be a proper Rust function; I tried all kinds of tricks with assembly but the simplest solution is the obvious one:
extern "C" {
fn DEFAULT_HANDLER();
}
#[allow(non_snake_case)]
#[naked]
#[no_mangle]
pub unsafe fn DH_TRAMPOLINE() {
DEFAULT_HANDLER();
}
The binary code grows by 4 bytes.
It also adds the additional benefit of properly naming the function, but here's the kicker; it also uses the bl
instruction:
│ -08000480 <ADC_COMP>:
│ +08000480 <DH_TRAMPOLINE>:
│ - 8000480: e059 b.n 8000536 <BUS_FAULT>
│ + 8000480: f000 f85b bl 800053a <BUS_FAULT>
│ + 8000484: 4770 bx lr
🤔
Hi
A comment +/- related.
A sensible thing to do in the default handler is to unwind the stack, that makes it easier to trace the cause of the error. (I made some tests earlier and it works…) The question is just what to do with the “trace info”. I sent it over the ITM, but as there is no flow control for the ITM, the debugger buffer (on the nucleo/st-link) may overflow, so you may loose characters. One can also inspect the stack directly from gdb, perhaps there is some scripting possible to facilitate this (Japaric might know…)
Best, Per
On 05 Sep 2017, at 00:58, therealprof notifications@github.com<mailto:notifications@github.com> wrote:
@pftbesthttps://github.com/pftbest You're right. The easiest way to make that work seems to be a proper Rust function; I tried all kinds of tricks with assembly but the simplest solution is the obvious one:
extern "C" {
fn DEFAULT_HANDLER();
}
#[allow(non_snake_case)]
#[naked]
#[no_mangle]
pub unsafe fn DH_TRAMPOLINE() {
DEFAULT_HANDLER();
}
The binary code grows by 4 bytes.
It also adds the additional benefit of properly naming the function, but here's the kicker; it also uses the bl instruction:
│ -08000480
🤔
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/japaric/cortex-m-rtfm/issues/42#issuecomment-327036730, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AD5naDUQ9VcMi_3hjpGJvYp8OhziHoYjks5sfICigaJpZM4PIR-X.
{"api_version":"1.0","publisher":{"api_key":"05dde50f1d1a384dd78767c55493e4bb","name":"GitHub"},"entity":{"external_key":"github/japaric/cortex-m-rtfm","title":"japaric/cortex-m-rtfm","subtitle":"GitHub repository","main_image_url":"https://cloud.githubusercontent.com/assets/143418/17495839/a5054eac-5d88-11e6-95fc-7290892c7bb5.png","avatar_image_url":"https://cloud.githubusercontent.com/assets/143418/15842166/7c72db34-2c0b-11e6-9aed-b52498112777.png","action":{"name":"Open in GitHub","url":"https://github.com/japaric/cortex-m-rtfm"}},"updates":{"snippets":[{"icon":"PERSON","message":"@therealprof in #42: @pftbest You're right. The easiest way to make that work seems to be a proper Rust function; I tried all kinds of tricks with assembly but the simplest solution is the obvious one:\r\n\r\n extern \"C\" {\r\n fn DEFAULT_HANDLER();\r\n }\r\n\r\n #[allow(non_snake_case)]\r\n #[naked]\r\n #[no_mangle]\r\n pub unsafe fn DH_TRAMPOLINE() {\r\n DEFAULT_HANDLER();\r\n }\r\n
\r\n\r\nThe binary code grows by 4 bytes.\r\n\r\nIt also adds the additional benefit of properly naming the function, but here's the kicker; it also uses the bl
instruction:\r\n\r\n│ -08000480 \u003cADC_COMP\u003e:\r\n│ +08000480 \u003cDH_TRAMPOLINE\u003e:\r\n│ - 8000480: e059 b.n 8000536 \u003cBUS_FAULT\u003e\r\n│ + 8000480: f000 f85b bl 800053a \u003cBUS_FAULT\u003e\r\n│ + 8000484: 4770 bx lr\r\n
\r\n\r\n🤔"}],"action":{"name":"View Issue","url":"https://github.com/japaric/cortex-m-rtfm/issues/42#issuecomment-327036730"}}}
@therealprof I think you forgot to remove the #[naked]
attribute, that's why the generated code is incorrect. Naked functions can only have inline assembly inside, not the actual code.
Maybe we can do better, by having 2 functions behind a #[cfg(target
, one for thumbv7+ that does b.w
and one for thumbv6 that does a normal function call.
@pftbest Hm, right again... this is becoming uncanny. ;)
However now we have the same function twice with two different labels:
0800031c <DH_TRAMPOLINE>:
800031c: f3ef 8008 mrs r0, MSP
8000320: e7ff b.n 8000322 <_ZN11cortex_m_rt15default_handler17hc8869ed4a288b164E>
08000322 <_ZN11cortex_m_rt15default_handler17hc8869ed4a288b164E>:
8000322: be00 bkpt 0x0000
8000324: e7fe b.n 8000324 <_ZN11cortex_m_rt15default_handler17hc8869ed4a288b164E+0x2>
08000326 <BUS_FAULT>:
8000326: f3ef 8008 mrs r0, MSP
800032a: e7fa b.n 8000322 <_ZN11cortex_m_rt15default_handler17hc8869ed4a288b164E>
Duh, well.
Maybe we can do better, by having 2 functions behind a #[cfg(target, one for thumbv7+ that does b.w and one for thumbv6 that does a normal function call.
There's no reason for that. Both actually support the very same b.w
instruction. It's just the compiler being wrong here and claiming that it wouldn't.
I believe LLVM is correct in this case, quote from the docs:
ARMv6-M supports the Thumb instruction set, including a small number of 32-bit instructions introduced to the architecture as part of the Thumb-2 technology in ARMv6T2. ARMv6-M supports the 16-bit Thumb instructions from ARMv7-M, in addition to the 32-bit BL, DMB, DSB, ISB, MRS and MSR instructions.
b.w
is a 32bit instruction and it's not on the list, so looks like it's not supported. (also it's not mentioned here)
However now we have the same function twice with two different labels:
Yes, DEFAULT_HANDLER gets inlined, that is unfortunate, but it may be fixed by this patches when they will be merged.
b.w is a 32bit instruction and it's not on the list, so looks like it's not supported. (also it's not mentioned here)
Hm, I can't find the reference at the moment but some site said that b.w
would be supported for Cortex-M0 as well but I guess you're right (again!).
Sorry for jumping in the discussion!
Actually the link @pftbest posted earlier clears this up perfecrly: See Table 12.
B _label_
's range is+- 16MB in case of 32-bit Thumb2, with the optional .B
, or +-2KB with Thumb1 variant.
Cortex-M0(+) only have BL, DMB, DSB, ISB, MRS, MSR
from Thumb2, all other instructions have the Thumb1 variant, including B
.
Basically ~all 16bit T1 instruction has a corresponding T2 sibling, some with subtle differences like this.
@Samonitari Right, however this wouldn't be the first time that the official documentation turns out to be incorrect. 😉
Really the only difference it makes is whether to report a bug to LLVM or not.
@Samonitari Yes, thumb1 has b
instrunction, but we need b.w
which is 32bit T2 instruction. There is no way to encode such instruction on Cortex-M0. So there is no bug in LLVM.
@pftbest's idea, namely:
Maybe we can do better, by having 2 functions behind a #[cfg(target, one for thumbv7+ that does b.w and one for thumbv6 that does a normal function call.
Sounds good to me. I'd be happy to merge a PR implementing that.
I'm already working on that :)
This turned out to be pretty complicated as the mentioned #[cfg] would have to be put into every crate generated by svd2rust, so they all need a build.rs setting some armv6m
cfg option since there's no other way to distinguish between v6 and v7.
I guess this is still fine, you just have to opt-in to get armv6 support...
Note that the issue isn't truly fixed until the stm32f030 crate is regenerated with an up-to-date svd2rust.
I figure this is worth a patch version update?
@jonas-schievink
the crate: https://gitlab.com/xnor/stm32f030/commit/2deff1fe6844da030c859c8f2b372a018f7d1ad4
example project: https://gitlab.com/xnor/stm32f0308-disco-rust/commit/cc0271624ad60bbf428f99a1f662da59394706b9
Builds and debugs in dev! :+1: Thanks All!
Just stumbled onto this so I might of missed something, but about about just always using
ldr r0, =DEFAULT_HANDLER
bx r0
then you have unlimited range.
@parched but you will loose the value in r0 register. and you may want to know the value for debugging purposes.
here is an example project: https://gitlab.com/xnor/stm32f0308-disco-rust
If I build it without --release I get