Closed sreibs closed 8 years ago
Good morning,
What MCU are you using exactly? It seems that the stack pointer is initialized to a faulty address. (PSR register)
The SRAM address starts at 0x20000000
Is the schematic available somewhere?
Sorry, PSR is not the stack pointer, but I'm still wondering whether there is a schematic available :)
Hi,
Thank you for the quick reply.
The MCU is a STM32F0C8T6.
The board is home-made. The connections are very simple: 2 SPI buses for an ADC and a radio transceiver, and 4 logic-level signals that have been disconnected for debugging. The reset is tied high. Then there is the programming interface (SBW) and... that's it.
Do you see a reason why the program works fine with one power supply and not with another? Can a hard fault be triggered by a voltage problem or other hardware causes?
And there is also a 32 kHz crystal.
The first thing that comes to mind is the difference between the power supply circuits. I assume that the USB power supply has a voltage regulator. What about the power circuit that the external power supply is connected to? Can the external power supply also deliver 500mA? Or 757mA if it's a 3v3 power supply? If you inject the 3v3 after the USB power supply, is the USB voltage regulator disconnected from Vcc? Some regulators don't like it when there is a voltage on the output pin while there is none on the input pin. If the power is injected after the USB power supply, does it have capacitors on the line?
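The 757mA figure above presumably comes from keeping the power constant: 500mA at 5V is 2.5W, which corresponds to roughly 757mA at 3v3. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and it assumes an ideal lossless conversion (with a linear regulator such as an LDO the output current equals the input current, so the 5V figure would not scale this way).

```python
def equivalent_current_ma(current_ma: float, v_in: float, v_out: float) -> float:
    """Current at v_out that carries the same power as current_ma at v_in.

    Assumes an ideal, lossless conversion between the two rails.
    """
    power_mw = current_ma * v_in  # mA * V = mW
    return power_mw / v_out


# USB budget of 500 mA at 5 V, expressed as a current on the 3.3 V rail.
usb_budget_ma = equivalent_current_ma(500, 5.0, 3.3)
print(f"{usb_budget_ma:.1f} mA at 3.3 V")
```

Compared with the roughly 50mA the board is reported to draw later in this thread, either supply should have plenty of margin if it meets that budget.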
Do you have a scope? You can check whether the power on the board is nice and clean without too much noise.
You could measure the current the board draws when connected to USB and when it's connected to the external PSU. If there is a big difference there is probably an error in the power circuit.
Do you use / initialize the USB? (if it has any capabilities for usb. I'm on the road =) )
Maybe a brown-out is triggered due to a dip in the voltage.
You could single-step through your code from 'main()' and see where the hard fault occurs. Maybe it happens when you try to initialize some external peripheral?
Yes, the main voltage goes through a 5V LDO and then a 3,3V LDO. The power signals are clean for both power supplies (I have a scope).
Current consumption is normal (about 50mA max) for both power supplies.
I tried with a "hello world" main (so no external peripheral is initialized or used) and the problem is still there (see first trace attached).
I first thought of a brown-out, but first, I can't see any on the scope (µs resolution), and second, I am not sure it can cause a HardFault on a Cortex-M0?
I observe that PC is almost always at "pc: 0x08000b1e". Could it be a clue?
What you can do is look at the map file, created in ${RIOT_BASE}/examples/hello-world/bin/${BOARD}/hello-world.map, and see what is placed there. When I compile for the board iotlab-m3, for example, I see that cpu_init is placed at that address. But for your board it will probably be another function ;)
.text.atomic_cas
0x0000000008000a48 0x1c /home/dipswitch/RIOT/examples/hello-world/bin/iotlab-m3/cortexm_common.a(atomic_arch.o)
0x0000000008000a48 atomic_cas
.text.cpu_init
0x0000000008000a64 0xbc /home/dipswitch/RIOT/examples/hello-world/bin/iotlab-m3/cpu.a(cpu.o)
0x0000000008000a64 cpu_init
.text.lpm_arch_set
0x0000000008000b20 0x4 /home/dipswitch/RIOT/examples/hello-world/bin/iotlab-m3/cpu.a(lpm_arch.o)
0x0000000008000b20 lpm_arch_set
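The lookup described above can be automated. Here is a minimal sketch that parses a map excerpt in the layout shown (section name on one line, then address, size and object file on the next) and reports which section contains a given PC. The helper names are invented for illustration, the archive paths are shortened, and a real map file has other line layouts that this parser does not handle.

```python
import re

# Excerpt from the iotlab-m3 map above, with the archive paths shortened.
MAP_EXCERPT = """\
.text.atomic_cas
0x0000000008000a48 0x1c cortexm_common.a(atomic_arch.o)
0x0000000008000a48 atomic_cas
.text.cpu_init
0x0000000008000a64 0xbc cpu.a(cpu.o)
0x0000000008000a64 cpu_init
.text.lpm_arch_set
0x0000000008000b20 0x4 cpu.a(lpm_arch.o)
0x0000000008000b20 lpm_arch_set
"""

# Matches "<address> <size> <object file>"; symbol lines have no hex size.
SECTION_LINE = re.compile(r"^(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+\S+")


def parse_sections(map_text):
    """Return a list of (start, size, section_name) tuples."""
    sections = []
    name = None
    for raw in map_text.splitlines():
        line = raw.strip()
        if line.startswith(".text"):
            name = line
            continue
        match = SECTION_LINE.match(line)
        if match and name is not None:
            sections.append((int(match.group(1), 16), int(match.group(2), 16), name))
            name = None
    return sections


def find_section(sections, addr):
    """Return the name of the section containing addr, or None."""
    for start, size, name in sections:
        if start <= addr < start + size:
            return name
    return None


sections = parse_sections(MAP_EXCERPT)
print(find_section(sections, 0x08000B1E))
```

If a debugger or the toolchain is at hand, `arm-none-eabi-addr2line -e <elf file> 0x08000b1e` should give the same answer directly from the ELF file.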
And the MCU STM32F0C8T6 is not listed on the STM32F0 page. You're probably missing two characters there; STM32F0##C8T6 would make sense :)
You seem to be missing a digit from the part number. It should be of the form STM32F0xC8T6 where the x is a digit. Is this the case?
And a brown-out would indeed possibly trigger a reset and not a hard fault :)
Since the difference shows between two power supplies, I'm wondering if power supply rise time might be putting things into some sort of funky mode? Need the full part number to check into it.
Hard Fault seems to be mostly triggered by flash protection level violation.
And since the EXC_RET value reported for the hard fault is 0xfffffffd, it is probably an interrupt handler that is hard faulting your system. What peripherals are initialized in your board configuration?
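It may be worth double-checking that reading of EXC_RET against the ARMv6-M encoding. As far as I know, a Cortex-M0 uses only three EXC_RETURN values, and 0xfffffffd means the exception was taken from Thread mode running on the process stack, which would point at thread code rather than at an interrupt handler. A small decoding sketch (the function and table names are made up for illustration):

```python
# EXC_RETURN values defined for ARMv6-M (Cortex-M0/M0+), without extensions.
EXC_RETURN_MEANING = {
    0xFFFFFFF1: ("Handler", "MSP"),  # fault taken inside another handler
    0xFFFFFFF9: ("Thread", "MSP"),   # thread code using the main stack
    0xFFFFFFFD: ("Thread", "PSP"),   # thread code using the process stack
}


def decode_exc_return(value):
    """Return (mode, stack) that was active before the exception."""
    try:
        return EXC_RETURN_MEANING[value]
    except KeyError:
        raise ValueError(f"unexpected EXC_RETURN value: {value:#010x}")


mode, stack = decode_exc_return(0xFFFFFFFD)
print(f"fault taken from {mode} mode, stack pointer in use: {stack}")
```

Under that reading, the stacked pc would be the address of the thread instruction that was executing when the fault was taken.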
@punchcard60 Or misaligned memory access (the M0 can't access misaligned memory), null pointer dereferencing, bus faults (although not in this case, I guess), or access to an unavailable memory address (like an incorrect peripheral base address).
Sorry about that, MCU is STM32F030C8T6.
I am not sure that the rise time is the problem, as the beginning of the program runs normally (you can see the Hello World in the trace, and the hard fault occurred 5ms later).
Thank you for the check, a hard fault cannot come from a brown out.
I'll check the .map file
a hard fault cannot come from a brown out.
But from a misbehaving interrupt caused by a brown out.
In the hello-world example nothing is initialized.
How can I know which interrupt the hard fault handler is called from?
@DipSwitch yeah, no requirements for rise time.
This you could see in the map file :) Already found the function?
And it is not entirely true that nothing is initialized: the system timer, the RTT and maybe the transceiver you configured in your board_perh.h do get initialized by the auto_init() function.
How can I know which interrupt the hard fault handler is called from?
Check the map where PC points to.
In the map I have:
.text.idle_thread
0x08000b18 0x14 /home/seb/EmbeddedArm/emb/RIOT/examples/wattwatcher_app/bin/wattwatcher1/core.a(kernel_init.o)
.text.kernel_init
0x08000b2c 0x94 /home/seb/EmbeddedArm/emb/RIOT/examples/wattwatcher_app/bin/wattwatcher1/core.a(kernel_init.o)
0x08000b2c kernel_init
.text
0x08000bc0 0x0 /home/seb/EmbeddedArm/emb/RIOT/examples/wattwatcher_app/bin/wattwatcher1/core.a(msg.o)
Sometimes PC is 0x08000956:
.text.uart_write_blocking
0x0800094c 0x18 /home/seb/EmbeddedArm/emb/RIOT/examples/wattwatcher_app/bin/wattwatcher1/periph.a(uart.o)
0x0800094c uart_write_blocking
.text.uart_poweron
0x08000964 0x28 /home/seb/EmbeddedArm/emb/RIOT/examples/wattwatcher_app/bin/wattwatcher1/periph.a(uart.o)
0x08000964 uart_poweron
Sometimes it is 0x080070FE:
.text._printf_i
0x08007028 0x230 /home/seb/EmbeddedArm/gcc-arm-none-eabi-4_9-2015q1/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/lib/armv6-m/libc_nano.a(lib_a-nano-vfprintf_i.o)
0x08007028 _printf_i
.text
0x08007258 0x0 /home/seb/EmbeddedArm/gcc-arm-none-eabi-4_9-2015q1/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/lib/armv6-m/libc_nano.a(lib_a-stdio.o)
.text.__sread
0x08007258 0x28 /home/seb/EmbeddedArm/gcc-arm-none-eabi-4_9-2015q1/bin/../lib/gcc/arm-none-eabi/4.9.3/../../../../arm-none-eabi/lib/armv6-m/libc_nano.a(lib_a-stdio.o)
0x08007258 __sread
Could it be a problem with printing on the UART? Why would it work with another power supply?
@DipSwitch I have to go for a couple of hours. I will check the initialization when I get back. Thank you for your help!
Are you running from the internal or external crystal? Do you use the PLL? I've also seen this behavior before when running at 8 MHz with the debugger connected; for some reason the debugger interferes with the MCU, and disabling all breakpoints could solve the problem. If the location is random, it could mean that the clock is unstable (which can occur if the external crystal doesn't have the proper capacitors to ground) or that the power is unstable.
If it's always after 5 minutes my first guess would be a timer of some sort though...
And the kernel_init is strange, since after Hello World the kernel_init should never be called, unless the MCU resets...
It is not 5 min but 5 ms. It actually occurs somewhere between 5 and 200ms.
If there is a reset it should be visible in the trace, shouldn't it?
I don't have the debugger connected (by the way, I've also seen that in another project).
I have an external crystal, but only a 32 kHz one for timekeeping; it is not the main clock.
If it is a timing problem (due to the rise time or a voltage dip), would it be possible to reset the MCU on a hard fault rather than freezing it?
I scoped the voltage at boot-up. The rise time is roughly 350µs. The overshoot is 500mV above 5V for 50µs. Then the voltage is clean.
On the 3,3V supply line the rise time is obviously shorter (150µs). The overshoot is 100mV above 3,3V.
Right after the overshoot the line is flat.
I don't see why it would be a brown out as the hardfault occurs 200ms after first MCU output on UART.
I tested powering a single MCU on its own from the power supply that causes the problem. The problem is the same. I guess the supply is not as clean as it looks on the scope. I still don't understand why a "hardfault" is triggered.
Maybe an interrupt is thrown and there is no handler. But this MCU does not even have a Power Voltage Detector. I can't see where the interrupt can come from.
The more I think about it the more I'm sure that @Dipswitch is right. It has to be something about the circuit that controls which supply (USB/Line) powers the MPU. If it was a spike it wouldn't happen on the same address each time unless the spike is from turning on/off some other device on the board.
I also think so. However I still don't understand why a hardfault is triggered...
The address of PC is not in "kernel_init" but in "idle_thread". Since the software is doing almost nothing, the program is likely to be in the idle thread almost all the time... so it makes sense.
If it is something with the supply, I can't understand why the program is able to boot up, write correctly some data on UART (meaning the clock is stable) and suddenly stops.
I posted on ST forum to double check the voltage sensitivity of this MCU.
Any other lead is welcome.
Just to confirm - you're running the hello world example in a totally unmodified copy of RIOT or have there been changes? What is happening with the usb at the time?
Hi,
I actually made some changes to RIOT, first of all because neither my MCU nor, obviously, my board was supported. And I am not on the latest release.
I will try running on the latest release today.
Hi there,
I scoped the NRST signal and managed to see falling edges in sync with small spikes on the supply line (80mV, <5µs)... So I assume that it is definitely a hardware problem.
I still don't understand why it ends up in a HardFault exception, and I am surprised this STM is so sensitive to voltage spikes.
I have a thread open on the STM forum to clarify this.
Thank you for your time and advice.
The datasheet recommends a .1uF capacitor on the NRST pin to help minimize noise. I'm happy that you found the problem. Good Luck!
Yes, I have the capacitor on NRST. I measured the voltage on each VDD and it is more like a 40mV spike, thanks to the decoupling capacitors. I find it surprising that the MCU would be sensitive enough for this to be the reason...
Hi everyone,
The problem came from the board and the way the STM was grounded.
I still don't know why it triggered a HardFault, but it was indeed a hardware problem and not a software one at all.
Thanks to everyone.
Can you give more details? I'm having a similar problem.
Hi there,
I run RIOT on an STM32F0 micro and a HARD FAULT is triggered every time I turn the system on with a particular power supply. If the system is powered from USB, for example, it all works fine.
Apparently, a voltage fault is not among the documented causes of a HardFault. But this problem does seem to come from the supply, as it is the only difference between the working and the non-working case.
I also replaced my whole application with the hello-world example, and the problem still occurs; see the trace attached.
I checked my voltage (3,3V): in all cases the voltage is clean and flat...
Do you think this is a supply problem? Or a software problem?
Any help will be greatly appreciated !
Thank you
2015-12-12 11:08:08,347 - INFO # kernel_init(): This is RIOT! (Version: 14e5-XXXX)
2015-12-12 11:08:08,351 - INFO # kernel_init(): jumping into first task...
2015-12-12 11:08:08,353 - INFO # UART0 thread started.
2015-12-12 11:08:08,355 - INFO # uart0_init() [OK]
2015-12-12 11:08:08,356 - INFO # Hello World!
2015-12-12 11:08:08,361 - INFO # You are running RIOT on a(n) wattwatcher1 board.
2015-12-12 11:08:08,363 - INFO # This board features a(n) stm32f0 MCU.
2015-12-12 11:08:08,432 - INFO #
2015-12-12 11:08:08,434 - INFO # Context before hardfault:
2015-12-12 11:08:08,436 - INFO # r0: 0x00000001
2015-12-12 11:08:08,437 - INFO # r1: 0x00000001
2015-12-12 11:08:08,439 - INFO # r2: 0x00000002
2015-12-12 11:08:08,440 - INFO # r3: 0x681b2001
2015-12-12 11:08:08,442 - INFO # r12: 0x00000000
2015-12-12 11:08:08,444 - INFO # lr: 0x08000b25
2015-12-12 11:08:08,445 - INFO # pc: 0x08000b1e
2015-12-12 11:08:08,447 - INFO # psr: 0x01000000
2015-12-12 11:08:08,447 - INFO #
2015-12-12 11:08:08,448 - INFO # Misc
2015-12-12 11:08:08,449 - INFO # EXC_RET: 0xfffffffd
2015-12-12 11:08:08,453 - INFO # Attempting to reconstruct state for debugging...
2015-12-12 11:08:08,454 - INFO # In GDB:
2015-12-12 11:08:08,456 - INFO # set $pc=0x8000b1e
2015-12-12 11:08:08,457 - INFO # frame 0
2015-12-12 11:08:08,457 - INFO # bt
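For anyone reading a similar trace, the psr and lr values can be decoded by hand. The sketch below assumes the ARMv6-M xPSR layout as I understand it (flags in bits 31 to 28, Thumb bit in bit 24, exception number in bits 5 to 0) and that the stacked lr carries the Thumb bit in bit 0; the helper names are made up for illustration.

```python
def decode_xpsr(xpsr):
    """Split an ARMv6-M xPSR value into its main fields."""
    flag_bits = (("N", 31), ("Z", 30), ("C", 29), ("V", 28))
    return {
        "flags": "".join(name for name, bit in flag_bits if (xpsr >> bit) & 1),
        "thumb": bool((xpsr >> 24) & 1),
        "exception_number": xpsr & 0x3F,  # 0 means Thread mode
    }


def strip_thumb_bit(addr):
    """Clear bit 0, which marks Thumb state rather than part of the address."""
    return addr & ~1


psr = decode_xpsr(0x01000000)
return_address = strip_thumb_bit(0x08000B25)
print(psr)
print(hex(return_address))
```

With the values from this trace, the exception number is 0 (no handler active when the fault was taken) and lr points to 0x08000b24, which is in the same idle_thread range as pc according to the map excerpt earlier in the thread.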