raspberrypi / pico-sdk

BSD 3-Clause "New" or "Revised" License
Feature request: Early GPIO startup control & state latching #959

Open Gadgetoid opened 2 years ago

Gadgetoid commented 2 years ago

We have a few applications of the RP2040 chip and Pico W-as-a-SOM where it's important for us to:

  1. Assert some GPIO pin really early on startup
  2. Know the state the GPIO was in as early as possible on startup

When writing a basic program in C++ it's possible to assert a pin and get the startup GPIO state some 15ms after a cold boot, simply by doing it at the top of main, but this depends on there being absolutely no preinit/init happening before main is reached.

In real-world scenarios there could be large quantities of zero-init BSS (such as the 166k gc_heap in MicroPython for Pico W) or initialized SRAM which, in our tests, can push any existing method of hooking into early startup out to some 150ms after the rising edge of RESET. This is long enough to miss even a user button press, much less a transient pulse from a hardware peripheral.

We can do better by modifying crt0.S to init the RP2040 peripherals before the SRAM load and the BSS zeroing. This is achieved by splitting runtime_init into runtime_init and runtime_reset_peripherals, the latter of which is called before BSS and SRAM init. It is immediately followed by runtime_user_init (this is a very rough proof of concept), which latches our VSYS_EN.

This takes the time from rising edge of RESET to VSYS_EN_PIN HIGH (MicroPython on Pico W) from 150ms right down to about 8ms, beating even a vanilla C++ example with code in main.

Why? In our application VSYS_EN must be held high for the Pico W to be powered. We're shutting the chip fully down and powering it back on with a mix of external events. We'd like to know what event "woke" the chip, even if it's transient (down to a limit, of course). Bear in mind that the chip is off at the time, so interrupts et al. are a no-go.

To summarize:

I imagine this being configured with something like -DPICO_STARTUP_PIN_STATE and -DPICO_STARTUP_PIN_MASK so a build can be configured with any permutation of (not system essential) pins set to a desired state. This may include pulls but should probably only focus on SIO.
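
One way the proposed mask/state pair could map onto SIO register writes is sketched below. This is only an illustration under assumptions: `PICO_STARTUP_PIN_MASK` and `PICO_STARTUP_PIN_STATE` are the names floated above and are not existing SDK options, the default pin values are invented, and `startup_pin_writes()` is a hypothetical helper that only computes the values an early-boot routine would write to the SIO `GPIO_OUT_SET`, `GPIO_OUT_CLR` and `GPIO_OE_SET` registers.

```c
#include <stdint.h>

// Names proposed in this issue; they are NOT existing SDK options.
// The default values are made up for illustration (GPIO 10 high, GPIO 11 low).
#ifndef PICO_STARTUP_PIN_MASK
#define PICO_STARTUP_PIN_MASK  ((1u << 10) | (1u << 11))
#endif
#ifndef PICO_STARTUP_PIN_STATE
#define PICO_STARTUP_PIN_STATE (1u << 10)
#endif

// Values an early-boot routine would write to the SIO set/clear registers.
typedef struct {
    uint32_t out_set;  // -> SIO GPIO_OUT_SET: masked pins to drive high
    uint32_t out_clr;  // -> SIO GPIO_OUT_CLR: masked pins to drive low
    uint32_t oe_set;   // -> SIO GPIO_OE_SET: masked pins to switch to output
} startup_pin_writes_t;

// Hypothetical helper: bits outside `mask` never appear in any write,
// so system-essential pins are left alone.
static startup_pin_writes_t startup_pin_writes(uint32_t mask, uint32_t state) {
    startup_pin_writes_t w;
    w.out_set = mask & state;
    w.out_clr = mask & ~state;
    w.oe_set  = mask;
    return w;
}
```

On hardware the output levels would be written before the output enable, so a pin never briefly drives the wrong level. Each masked pin would also need its function select set to SIO, which is why this has to run after the peripheral reset.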

For the programming minded (or those who want to deploy this themselves) here's a diff showing my changes to runtime.c and crt0.S:

diff --git a/src/rp2_common/pico_runtime/runtime.c b/src/rp2_common/pico_runtime/runtime.c
index 70dd3bb..3326992 100644
--- a/src/rp2_common/pico_runtime/runtime.c
+++ b/src/rp2_common/pico_runtime/runtime.c
@@ -17,6 +17,7 @@
 #include "hardware/clocks.h"
 #include "hardware/irq.h"
 #include "hardware/resets.h"
+#include "hardware/gpio.h"

 #include "pico/mutex.h"
 #include "pico/time.h"
@@ -61,11 +62,19 @@ void runtime_install_stack_guard(void *stack_bottom) {
                    | 0x10000000; // XN = disable instruction fetch; no other bits means no permissions
 }

-void runtime_init(void) {
+void runtime_user_init(void) {
+    const uint VSYS_EN_PIN = 10;
+    gpio_init(VSYS_EN_PIN);
+    gpio_set_dir(VSYS_EN_PIN, GPIO_IN);
+    gpio_pull_up(VSYS_EN_PIN);
+}
+
+void runtime_reset_peripherals(void) {
     // Reset all peripherals to put system into a known state,
     // - except for QSPI pads and the XIP IO bank, as this is fatal if running from flash
     // - and the PLLs, as this is fatal if clock muxing has not been reset on this boot
     // - and USB, syscfg, as this disturbs USB-to-SWD on core 1
+
     reset_block(~(
             RESETS_RESET_IO_QSPI_BITS |
             RESETS_RESET_PADS_QSPI_BITS |
@@ -86,7 +95,9 @@ void runtime_init(void) {
             RESETS_RESET_UART1_BITS |
             RESETS_RESET_USBCTRL_BITS
     ));
+}

+void runtime_init(void) {
     // pre-init runs really early since we need it even for memcpy and divide!
     // (basically anything in aeabi that uses bootrom)

diff --git a/src/rp2_common/pico_standard_link/crt0.S b/src/rp2_common/pico_standard_link/crt0.S
index b2992f6..80367da 100644
--- a/src/rp2_common/pico_standard_link/crt0.S
+++ b/src/rp2_common/pico_standard_link/crt0.S
@@ -9,6 +9,7 @@
 #include "hardware/regs/addressmap.h"
 #include "hardware/regs/sio.h"
 #include "pico/binary_info/defs.h"
+#include "hardware/regs/resets.h"

 #ifdef NDEBUG
 #ifndef COLLAPSE_IRQS
@@ -225,6 +226,17 @@ _reset_handler:
     cmp r0, #0
     bne hold_non_core0_in_bootrom

+    ldr r1, =runtime_reset_peripherals
+    blx r1
+
+    ldr r1, =runtime_user_init
+    blx r1
+
+    // Read GPIO state for front buttons and store
+    movs r3, 0xd0                // Load 0xd0 into r3
+    lsls r3, r3, 24              // Shift left 24 to get 0xd0000000
+    ldr r6, [r3, 4]              // Load GPIO state (0xd0000004) into r6
+
     // In a NO_FLASH binary, don't perform .data copy, since it's loaded
     // in-place by the SRAM load. Still need to clear .bss
 #if !PICO_NO_FLASH
@@ -251,6 +263,10 @@ bss_fill_test:
     cmp r1, r2
     bne bss_fill_loop

+    // runtime_wakeup_gpio_state gets zero init above
+    ldr r2, =runtime_wakeup_gpio_state   // Load output var addr into r2
+    str r6, [r2]                        // Store r6 to r2
+
 platform_entry: // symbol for stack traces
     // Use 32-bit jumps, in case these symbols are moved out of branch range
     // (e.g. if main is in SRAM and crt0 in flash)
@@ -314,6 +330,19 @@ data_cpy_table:
 runtime_init:
     bx lr

+.weak runtime_user_init
+.type runtime_user_init,%function
+.thumb_func
+runtime_user_init:
+    bx lr
+
+.weak runtime_reset_peripherals
+.type runtime_reset_peripherals,%function
+.thumb_func
+runtime_reset_peripherals:
+    bx lr
+
+
 // ----------------------------------------------------------------------------
 // If core 1 somehow gets into crt0 due to a spectacular VTOR mishap, we need to
 // catch it and send back to the sleep-and-launch code in the bootrom. Shouldn't
@@ -345,3 +374,9 @@ __get_current_exception:
 .align 2
     .equ HeapSize, PICO_HEAP_SIZE
 .space HeapSize
+
+.section .data._reset_handler
+.global runtime_wakeup_gpio_state
+.align 4
+runtime_wakeup_gpio_state:
+.word 0x00000000
\ No newline at end of file
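
With the patch above applied, application code could inspect the latched snapshot once main is running. The sketch below assumes the snapshot is exposed as `extern uint32_t runtime_wakeup_gpio_state;` (the symbol added in the diff). The pin numbers and the `woke_by_pin()` helper are hypothetical and only illustrate the decoding step.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical wake sources; the pin numbers are invented for illustration.
#define WAKE_PIN_BUTTON_A  12u
#define WAKE_PIN_RTC_ALARM 8u

// True if `pin` read high in the GPIO snapshot taken at reset.
// `pin` must be below 32 (the RP2040 has 30 user GPIOs).
static bool woke_by_pin(uint32_t wakeup_gpio_state, unsigned pin) {
    return ((wakeup_gpio_state >> pin) & 1u) != 0;
}
```

On the device this would be called as `woke_by_pin(runtime_wakeup_gpio_state, WAKE_PIN_BUTTON_A)`. An active-low input would need the result inverted.
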

kilograham commented 2 years ago

relates to #748

jimmo commented 2 years ago

@Gadgetoid There's no reason that MicroPython's gc heap needs to be zero initialised at boot (in fact, the gc doesn't assume that it has been initialised, because on other platforms it comes from malloc).

It should be pretty straightforward to put it in a different section (e.g. .noinit), or better do what we do on stm32 where it's just defined entirely in the linker config as a start and end symbol used directly by gc_init().
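
A minimal sketch of the first option, assuming the linker script provides a section that the startup code neither copies nor zeroes. The section name `.noinit` is a placeholder taken from the comment above, and the Pico SDK's own name for such a section may differ. `GC_HEAP_SIZE`, `GC_HEAP_SECTION` and the two accessor functions are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define GC_HEAP_SIZE (166u * 1024u)  // rough size quoted in this thread

// Assumption: ".noinit" is skipped by both the .data copy and the .bss clear.
#if defined(__GNUC__) && defined(__ELF__)
#define GC_HEAP_SECTION __attribute__((section(".noinit")))
#else
#define GC_HEAP_SECTION  // other toolchains: fall back to ordinary storage
#endif

// Contents are undefined at boot; the gc is said above not to rely on zeroes.
static uint8_t gc_heap[GC_HEAP_SIZE] GC_HEAP_SECTION;

// Bounds that would be handed to the gc at init time.
static uint8_t *gc_heap_start(void) { return gc_heap; }
static uint8_t *gc_heap_end(void)   { return gc_heap + GC_HEAP_SIZE; }
```

The second option mentioned above drops the array entirely and takes the two bounds from linker-defined symbols instead, so no C object of that size exists for the startup code to initialise.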

Gadgetoid commented 2 years ago

@jimmo the zero fill only accounted for something like 20ms of the startup delay, with the remaining ~120 being the initialised SRAM copy. Definitely worth the gains, but not the full picture.

I didn’t try to figure out what portion of the SRAM copy was our C++ modules versus MicroPython itself but we’re extremely cautious with RAM usage so I’d hope not much!

I need to improve my objdump/nm skills so I can figure out where the main thrust of the delay is coming from since - as you imply - fixing it downstream is also an option!

8ms from cold boot to a low-level pin toggle is hard to pass up though, even if we can get full MicroPython started in 50ms now.

Edit: Note I tested the effect noinit might have on gc_heap by building a version with a smaller gc_heap (around 16K IIRC) and that's how I measured the 20ms speedup. .noinit might gain slightly over 20ms.

In fact we have a patch in our build (excuse the patch of a patch here) to set the ROSC early and this dramatically speeds up the BSS/zero init steps: https://github.com/pimoroni/pimoroni-pico/pull/480/commits/69493e5af76d6196beac30ad2592f6605244486c

ldr r0, =(ROSC_BASE + ROSC_DIV_OFFSET)   // r0 = address of the ROSC DIV register
ldr r1, =0xaa2                           // 0xaa0 passcode + divider of 2
str r1, [r0]                             // apply the new divider

Just these lines alone in crt0.S will take a full startup from rising edge of reset to MicroPython toggling a pin from ~200ms to ~50ms.

So some way to set the ROSC would be another great addition to Pico SDK if a more configurable startup is on the table.
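
The constant `0xaa2` in the snippet above can be read as a passcode plus a divider. As far as the RP2040 datasheet describes it, the ROSC DIV register expects `0xaa0 + div` for dividers 1 to 31 and resets to a divider of 16, so writing `0xaa2` lowers the divider from 16 to 2. The helper below is a hypothetical illustration of that encoding, not an SDK function.

```c
#include <stdint.h>

#define ROSC_DIV_PASSCODE 0xaa0u  // illustrative name for the DIV passcode

// Hypothetical helper: DIV register value for a divider of 1..31.
// Out-of-range requests are clamped into that range.
static uint32_t rosc_div_value(uint32_t div) {
    if (div < 1u) {
        div = 1u;
    } else if (div > 31u) {
        div = 31u;
    }
    return ROSC_DIV_PASSCODE + div;
}
```

On the device the result would be written to `ROSC_BASE + ROSC_DIV_OFFSET`, which is what the three assembly lines do with the precomputed value for a divider of 2.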

lurch commented 2 years ago

Using the ROSC is mentioned briefly in #745?

Gadgetoid commented 1 year ago

Some more data for the finer-control-over-runtime case: https://github.com/pimoroni/badger2040/issues/10

Just having some weak ref empty functions that are called at various stages of boot would be handy as hooks to avoid having to replace entire portions of the runtime.
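
A rough sketch of what such hooks could look like, using weak empty defaults in the same spirit as the `.weak` stubs in the diff above. The hook names, the `boot_sequence()` stand-in and the stage counter are all hypothetical. None of them exist in the Pico SDK, and the real startup sequence lives in crt0.S rather than in C.

```c
// Hypothetical hook names; none of these exist in the Pico SDK.
// Weak empty defaults: an application overrides one by defining a
// non-weak function with the same name in its own source file.
__attribute__((weak)) void runtime_hook_after_reset(void) {}
__attribute__((weak)) void runtime_hook_before_runtime_init(void) {}
__attribute__((weak)) void runtime_hook_before_main(void) {}

static int boot_stages_completed;  // stand-in counter so the sketch is observable

// Stand-in for the startup sequence, showing where each hook would fire.
static void boot_sequence(void) {
    boot_stages_completed = 0;
    runtime_hook_after_reset();          // earliest point: .data/.bss not ready yet
    boot_stages_completed++;             // (.data copy and .bss clear would go here)
    runtime_hook_before_runtime_init();
    boot_stages_completed++;             // (runtime_init would go here)
    runtime_hook_before_main();
    boot_stages_completed++;             // (main would be called here)
}
```

A hook that runs before the .data copy and .bss clear cannot rely on initialised globals. That matches the patch above, which holds the GPIO snapshot in a register until the BSS fill has finished.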