meshtastic / firmware

Meshtastic device firmware
https://meshtastic.org
GNU General Public License v3.0

[Bug]: Heltec T114 does not resume to application after forced shutdown #4651

Closed jhps closed 1 month ago

jhps commented 2 months ago

Category

Other

Hardware

Other

Firmware Version

2.5.0.d6dac17

Description

A long press followed by "Shutting down" puts the Heltec T114 into a sleep state drawing approximately 18 uA. A button press to wake it brings up the firmware-loading disk emulation rather than the "Resuming..." screen.

Relevant log output

No response

todd-herbert commented 2 months ago

I'm not sure if this is a bug, or just normal behaviour for NRF52 devices? I know that my T-Echo does the same thing, at least with the newest bootloader.

I believe it is perfectly fine to press reset instead of the user button (?)

jhps commented 2 months ago

Afraid I do not know, but I had assumed that it would be really useful to be able to recover from the deepest possible sleep to a live node, possibly without physical user intervention.

BTW, I should have said BLE OTA screen rather than disk emulation as USB was not attached.

todd-herbert commented 2 months ago

Afraid I do not know, but I had assumed that it would be really useful to be able to recover from the deepest possible sleep to a live node, possibly without physical user intervention.

Does pressing the reset button instead of the user button achieve this correctly?

jhps commented 2 months ago

Yes, pressing reset behaves normally.

thebentern commented 1 month ago

I believe this is an issue with the custom bootloader rather than our firmware, but I'll keep this issue open until we get confirmation.

lyusupov commented 1 month ago

I'm not sure if this is a bug, or just normal behaviour for NRF52 devices?



[image]

https://github.com/adafruit/Adafruit_nRF52_Bootloader/pull/196

Heltec-Aaron-Lee commented 1 month ago

[image]

Reading this part of the code, you can see that the deep sleep of the nRF is actually a delay. If the thread is empty, it keeps delaying, so pressing keys has no effect. Shutting down can be understood as another, deeper sleep rather than a physical power-off. This is how the nRF52840 chip works; you can wake it by pressing the reset button.

lyusupov commented 1 month ago

@Heltec-Aaron-Lee

Factory pre-installed SoftRF for LilyGO T-Echo does NOT have the "USER" button wake up issue:


[image]

todd-herbert commented 1 month ago

I've just tested with a T-Echo (which has the same issue waking from the user button). Setting GPREGRET to DFU_MAGIC_SKIP fixes that issue: the T-Echo now wakes from the user button. Maybe this helps for the T114 also? I don't know whether there could be a negative consequence.

    } else {
        // Resume on user button press
        // https://github.com/lyusupov/SoftRF/blob/81c519ca75693b696752235d559e881f2e0511ee/software/firmware/source/SoftRF/src/platform/nRF52.cpp#L1738
        constexpr uint32_t DFU_MAGIC_SKIP = 0x6d;
        NRF_POWER->GPREGRET = DFU_MAGIC_SKIP;

        // FIXME, use system off mode with ram retention for key state?
        // FIXME, use non-init RAM per
        // https://devzone.nordicsemi.com/f/nordic-q-a/48919/ram-retention-settings-with-softdevice-enabled
        auto ok = sd_power_system_off();
        if (ok != NRF_SUCCESS) {
            LOG_ERROR("FIXME: Ignoring soft device (EasyDMA pending?) and forcing system-off!\n");
            NRF_POWER->SYSTEMOFF = 1;
        }
    }

(Added to src/platform/nrf52/main-nrf52.cpp)
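
For readers unfamiliar with the mechanism, here is a rough host-side sketch of why a marker retained in GPREGRET could change what happens on wake. It is a model under assumptions, not the bootloader's actual code: `BootTarget`, `chooseBootTarget` and `checkBootModel` are hypothetical names, and the premise that the bootloader reads a held user button as a DFU request is an assumption drawn from the behaviour reported in this thread. Only the value 0x6d comes from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Value taken from the snippet above; assumed to mean "skip DFU, boot the app".
constexpr uint8_t DFU_MAGIC_SKIP = 0x6d;

// Hypothetical outcomes for this sketch; a real bootloader has more modes.
enum class BootTarget { Application, DfuMode };

// Hypothetical model of the boot decision: the skip marker takes priority,
// otherwise a held button selects DFU mode.
constexpr BootTarget chooseBootTarget(uint8_t gpregret, bool dfuButtonHeld)
{
    return (gpregret == DFU_MAGIC_SKIP) ? BootTarget::Application
           : dfuButtonHeld              ? BootTarget::DfuMode
                                        : BootTarget::Application;
}

// Runtime checks for the three cases discussed in the thread.
void checkBootModel()
{
    // Woken by the user button with no marker: lands in DFU (the reported symptom).
    assert(chooseBootTarget(0x00, true) == BootTarget::DfuMode);
    // Woken by the user button with the marker set: resumes the application.
    assert(chooseBootTarget(DFU_MAGIC_SKIP, true) == BootTarget::Application);
    // Woken by reset with no button held: resumes the application either way.
    assert(chooseBootTarget(0x00, false) == BootTarget::Application);
}
```

Calling `checkBootModel()` from a test `main` exercises the three cases. If the model is right, it would also explain why pressing reset already behaved normally before the change.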

jhps commented 1 month ago

With this change my T114 wakes up into the application after 5 seconds with no button press.

todd-herbert commented 1 month ago

Interesting! Maybe not so straightforward then. Still, a new area to investigate. (Or new to me at least!)

jhps commented 1 month ago

The DFU_MAGIC_SKIP fix does work on the T114 board with a bare Arduino sketch, so, I would guess that the problem is that there are some things that need to be turned off.

todd-herbert commented 1 month ago

With this change my T114 wakes up into the application after 5 seconds with no button press

Actually, I went back and checked again on my T-Echo, and it is also waking shortly after entering sleep with that specific test change to the firmware (https://github.com/meshtastic/firmware/issues/4651#issuecomment-2342970894)

todd-herbert commented 1 month ago

The DFU_MAGIC_SKIP fix does work on the T114 board with a bare Arduino sketch, so, I would guess that the problem is that there are some things that need to be turned off.

Interesting! Sounds like there's potential. Might need to poke into what's actually going on in that GPREGRET register in the data sheet.

todd-herbert commented 1 month ago

Using T-Echo to test, it seems to work correctly now if GPREGRET is set using the SoC API instead of manipulated directly:

    } else {
        // Resume on user button press
        // https://github.com/lyusupov/SoftRF/blob/81c519ca75693b696752235d559e881f2e0511ee/software/firmware/source/SoftRF/src/platform/nRF52.cpp#L1738
        constexpr uint32_t DFU_MAGIC_SKIP = 0x6d;
        sd_power_gpregret_set(0, DFU_MAGIC_SKIP); // SoftDevice counterpart of NRF_POWER->GPREGRET = DFU_MAGIC_SKIP (sets the masked bits)

        // FIXME, use system off mode with ram retention for key state?
        // FIXME, use non-init RAM per
        // https://devzone.nordicsemi.com/f/nordic-q-a/48919/ram-retention-settings-with-softdevice-enabled
        auto ok = sd_power_system_off();
        if (ok != NRF_SUCCESS) {
            LOG_ERROR("FIXME: Ignoring soft device (EasyDMA pending?) and forcing system-off!\n");
            NRF_POWER->SYSTEMOFF = 1;
        }
    }

(Added to src/platform/nrf52/main-nrf52.cpp)
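
One detail that may be worth checking: as far as I can tell from the SoC API reference, `sd_power_gpregret_set` sets the bits given in its mask rather than overwriting the register, so any bits already retained there would survive. The host-side sketch below models that with a stand-in; `MockPower` and `writeSkipMarker` are hypothetical names, and the clear-then-set sequence is a suggestion under that assumption, not something tested on hardware in this thread. On a device the two member calls would correspond to `sd_power_gpregret_clr` and `sd_power_gpregret_set`.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t DFU_MAGIC_SKIP = 0x6d; // value from the snippet above

// Host-side stand-in for the two retained registers, so the sketch
// compiles and runs off-target. Not the real SoftDevice.
struct MockPower {
    uint8_t gpregret[2] = {0, 0};

    // Assumed behaviour: clear the bits named in the mask.
    uint32_t clr(uint32_t id, uint32_t mask)
    {
        gpregret[id] = static_cast<uint8_t>(gpregret[id] & ~mask);
        return 0; // 0 stands in for NRF_SUCCESS
    }

    // Assumed behaviour: OR the mask into the register.
    uint32_t set(uint32_t id, uint32_t mask)
    {
        gpregret[id] = static_cast<uint8_t>(gpregret[id] | mask);
        return 0;
    }
};

// Hypothetical helper: clear all eight bits first, then set the marker,
// so the register ends up holding exactly DFU_MAGIC_SKIP.
bool writeSkipMarker(MockPower &power)
{
    if (power.clr(0, 0xFF) != 0)
        return false;
    return power.set(0, DFU_MAGIC_SKIP) == 0;
}
```

With a stale value such as 0x90 in the register, a bare `set` would leave 0xFD behind in this model, whereas `writeSkipMarker` leaves 0x6d. Whether stale bits can occur in practice on these boards is an open question.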

jhps commented 1 month ago

This sd_power_gpregret_set(0, DFU_MAGIC_SKIP) does work for me on a T114.

One curious thing is that the draw is sometimes about 17 uA and sometimes 1.2 mA. Is there source or good documentation on what the sd_* stuff is doing under the hood?

todd-herbert commented 1 month ago

It's all well outside my comfort zone to be honest. As far as I understand it, the actual code being run is a bit of a blackbox(?), but it looks like there's at least an API reference here

Are you noticing that the power consumption is different after setting GPREGRET, or was it doing something similar before? I don't really know the implications of playing with this low-level NRF stuff so interested to hear if it seems to have an impact.

jhps commented 1 month ago

I did not do enough tests to be sure that the higher draw was not happening before, as it seems to be intermittent. If I knew how to provoke it...
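
To put the two readings mentioned earlier (about 17 uA and 1.2 mA) in perspective, here is a back-of-the-envelope sketch of what each would mean for time spent shut down on battery. The 1000 mAh capacity is a made-up figure for illustration, `runtimeHours` and `printSleepEstimates` are hypothetical names, and the estimate ignores self-discharge and regulator losses.

```cpp
#include <cstdio>

// Hours of runtime for a battery of capacityMah drained at a constant currentMa.
constexpr double runtimeHours(double capacityMah, double currentMa)
{
    return capacityMah / currentMa;
}

constexpr double kCapacityMah = 1000.0; // hypothetical battery, not from the thread
constexpr double kLowDrawMa = 0.017;    // the ~17 uA reading
constexpr double kHighDrawMa = 1.2;     // the ~1.2 mA reading

// Prints both estimates in days, plus the ratio between the two draws.
void printSleepEstimates()
{
    std::printf("low draw:  %.0f days\n", runtimeHours(kCapacityMah, kLowDrawMa) / 24.0);
    std::printf("high draw: %.0f days\n", runtimeHours(kCapacityMah, kHighDrawMa) / 24.0);
    std::printf("ratio:     %.1fx\n", kHighDrawMa / kLowDrawMa);
}
```

The higher reading is roughly 70 times the lower one, so if the intermittent state is real it would dominate the shutdown power budget.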

jhps commented 1 month ago

Why close as "not planned" when the fix is trivial?

caveman99 commented 1 month ago

Why close as "not planned" when the fix is trivial?

Triage. If the fix is trivial, Pull Requests are welcome.

jhps commented 1 month ago

Why close as "not planned" when the fix is trivial?

Triage. If the fix is trivial, Pull Requests are welcome.

I will make one if todd-herbert does not get to it first.

todd-herbert commented 1 month ago

That's my fault for not chasing this up faster, sorry. I had been intending to talk to one of the bootloader experts about how this might impact hardware with softdevice 7.x