nanoframework / Home

Application stops working after a while when StartLightSleep/StartDeepSleep/EnableWakeupByTimer used #1165

Closed KiwiBryn closed 1 year ago

KiwiBryn commented 2 years ago

Target name(s)

ESP32 wrover

Firmware version

1.8.0.629

Was working before? On which version?

NA

Device capabilities

ESP32 @ COM10 deployment area erased.

System Information
HAL build info: nanoCLR running @ ESP32 built with ESP-IDF 1b16ef6
  Target:   ESP32
  Platform: ESP32

Firmware build Info:
  Date:        Sep 29 2022
  Type:        MinSizeRel build, chip rev. 3, without support for PSRAM
  CLR Version: 1.8.0.629
  Compiler:    GNU ARM GCC v8.4.0

OEM Product codes (vendor, model, SKU): 0, 0, 0

Serial Numbers (module, system):
  00000000000000000000000000000000
  0000000000000000

Target capabilities:
  Has nanoBooter: NO
  IFU capable: NO
  Has proprietary bootloader: YES

AppDomains:

Assemblies:

Native Assemblies:
  mscorlib v100.5.0.17, checksum 0x004CF1CE
  nanoFramework.Runtime.Native v100.0.9.0, checksum 0x109F6F22
  nanoFramework.Hardware.Esp32 v100.0.7.3, checksum 0xBE7FF253
  nanoFramework.Hardware.Esp32.Rmt v100.0.3.0, checksum 0x0A915860
  nanoFramework.Device.OneWire v100.0.4.0, checksum 0xB95C43B4
  nanoFramework.Networking.Sntp v100.0.4.4, checksum 0xE2D9BDED
  nanoFramework.ResourceManager v100.0.0.1, checksum 0xDCD7DF4D
  nanoFramework.System.Collections v100.0.1.0, checksum 0x2DC2B090
  nanoFramework.System.Text v100.0.0.1, checksum 0x8E6EB73D
  nanoFramework.Runtime.Events v100.0.8.0, checksum 0x0EAB00C9
  EventSink v1.0.0.0, checksum 0xF32F4C3E
  System.IO.FileSystem v1.0.0.0, checksum 0x3AB74021
  System.Math v100.0.5.4, checksum 0x46092CB1
  System.Net v100.1.5.0, checksum 0x5BAB8CB3
  System.Device.Adc v100.0.0.0, checksum 0xE5B80F0B
  System.Device.Dac v100.0.0.6, checksum 0x02B3E860
  System.Device.Gpio v100.1.0.6, checksum 0x097E7BC5
  System.Device.I2c v100.0.0.1, checksum 0xFA806D33
  System.Device.Pwm v100.1.0.4, checksum 0xABF532C3
  System.IO.Ports v100.1.6.1, checksum 0xB798CE30
  System.Device.Spi v100.1.2.0, checksum 0x3F6E2A7E
  System.Device.Wifi v100.0.6.4, checksum 0x1C1D3214
  Windows.Storage v100.0.2.0, checksum 0x954A4192

++++++++++++++++++++++++++++++++
++        Memory Map          ++
++++++++++++++++++++++++++++++++
  Type     Start       Size
++++++++++++++++++++++++++++++++
  RAM      0x3ffe49ac  0x0001b000
  FLASH    0x00000000  0x00400000

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++                   Flash Sector Map                        ++
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  Region  Start       Blocks  Bytes/Block  Usage
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  0       0x00010000  1       0x1A0000     nanoCLR
  1       0x001B0000  1       0x1F0000     Deployment
  2       0x003C0000  1       0x040000     Configuration

+++++++++++++++++++++++++++++++++++++++++++++++++++
++              Storage Usage Map                ++
+++++++++++++++++++++++++++++++++++++++++++++++++++
  Start       Size (kB)           Usage
+++++++++++++++++++++++++++++++++++++++++++++++++++
  0x003C0000  0x040000 (256kB)    Configuration
  0x00010000  0x1A0000 (1664kB)   nanoCLR
  0x001B0000  0x1F0000 (1984kB)   Deployment

Deployment Map Empty

Description

After a while (a couple of hours to a couple of days, depending on how often the app restarts), the sample application, which uses EnableWakeupByTimer + StartLightSleep/StartDeepSleep, stops working (only tested with StartDeepSleep).

How to reproduce

  1. At https://ptsv2.com I created a Toilet to dump HTTP POST payloads in.
  2. Create a nanoFramework console application and replace Program.cs with the sample code.
  3. Modify the POST URL to match the toilet.
  4. Set up the Wifi SSID & password.
  5. Download onto an ESP32 device (I used a RAKWireless 11200).
  6. Once the application has connected etc., you should be able to see HTTP POST payloads at ptsv2.
  7. Flush all the dumps, and restart the application.
  8. Wait.
  9. After a while, POST payloads will stop arriving at ptsv2.

I recorded a sample of the ptsv2 uploads from application start to stop.

ptsv220221004.txt

I wrote a quick 'n' dirty desktop console application to save the debug output (from start to stop) to a file, as TeraTerm/Arduino serial monitor/... couldn't cope with some of the unprintable characters.

SerialOutput.bin.txt

Beware there are a lot of nulls and other unprintable characters in the .bin.txt file.

Sample code (just rename and fill in the SSID etc.): Program.cs.txt

Expected behaviour

Should run forever.

Screenshots

No response

Additional information

Every so often a couple of uploads are less than a minute apart which is odd.
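One way to pin down those closely spaced uploads is to scan the upload times for short gaps. The Python sketch below assumes the timestamps have already been parsed out of the ptsv2 dump into `datetime` values (the dump's layout is not shown here, so no parsing is attempted), and `find_short_gaps` is a hypothetical helper name, not part of the original sample.

```python
from datetime import datetime, timedelta


def find_short_gaps(timestamps, min_gap=timedelta(minutes=1)):
    """Return (previous, current, gap) for consecutive uploads closer than min_gap.

    `timestamps` is any iterable of datetime objects; it is sorted first so the
    input order does not matter.
    """
    ordered = sorted(timestamps)
    return [
        (earlier, later, later - earlier)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier < min_gap
    ]


# Illustrative values only, not taken from the attached upload log.
uploads = [
    datetime(2022, 10, 4, 9, 0, 0),
    datetime(2022, 10, 4, 9, 6, 0),
    datetime(2022, 10, 4, 9, 6, 40),   # arrives 40 s after the previous upload
    datetime(2022, 10, 4, 9, 12, 40),
]

for earlier, later, gap in find_short_gaps(uploads):
    print(f"{earlier} -> {later}: {gap.total_seconds():.0f} s apart")
```

The flagged times can then be compared by hand against the resets recorded in the serial capture.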

The Debug.WriteLine's are just for checking that application connects on startup.

I had serial port logging but removed it to keep the sample as small as possible; I didn't see anything useful in the output.

Ellerbach commented 2 years ago

I only use StartDeepSleep everywhere, no StartLightSleep. The reason is that deep sleep really resets all the memory and you start fresh, like in a physical reboot. I have a device under a desk that has been working properly for the last few months without any issue. It does sleep a bit longer than in your examples (it sleeps 2 minutes or so). Maybe you can try adjusting it to sleep longer.

KiwiBryn commented 2 years ago

In the output file I can only see DEEPSLEEP_RESET. Does the ESP32 logging have a LIGHTSLEEP_RESET?

I originally had the device sleeping for >10 mins, but it took days for the issue to occur.

In the binary file there are a few TG1WDT_SYS_RESET (I'm guessing those watchdog timeouts are where POSTs are too close together) and when device finally stops lots of SPI_FAST_FLASH_BOOT.
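As a rough aid for reading a capture like SerialOutput.bin.txt, the Python sketch below tallies the reset reasons. It assumes the ESP32 ROM boot banner has the form `rst:0x5 (DEEPSLEEP_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)` and that dropping NUL bytes before matching is acceptable for this capture; `count_reset_reasons` is a hypothetical helper name, not something from the attached files.

```python
import re
from collections import Counter

# Assumed banner shape: b"rst:0x5 (DEEPSLEEP_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)".
# Only the reset reason inside the first pair of parentheses is captured.
RESET_PATTERN = re.compile(rb"rst:0x[0-9a-fA-F]+ \(([A-Z0-9_]+)\)")


def count_reset_reasons(data: bytes) -> Counter:
    """Count each reset reason found in a raw serial capture.

    The capture is handled as bytes so unprintable characters cannot cause
    decoding errors; NUL bytes are stripped first (an assumption) in case
    they fall inside a banner.
    """
    cleaned = data.replace(b"\x00", b"")
    return Counter(
        match.group(1).decode("ascii")
        for match in RESET_PATTERN.finditer(cleaned)
    )


# Illustrative bytes only, not an excerpt from the attached capture.
sample = (
    b"\x00\x00rst:0x5 (DEEPSLEEP_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)\r\n"
    b"\xff\x00rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)\r\n"
    b"rst:0x5 (DEEPSLEEP_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)\r\n"
)

for reason, count in count_reset_reasons(sample).most_common():
    print(reason, count)
```

To run it against a real capture, read the file in binary mode, for example `count_reset_reasons(open(path, "rb").read())`, where `path` points at the saved serial output.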

I had only used StartLightSleep when the Wifi failed to connect (a hangover from a previous experiment). I will fix that and run again.

KiwiBryn commented 2 years ago

Replaced StartLightSleep in Wifi connect failure handler with StartDeepSleep, ran the program again.

Whatever randomness causes this occurred quicker...

Program202210050931.cs.txt ptsv2202210050931.txt SerialOutput202210050931.bin.txt

Ellerbach commented 2 years ago

OK, so there is definitely something behind all this. And the device I'm using was flashed a year ago or so! Maybe even more. And it is still working. Maybe I haven't reached the point of the problem you describe either. And maybe, with various power cuts or whatever, it resets itself properly.

So I guess this would require a bit of investigation. @josesimoes and @AdrianSoundy, any idea how we could test that properly?

josesimoes commented 2 years ago

I still haven't looked into this. Just downloaded the output from IDF and the thing that caught my attention was the TG1WDT_SYS_RESET there. From the looks of it, it seems that the sleep and wake up are happening correctly (that can be seen from the DEEPSLEEP_RESET). What is messing things up seems to be something in the processing of the application. To be investigated...

AdrianSoundy commented 2 years ago

There shouldn't be any TG1WDT_SYS_RESET being seen. I think this was originally disabled; that may not be the case now with IDF 4 builds. Will have to check. It could be a task running too long before yielding, which is difficult to control with nanoCLR.

Also check that the power supply is good, as that can cause random watchdog issues.

KiwiBryn commented 1 year ago

Is it possible that this got sorted as a side effect of another fix?

I had a device which had been running for a week and it hadn't stopped. It was sitting on a window sill, then the cleaner turned the solar panel over when dusting, so the battery went flat...

I'm going to shift it out to a window sill in the garage where no one will touch it.

@KiwiBryn

josesimoes commented 1 year ago

Yes. Or maybe it was fixed upstream, as we have had a couple of bumps in IDF versions since you reported this.

(these are the best sort of issues: they solve themselves 😅)

Waiting on your confirmation.

KiwiBryn commented 1 year ago

My test application has been running for a month doing an HTTP POST then DeepSleep for 6 minutes.

It looks like the issue has been sorted by some other fix/update.

Will leave running for another month just to double check

@KiwiBryn

josesimoes commented 1 year ago

@KiwiBryn nice! Thanks for the feedback and help on keeping the issue list tidy. 😉