letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com

Feature Request: Using the RTC memory for telemetry / crash data #1839

Open s0170071 opened 6 years ago

s0170071 commented 6 years ago

As far as I know, the RTC memory is currently only used for keeping track of the flash cycles per day. But it can do much more, debugging for example.

Howto: Using the RTC memory:

https://www.youtube.com/watch?v=r-hEOL007nw It is 512 bytes, arranged in multiples of 4 bytes. It survives deep sleep and reset, and you can read and write it as often as you like. Beware: when you write it, you have to write it all at once. You must not update individual bytes.
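A minimal sketch of the "write it all at once" advice, runnable on a PC: a small record is written and read back as one unit, guarded by a checksum so stale or corrupted content is rejected. The `Telemetry` struct, `fakeRtc`, `rtcWrite` and `rtcRead` are hypothetical names for illustration only. On a real ESP8266 the two helpers would, as far as I know, wrap `ESP.rtcUserMemoryWrite()` and `ESP.rtcUserMemoryRead()`, which take an offset in 4-byte blocks and a size in bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Host-side stand-in for the 512-byte RTC user memory (128 blocks of 4 bytes).
// Assumption: on the device these helpers would wrap the core's RTC calls.
static uint32_t fakeRtc[128] = {};

bool rtcWrite(uint32_t blockOffset, const uint32_t *data, size_t sizeBytes) {
  if (sizeBytes % 4 != 0 || blockOffset * 4 + sizeBytes > sizeof(fakeRtc)) return false;
  std::memcpy(&fakeRtc[blockOffset], data, sizeBytes);
  return true;
}

bool rtcRead(uint32_t blockOffset, uint32_t *data, size_t sizeBytes) {
  if (sizeBytes % 4 != 0 || blockOffset * 4 + sizeBytes > sizeof(fakeRtc)) return false;
  std::memcpy(data, &fakeRtc[blockOffset], sizeBytes);
  return true;
}

// Hypothetical telemetry record. Every member is 32 bits wide, so there is
// no padding and the size is a multiple of 4.
struct Telemetry {
  uint32_t previousUptime;  // seconds
  uint32_t freeHeap;        // bytes
  uint32_t checksum;        // covers the members before this one
};
static_assert(sizeof(Telemetry) % 4 == 0, "RTC memory is accessed in 4-byte blocks");

// Additive checksum with a non-zero seed, so all-zero memory is rejected.
uint32_t calcChecksum(const Telemetry &t) {
  return 0xA5A5A5A5u + t.previousUptime + t.freeHeap;
}

// Write the whole record in one call, never member by member.
bool saveTelemetry(Telemetry t) {
  t.checksum = calcChecksum(t);
  return rtcWrite(0, reinterpret_cast<const uint32_t *>(&t), sizeof(t));
}

// Read the whole record back; returns false if the checksum does not match.
bool loadTelemetry(Telemetry &t) {
  if (!rtcRead(0, reinterpret_cast<uint32_t *>(&t), sizeof(t))) return false;
  return t.checksum == calcChecksum(t);
}
```

After a power loss the checksum will most likely not match, so `loadTelemetry()` returning false is the cue to start from defaults.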

As soon as I find the time, I will play with it to implement the new %previousUptime% system variable. It will hold the uptime before the latest reset/crash. A lot of other variables could also be stored here to survive a crash.

My suggestion for the RTC memory would be :

  1. Uptime1
  2. Uptime2
  3. Uptime3
  4. Uptime4
  5. UptimePointer
  6. freeHeap
  7. freeStack
  8-18. CallStack (maybe only when debug output is enabled, requires an exception decoder compatible output)
  19. WifiEvent
  20. @TD-er indicated some other WiFi data would be interesting
  21. ...

If an email service is configured, that could allow for a small real-time crash-report.

It can also hold measuring values so that the ESP waking up from deep sleep does not necessarily need to transmit values after each wake up.
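A rough sketch of that idea, assuming a hypothetical `SampleBuffer` that would live in RTC memory: each wake-up appends one value, and only a full buffer triggers a transmission. The capacity of 8 is an arbitrary choice for illustration.

```cpp
#include <cstdint>

// Hypothetical buffer kept in RTC memory across deep-sleep cycles.
struct SampleBuffer {
  uint32_t count;      // samples stored so far
  float    values[8];  // capacity chosen arbitrarily for this sketch
};

// Store one sample. Returns true when the buffer is full and should be sent.
bool addSample(SampleBuffer &buf, float value) {
  const uint32_t capacity = sizeof(buf.values) / sizeof(buf.values[0]);
  // Start over if the count is garbage or a full buffer was never cleared.
  if (buf.count >= capacity) buf.count = 0;
  buf.values[buf.count++] = value;
  return buf.count == capacity;
}

// Call after a successful transmission.
void clearSamples(SampleBuffer &buf) {
  buf.count = 0;
}
```

With this layout the radio would only need to be powered on every eighth wake-up, at the cost of losing the buffered values on power loss.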

If all four (previous) uptimes are <10s, that could indicate a boot loop and trigger a start with a default configuration.
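The boot-loop check could look roughly like this, using the Uptime1..4 and UptimePointer entries from the list as a small ring buffer. `UptimeHistory` and the function names are hypothetical. The sketch assumes the uptime is saved periodically while running, so the last saved value is available on the next boot.

```cpp
#include <cstdint>

// Hypothetical layout: four previous uptimes plus a pointer to the slot
// that will be overwritten next.
struct UptimeHistory {
  uint32_t uptimeSec[4];
  uint32_t pointer;  // index of the next slot to write, 0..3
};

// Record the uptime of the run that just ended (call early during boot).
void recordUptime(UptimeHistory &h, uint32_t lastUptimeSec) {
  h.pointer %= 4;  // guard against an out-of-range value
  h.uptimeSec[h.pointer] = lastUptimeSec;
  h.pointer = (h.pointer + 1) % 4;
}

// The most recently recorded uptime, i.e. what %previousUptime% would show.
uint32_t previousUptime(const UptimeHistory &h) {
  return h.uptimeSec[(h.pointer + 3) % 4];
}

// True if all four recorded runs were shorter than the threshold.
bool looksLikeBootLoop(const UptimeHistory &h, uint32_t thresholdSec = 10) {
  for (uint32_t u : h.uptimeSec) {
    if (u >= thresholdSec) return false;
  }
  return true;
}
```

Note that zeroed memory would also look like a boot loop here, so the history should only be trusted when a checksum over the stored data is valid.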

You see, lots of possibilities.

As a first step, I would suggest we collect our thoughts on the matter and maybe assemble a conclusive list of data that should survive a reset / deep sleep, and what to do with it on reboot / wake-up.

TD-er commented 6 years ago

It is currently also used to keep the last values of plugins.

The ones you mention are all rather large (16 - 32 bit). Also, you may want to have a properly aligned struct to lose as little storage as possible.

An example of bad design:

This may take up to 6x 32 bits of storage. To get an idea, please have a look at this part, which I recently added to the SettingsStruct:

  // VariousBits1 defaults to 0, keep in mind when adding bit lookups.
  bool appendUnitToHostname() {  return !getBitFromUL(VariousBits1, 1); }
  void appendUnitToHostname(bool value) { setBitToUL(VariousBits1, 1, !value); }
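
To make the bit-packing idea concrete, here is a self-contained sketch. `getBit` and `setBit` are hypothetical stand-ins for `getBitFromUL` / `setBitToUL`; the real ESPEasy helpers may differ in signature and details. The second flag is invented purely for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the ESPEasy bit helpers. bitnr must be 0..31.
bool getBit(uint32_t value, uint8_t bitnr) {
  return (value >> bitnr) & 1u;
}

void setBit(uint32_t &value, uint8_t bitnr, bool bit) {
  if (bit) value |= (1u << bitnr);
  else     value &= ~(1u << bitnr);
}

struct PackedFlags {
  uint32_t VariousBits1 = 0;  // up to 32 flags in one 4-byte block

  // Stored inverted, so a zeroed word yields the default "true".
  bool appendUnitToHostname() const { return !getBit(VariousBits1, 1); }
  void appendUnitToHostname(bool value) { setBit(VariousBits1, 1, !value); }

  // Invented second flag, stored directly; its default is "false".
  bool bootLoopSuspected() const { return getBit(VariousBits1, 2); }
  void bootLoopSuspected(bool value) { setBit(VariousBits1, 2, value); }
};
static_assert(sizeof(PackedFlags) == 4, "all flags fit in a single 32-bit word");
```

Storing a flag inverted is what lets a zero-initialised word mean "default on", which is the point of the comment on VariousBits1.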

See also https://github.com/letscontrolit/ESPEasy/issues/1597

TD-er commented 6 years ago

For current implementation, see functions in Misc.ino:

And RTCStruct in ESPEasy-globals.h

It is currently used to keep track of factoryResetCounter, flashCounter and flashDayCounter (to protect flash), and also bootFailedCount and bootCounter, which are used to detect boot loops and to start disabling plugins/controllers/notifications one by one to be able to boot again.

I don't know why that struct is limited to 40 bytes, but it looks like about 15 bytes are currently used (may be more due to bad alignment). It starts at RTC_BASE_STRUCT (= 64 * 4). The user vars are stored at RTC_BASE_USERVAR (= 74 * 4) and that is defined as:

float UserVar[VARS_PER_TASK * TASKS_MAX];

=> 4 * 12 * 4 bytes + 4 for checksum. So at "address" (74 + 49) * 4 there should be some room left for new data.
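
The arithmetic can be checked at compile time. This sketch assumes VARS_PER_TASK = 4 and TASKS_MAX = 12, which is what the 49 blocks in "(74 + 49)" imply for an array of 4-byte floats plus a 4-byte checksum; the actual values should be verified against ESPEasy-globals.h. The lowercase constant names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Block offsets quoted in the comment above; one block is 4 bytes.
constexpr size_t RTC_BASE_STRUCT  = 64;
constexpr size_t RTC_BASE_USERVAR = 74;
constexpr size_t BLOCK_SIZE       = 4;

// Assumed values, not checked against the ESPEasy source.
constexpr size_t VARS_PER_TASK = 4;
constexpr size_t TASKS_MAX     = 12;

// Room reserved for RTCStruct before the user vars begin.
constexpr size_t structBytes = (RTC_BASE_USERVAR - RTC_BASE_STRUCT) * BLOCK_SIZE;

// UserVar array plus its 4-byte checksum.
constexpr size_t userVarBytes  = VARS_PER_TASK * TASKS_MAX * sizeof(float) + 4;
constexpr size_t userVarBlocks = userVarBytes / BLOCK_SIZE;

// First block after the user vars, and the matching byte address.
constexpr size_t firstFreeBlock = RTC_BASE_USERVAR + userVarBlocks;
constexpr size_t firstFreeByte  = firstFreeBlock * BLOCK_SIZE;

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");
static_assert(structBytes == 40, "explains the 40-byte limit of RTCStruct");
static_assert(userVarBytes == 196, "192 bytes of floats + 4 bytes checksum");
static_assert(userVarBlocks == 49, "matches the 49 in (74 + 49)");
static_assert(firstFreeBlock == 123, "first block free for new data");
```

This also suggests where the 40-byte limit comes from: the user vars start 10 blocks after RTCStruct.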

On the other hand, this is some kind of cache, which will be lost and re-initialized after power loss. So we're not really bound to the current layout definitions, as long as we're not using parts already used by the core libraries.

s0170071 commented 6 years ago

Some more info to consider (so it does not get lost): There is a custom_crash_callback used by core_esp8266_postmortem.c. Somebody uses this to save the stack and email it on reboot. You could also toggle a GPIO if you like. There is also a library .
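
A host-runnable sketch of the stack-saving part. `CrashRecord` and `captureStack` are hypothetical; the record is sized to the 11 CallStack slots (8-18) proposed earlier in this thread. To my knowledge the ESP8266 core's hook has the form `extern "C" void custom_crash_callback(struct rst_info *rst_info, uint32_t stack, uint32_t stack_end)`, so on the device the two addresses would be cast to pointers before calling a helper like this one.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical crash record: exception cause, number of valid words,
// and up to 11 words copied from the stack.
struct CrashRecord {
  uint32_t exccause;
  uint32_t wordCount;
  uint32_t stackWords[11];
};

// Copy at most 11 words from [stack, stackEnd) into the record.
// Assumption: on the device this would be called from the crash hook and
// the record then written to RTC memory in one go.
void captureStack(CrashRecord &rec, uint32_t exccause,
                  const uint32_t *stack, const uint32_t *stackEnd) {
  rec.exccause = exccause;
  rec.wordCount = 0;
  const size_t maxWords = sizeof(rec.stackWords) / sizeof(rec.stackWords[0]);
  for (const uint32_t *p = stack; p < stackEnd && rec.wordCount < maxWords; ++p) {
    rec.stackWords[rec.wordCount++] = *p;
  }
}
```

Eleven words will rarely hold a full call stack, so a real implementation would probably filter for values that look like code addresses before storing them.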