TheTorProject / lepidopter

lepidopter: raspberry pi image for conducting OONI network measurements
https://ooni.torproject.org/
GNU General Public License v3.0

Add a watchdog daemon to auto reboot a hung Raspberry Pi #96

Closed: anadahz closed this issue 7 years ago

anadahz commented 7 years ago

The Raspberry Pi comes with a hardware watchdog timer that resets the board when the system hangs and stops feeding the timer. A watchdog service should be added so that the hardware watchdog is actually used.
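For context, here is a minimal sketch of what a watchdog service does with the device node: open it, write to it at a regular interval, and optionally disarm it on a clean exit. The function name `feed_watchdog` and its parameters are hypothetical and only for illustration. This is not how the `watchdog` package is implemented, and the "magic close" step assumes the driver supports it.

```python
import time


def feed_watchdog(device="/dev/watchdog", interval=2.0, beats=5, sleep=time.sleep):
    """Write a keepalive to `device` `beats` times, then ask it to disarm.

    Hypothetical helper for illustration; a real daemon loops forever
    and usually runs its health checks before each keepalive.
    """
    # Opening the device arms the timer. Unbuffered mode makes sure
    # every write reaches the driver immediately.
    with open(device, "wb", buffering=0) as wd:
        for _ in range(beats):
            wd.write(b"\0")  # any write counts as a keepalive
            sleep(interval)
        # "Magic close": writing 'V' right before closing asks the driver
        # to disarm the timer. Drivers configured with "nowayout" ignore it.
        wd.write(b"V")
```

If the process stops writing for longer than the configured timeout, the hardware resets the board, which is the behaviour this issue asks for. Running this against a real `/dev/watchdog` needs root and will arm the timer, so try it on a scratch file first.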

darkk commented 7 years ago

Also, it's unclear to me what the correct watchdog-device setting is. It seems /dev/watchdog is the software watchdog with major/minor == 10/130, and the hardware watchdog is 253/0.
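The major/minor numbers mentioned above can be read with a few lines of Python. This is only a sketch; the helper name `device_numbers` is made up for illustration. Note that on kernels using the watchdog core, the legacy /dev/watchdog node (misc device 10/130) and /dev/watchdog0 may be backed by the same driver, so the numbers alone might not tell a software watchdog from a hardware one. That is an assumption about the kernel in use, not something confirmed in this thread.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for the character device node at `path`."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    # Print the numbers for whichever watchdog nodes exist on this system.
    for node in ("/dev/watchdog", "/dev/watchdog0"):
        if os.path.exists(node):
            print(node, device_numbers(node))
```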

anadahz commented 7 years ago

With either /dev/watchdog or /dev/watchdog0 as the device, the daemon identifies the hardware watchdog of the Raspberry Pi (Revision: a01041, Pi 2 Model B v1.1):

● watchdog.service - watchdog daemon
   Loaded: loaded (/lib/systemd/system/watchdog.service; static)
   Active: active (running)
  Process: 457 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/watchdog $watchdog_options (code=exited, status=0/SUCCESS)
  Process: 451 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watchdog_module}" = "none" ] || /sbin/modprobe $watchdog_module (code=exited, status=0/SUCCESS)
 Main PID: 461 (watchdog)
   CGroup: /system.slice/watchdog.service
           └─461 /usr/sbin/watchdog

lepidopter watchdog[461]: int=2s realtime=yes sync=no soft=no mla=0 mem=0
lepidopter watchdog[461]: ping: no machine to check
lepidopter watchdog[461]: file: no file to check
lepidopter watchdog[461]: pidfile: no server process to check
lepidopter watchdog[461]: interface: no interface to check
lepidopter watchdog[461]: temperature: no sensors to check
lepidopter watchdog[461]: test=none(0) repair=none(0) alive=/dev/watchdog heartbeat=none to=root no_act=no force=no
lepidopter watchdog[461]: watchdog now set to 10 seconds
lepidopter watchdog[461]: hardware watchdog identity: Broadcom BCM2835 Watchdog timer
lepidopter systemd[1]: Started watchdog daemon.

darkk commented 7 years ago

OK, cool, the line "hardware watchdog identity: Broadcom BCM2835 Watchdog timer" seems to be the answer. I haven't seen that line in my logs :)
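To check a log for that identity line without reading it by eye, something like the sketch below could work. The function name `watchdog_identity` is hypothetical, and the pattern is based only on the log line quoted in this thread, so other versions of the daemon may word it differently.

```python
import re

# Pattern taken from the log line quoted in this thread (an assumption
# about the daemon's output format, not a documented interface).
IDENTITY_RE = re.compile(r"hardware watchdog identity:\s*(.+)$")


def watchdog_identity(log_lines):
    """Return the reported hardware watchdog name, or None if absent."""
    for line in log_lines:
        match = IDENTITY_RE.search(line)
        if match:
            return match.group(1).strip()
    return None
```

It takes any iterable of lines, for example the output of `journalctl -u watchdog` split on newlines. A result of `None` means the daemon never logged a hardware identity, which matches the case where the line is missing from the logs.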

anadahz commented 7 years ago

This has been implemented in: https://github.com/TheTorProject/lepidopter/releases/tag/v1.0.0