monome / norns

norns is many sound instruments.
http://monome.org
GNU General Public License v3.0

watchdog process? #1266

Closed: catfact closed this issue 3 years ago

catfact commented 3 years ago

a couple of points related to crash report / recovery:

(1) the K1+K2+K3 combo reset doesn't work when matron has crashed
(2) some way of getting logs or backtraces would be helpful

--

(1) is insoluble as long as that watchdog routine is in matron
(2) if we just revert ws-wrapper to duplicate stdout in services, we also get dupe output in some cases where we don't want it, and we create a lot of logging traffic that feeds into other issues with disk space and whatever. (and in any case, the stdout of matron is rarely helpful in diagnosing a crash, as opposed to just confirming that it happened.)


so: would it be crazy to introduce a new process that independently listens to GPIO (and/or socket) to restart other things when needed?
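
a minimal sketch of the combo-detection part of that idea, assuming a separate process that receives key up/down events from somewhere and calls a restart hook. the class name, the hold time, and the `on_reset` hook are all hypothetical; nothing here is taken from the norns codebase.

```python
from typing import Callable, Dict, Optional

HOLD_SECONDS = 10.0  # hypothetical hold time; the thread does not specify one


class ComboWatcher:
    """Call on_reset once when K1, K2 and K3 have all been held for `hold` seconds."""

    def __init__(self, on_reset: Callable[[], None], hold: float = HOLD_SECONDS):
        self.on_reset = on_reset
        self.hold = hold
        self.down: Dict[int, bool] = {1: False, 2: False, 3: False}
        self.all_down_since: Optional[float] = None
        self.fired = False

    def key_event(self, key: int, pressed: bool, now: float) -> None:
        """Record a key press or release observed at time `now` (seconds)."""
        if key not in self.down:
            return
        self.down[key] = pressed
        if all(self.down.values()):
            # start timing only on the transition to "all three held"
            if self.all_down_since is None:
                self.all_down_since = now
        else:
            # any release cancels the combo and re-arms the watcher
            self.all_down_since = None
            self.fired = False

    def tick(self, now: float) -> bool:
        """Poll periodically; returns True on the tick that triggers the reset."""
        if self.all_down_since is None or self.fired:
            return False
        if now - self.all_down_since >= self.hold:
            self.fired = True
            self.on_reset()
            return True
        return False
```

in a real watchdog the `on_reset` hook would restart whatever needs restarting (for example by asking the service manager to restart matron); the point of the sketch is only that this logic has no dependency on matron being alive.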

it also seems feasible that this thing could know the PIDs of the main processes and use a combo of core dump and ptrace (or something?) to extract backtrace after crash.
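
as a rough illustration of the "know the PIDs" half of this, the sketch below checks whether a set of tracked processes still exists by sending signal 0, which performs the permission and existence checks without delivering a signal. the function names and the PID table are hypothetical, and extracting a backtrace (core dumps, ptrace) is deliberately not shown.

```python
import os
from typing import Dict, List


def is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists.

    Caveat: a zombie (exited but not yet reaped) still counts as existing.
    """
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # the process exists but belongs to another user
        return True
    return True


def dead_processes(pids: Dict[str, int]) -> List[str]:
    """Return the names of tracked processes that no longer exist, sorted."""
    return sorted(name for name, pid in pids.items() if not is_alive(pid))
```

a watchdog could call `dead_processes({"matron": some_pid})` on a timer and react when the list is non-empty. if the processes are run by a service manager, asking it for their state is likely a more robust source of truth than raw PIDs, since PIDs can be reused.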

tehn commented 3 years ago

i'd actually considered knob/encoder collection as a good candidate for a separate process. your idea is good. what do you think the best interface would be to get this data to matron? just a socket?

i like the idea of this thing also doing some monitoring, if there isn't a linuxy way of doing it already (which is beyond my knowledge)

catfact commented 3 years ago

i didn't mean collecting GPIO and forwarding it, just monitoring K GPIO in parallel

tehn commented 3 years ago

makes sense, i don't know why i thought there'd be a lock or something
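
on the "lock" question: if the keys are exposed as a Linux input event device (an assumption here, not something stated in this thread), the kernel hands each reader that opens the device its own copy of the events, unless one client grabs the device exclusively. a second process can therefore watch the same keys without forwarding anything. the sketch below parses raw `struct input_event` records, assuming the native layout where the timestamp is two C longs.

```python
import struct
from typing import Iterator, NamedTuple

# struct input_event: timeval (two longs), __u16 type, __u16 code, __s32 value
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 0x01  # event type for key/button state changes


class KeyEvent(NamedTuple):
    code: int
    pressed: bool


def parse_key_events(buf: bytes) -> Iterator[KeyEvent]:
    """Yield press/release events from a buffer of raw input_event records."""
    for offset in range(0, len(buf) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, etype, code, value = struct.unpack_from(EVENT_FORMAT, buf, offset)
        # value 1 = press, 0 = release; 2 (autorepeat) and other types are skipped
        if etype == EV_KEY and value in (0, 1):
            yield KeyEvent(code, value == 1)
```

the watcher would open the event device read-only in binary mode, read multiples of `EVENT_SIZE` bytes, and feed the resulting events into its combo logic; the device path and the key codes depend on the hardware and are left out here.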

On Wed, Dec 9, 2020 at 3:41 PM ezra buchla wrote:

> i didn't mean collecting GPIO and forwarding it, just monitoring K GPIO in parallel


tehn commented 3 years ago

making progress, i will take the lead on this

catfact commented 3 years ago

watchdog process exists as of https://github.com/monome/norns/pull/1020

more specific concerns can go in new issues