kerberos-io / machinery

(DEPRECATED) An open source image processing framework, which uses your USB-, IP- or RPi-camera to recognize events (e.g. motion).
https://www.kerberos.io
489 stars 102 forks source link

Crash after configuration change, device left in blocking state (?), needs reboot #190

Open nopmop opened 5 years ago

nopmop commented 5 years ago

Description When I set the option to call a bash script after a motion event kerberosio keeps crashing (being respawned by systemd). Reverting back the config doesn't work (settings are not stored anymore), and running the process manually shows only the following:

[...]
25/02/2019 22:36:21.856 INFO  [trivial] Capture: Succesfully opened Raspberry Pi camera module.
25/02/2019 22:36:26.857 INFO  [trivial] Capture: Start streaming thread
25/02/2019 22:36:26.858 INFO  [trivial] Stream: Configured stream on port 8889 with quality: 75
25/02/2019 22:36:26.858 INFO  [trivial] Starting cloud service: S3
25/02/2019 22:36:26.886 INFO  [trivial] Starting conditions: Enabled
25/02/2019 22:36:26.887 INFO  [trivial] Starting algorithm: DifferentialCollins
25/02/2019 22:36:26.888 INFO  [trivial] Starting expositor: Hull
25/02/2019 22:36:27.761 INFO  [trivial] Starting heuristic: Sequence
25/02/2019 22:36:27.761 INFO  [trivial] Starting io devices: Video
25/02/2019 22:36:31.859 INFO  [trivial] Capture: devices is blocking, and not grabbing any more frames.
terminate called after throwing an instance of 'kerberos::KerberosCouldNotGrabFromCamera'
  what():  Kerberos : Capture : Could not grab from capture: devices is blocking, and not grabbin any more frames.
Aborted

After reboot & config revert things start to operate normally again. It looks like after the first crash (maybe happening for trivial reasons) a deadlock is happening because the camera is left in an unclean state.

To Reproduce Steps to reproduce the behavior:

  1. Go to Configuration -> Motion --- (click next screen * 2) ---> enable SCRIPT output

Expected behavior Crashes can happen, but a reboot should not be required to fix things, I guess.

Screenshots image

Operating System and Device:

Additional context

A strace shows the following:

[...]

[pid  2409] open("/etc/opt/kerberosio/symbols/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 15
[pid  2409] fstat64(15, {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
[pid  2409] getdents64(15, /* 2 entries */, 32768) = 48
[pid  2409] getdents64(15, /* 0 entries */, 32768) = 0
[pid  2409] close(15)                   = 0
[pid  2409] nanosleep({tv_sec=1, tv_nsec=500000000},  <unfinished ...>
[pid  2392] write(14, "\3676\254t\326\271\262\vtcBI\303\250\245\226\322]1\362J\356u\332\230\31\r\376'\332\254\331"..., 4096) = 4096
[pid  2392] write(14, "NZ+\235\321]\311-lY\201DA#/>i\253\321\332\317\252\333}\2322\335Aun\v}"..., 4096) = 4096
[pid  2392] write(14, "\320v\372\326\25[E\246\306;\313n\301\0262\3ch\307\31\364>\265>\234\261\220\f\222\365S\204"..., 1824) = 1824
[pid  2392] close(14)                   = 0
[pid  2392] inotify_init()              = 14
[pid  2392] inotify_add_watch(14, "/etc/opt/kerberosio/config", IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE) = 1
[pid  2392] _newselect(15, [14], NULL, NULL, {tv_sec=0, tv_usec=0}) = 0 (Timeout)
[pid  2392] mmap2(NULL, 5763072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x656b9000
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2392] mmap2(NULL, 1921024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6adbe000
[pid  2392] futex(0x251024c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510248, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2432] <... futex resumed> )       = 0
[pid  2392] futex(0x25102ac, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25102a8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2433] <... futex resumed> )       = 0
[pid  2392] futex(0x251030c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510308, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2392] futex(0x251036c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510368, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2392] futex(0x25103cc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25103c8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2392] futex(0x251042c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510428, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2437] <... futex resumed> )       = 0
[pid  2392] futex(0x251048c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510488, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2438] <... futex resumed> )       = 0
[pid  2392] futex(0x251046c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2438] futex(0x251046c, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2392] <... futex resumed> )       = 0
[pid  2438] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  2392] futex(0x25104ec, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25104e8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2439] <... futex resumed> )       = 0
[pid  2436] <... futex resumed> )       = 0
[pid  2439] futex(0x25104cc, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2435] <... futex resumed> )       = 0
[pid  2434] <... futex resumed> )       = 0
[pid  2392] futex(0x25104cc, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2392] futex(0x250f99c, FUTEX_WAIT_PRIVATE, 11, NULL <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2438] futex(0x251046c, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2439] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  2437] futex(0x251040c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2439] futex(0x25104cc, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2439] futex(0x25104ec, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2438] futex(0x251048c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2437] <... futex resumed> )       = 0
[pid  2436] futex(0x25103ac, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2437] futex(0x251042c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2436] <... futex resumed> )       = 0
[pid  2436] futex(0x25103cc, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2435] futex(0x250f99c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x250f998, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
[pid  2434] futex(0x25102ec, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2392] <... futex resumed> )       = 0
[pid  2435] <... futex resumed> )       = 1
[pid  2392] futex(0x250f980, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2435] futex(0x250f980, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2392] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid  2435] <... futex resumed> )       = 0
[pid  2392] futex(0x250f980, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2434] <... futex resumed> )       = 0
[pid  2392] <... futex resumed> )       = 0
[pid  2392] munmap(0x656b9000, 5763072 <unfinished ...>
[pid  2435] futex(0x251034c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2434] futex(0x251030c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2433] futex(0x251028c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2432] futex(0x251022c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2435] <... futex resumed> )       = 0
[pid  2433] <... futex resumed> )       = 0
[pid  2435] futex(0x251036c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2433] futex(0x25102ac, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2432] <... futex resumed> )       = 0
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2432] futex(0x251024c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10} <unfinished ...>
[pid  2392] <... munmap resumed> )      = 0
[pid  2406] <... _newselect resumed> )  = 0 (Timeout)
[pid  2392] mmap2(NULL, 1921024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2392] <... mmap2 resumed> )       = 0x65a63000
[pid  2392] mmap2(NULL, 1921024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6588e000
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2392] mmap2(NULL, 1921024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x656b9000
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2392] futex(0x251024c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510248, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2432] <... futex resumed> )       = 0
[pid  2392] futex(0x25102ac, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25102a8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2392] futex(0x251030c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510308, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2434] <... futex resumed> )       = 0
[pid  2392] futex(0x251036c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510368, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2392] futex(0x25103cc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25103c8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
[pid  2435] <... futex resumed> )       = 0
[pid  2392] <... futex resumed> )       = 1
[pid  2392] futex(0x251042c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510428, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
[pid  2436] <... futex resumed> )       = 0
[pid  2392] <... futex resumed> )       = 1
[pid  2437] <... futex resumed> )       = 0
[pid  2392] futex(0x251048c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2510488, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
[pid  2438] <... futex resumed> )       = 0
[pid  2392] futex(0x25104ec, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x25104e8, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
[pid  2438] futex(0x251046c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2392] <... futex resumed> )       = 1
[pid  2439] <... futex resumed> )       = 0
[pid  2392] futex(0x250f99c, FUTEX_WAIT_PRIVATE, 13, NULL <unfinished ...>
[pid  2439] futex(0x25104cc, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2438] <... futex resumed> )       = 0
[pid  2439] <... futex resumed> )       = 0
[pid  2439] futex(0x25104ec, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2438] futex(0x251048c, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2437] futex(0x251040c, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  2436] futex(0x25103ac, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2437] futex(0x251042c, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2436] <... futex resumed> )       = 0
[pid  2435] futex(0x251034c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2436] futex(0x25103cc, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2435] <... futex resumed> )       = 0
[pid  2435] futex(0x251036c, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2434] futex(0x25102ec, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2433] <... futex resumed> )       = 0
[pid  2434] <... futex resumed> )       = 0
[pid  2433] futex(0x250f99c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x250f998, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
[pid  2392] <... futex resumed> )       = 0
[pid  2434] futex(0x251030c, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2392] futex(0x250f980, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  2433] <... futex resumed> )       = 1
[pid  2433] futex(0x250f980, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2432] futex(0x251022c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2392] <... futex resumed> )       = 0
[pid  2392] futex(0x250f980, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2433] <... futex resumed> )       = 1
[pid  2392] <... futex resumed> )       = 0
[pid  2433] futex(0x251028c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  2432] <... futex resumed> )       = 0
[pid  2433] <... futex resumed> )       = 0
[pid  2432] futex(0x251024c, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2433] futex(0x25102ac, FUTEX_WAIT_PRIVATE, 15, NULL <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2431] <... nanosleep resumed> NULL) = 0
[pid  2431] nanosleep({tv_sec=0, tv_nsec=500000000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2392] nanosleep({tv_sec=1, tv_nsec=0},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2406] <... nanosleep resumed> NULL) = 0
[pid  2406] _newselect(10, [9], NULL, NULL, {tv_sec=0, tv_usec=10}) = 0 (Timeout)
[pid  2406] nanosleep({tv_sec=0, tv_nsec=33333000},  <unfinished ...>
[pid  2404] <... nanosleep resumed> NULL) = 0
[pid  2404] nanosleep({tv_sec=0, tv_nsec=33300000},  <unfinished ...>
[pid  2405] <... nanosleep resumed> NULL) = 0
[pid  2405] _llseek(4, 0, [275124], SEEK_CUR) = 0
[pid  2405] _llseek(4, 0, [275124], SEEK_END) = 0
[pid  2405] _llseek(4, 0, [275124], SEEK_CUR) = 0
[pid  2405] _llseek(4, 275124, [275124], SEEK_SET) = 0
[pid  2405] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2692, ...}) = 0
[pid  2405] write(4, "25/02/2019 22:32:00.003 INFO  [t"..., 104) = 104
[pid  2405] _llseek(4, 0, [275228], SEEK_CUR) = 0
[pid  2405] _llseek(5, 275228, [275228], SEEK_SET) = 0
[pid  2405] _llseek(4, 0, [275228], SEEK_CUR) = 0
[pid  2405] _llseek(6, 275228, [275228], SEEK_SET) = 0
[pid  2405] _llseek(4, 0, [275228], SEEK_CUR) = 0
[pid  2405] _llseek(7, 275228, [275228], SEEK_SET) = 0
[pid  2405] write(1, "25/02/2019 22:32:00.003 INFO  [t"..., 10425/02/2019 22:32:00.003 INFO  [trivial] Capture: devices is blocking, and not grabbing any more frames.
) = 104
[pid  2405] write(2, "terminate called after throwing "..., 48terminate called after throwing an instance of ') = 48
[pid  2405] write(2, "kerberos::KerberosCouldNotGrabFr"..., 40kerberos::KerberosCouldNotGrabFromCamera) = 40
[pid  2405] write(2, "'\n", 2'
)          = 2
[pid  2405] write(2, "  what():  ", 11  what():  ) = 11
[pid  2405] write(2, "Kerberos : Capture : Could not g"..., 103Kerberos : Capture : Could not grab from capture: devices is blocking, and not grabbin any more frames.) = 103
[pid  2405] write(2, "\n", 1
)           = 1
[pid  2405] rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
[pid  2405] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
[pid  2405] getpid()                    = 2392
[pid  2405] gettid()                    = 2405
[pid  2405] tgkill(2392, 2405, SIGABRT) = 0
[pid  2405] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  2405] --- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=2392, si_uid=0} ---
[pid  2438] <... futex resumed>)        = ?
[pid  2437] <... futex resumed>)        = ?
[pid  2434] <... futex resumed>)        = ?
[pid  2392] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2439] <... futex resumed>)        = ?
[pid  2435] <... futex resumed>)        = ?
[pid  2436] <... futex resumed>)        = ?
[pid  2433] <... futex resumed>)        = ?
[pid  2432] <... futex resumed>)        = ?
[pid  2431] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2430] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2414] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2413] <... pselect6 resumed> <unfinished ...>) = ?
[pid  2409] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2408] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2406] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2404] <... nanosleep resumed> <unfinished ...>) = ?
[pid  2403] <... futex resumed>)        = ?
[pid  2402] <... futex resumed>)        = ?
[pid  2401] <... futex resumed>)        = ?
[pid  2400] <... futex resumed>)        = ?
[pid  2399] <... futex resumed>)        = ?
[pid  2398] <... futex resumed>)        = ?
[pid  2397] <... futex resumed>)        = ?
[pid  2396] <... ioctl resumed> <unfinished ...>) = ?
[pid  2438] +++ killed by SIGABRT +++
[pid  2434] +++ killed by SIGABRT +++
[pid  2437] +++ killed by SIGABRT +++
[pid  2439] +++ killed by SIGABRT +++
[pid  2435] +++ killed by SIGABRT +++
[pid  2433] +++ killed by SIGABRT +++
[pid  2432] +++ killed by SIGABRT +++
[pid  2436] +++ killed by SIGABRT +++
[pid  2431] +++ killed by SIGABRT +++
[pid  2430] +++ killed by SIGABRT +++
[pid  2414] +++ killed by SIGABRT +++
[pid  2413] +++ killed by SIGABRT +++
[pid  2409] +++ killed by SIGABRT +++
[pid  2408] +++ killed by SIGABRT +++
[pid  2406] +++ killed by SIGABRT +++
[pid  2404] +++ killed by SIGABRT +++
[pid  2403] +++ killed by SIGABRT +++
[pid  2402] +++ killed by SIGABRT +++
[pid  2401] +++ killed by SIGABRT +++
[pid  2400] +++ killed by SIGABRT +++
[pid  2399] +++ killed by SIGABRT +++
[pid  2398] +++ killed by SIGABRT +++
[pid  2397] +++ killed by SIGABRT +++
[pid  2396] +++ killed by SIGABRT +++
[pid  2405] +++ killed by SIGABRT +++
+++ killed by SIGABRT +++
ChieftainY2k commented 4 years ago

It seems I have exactly same problem with kerberos machinery in docker container.

Restarting container does not help, the only thing that works is system reboot.

After the configuration update I get:

kerberos_1            | [OMX.broadcom.video_encode]Wating state to be 3
kerberos_1            | 21/04/2020 10:12:59.509 INFO  [trivial] Capture: Succesfully opened Raspberry Pi camera module.
kerberos_1            | 21/04/2020 10:13:04.510 INFO  [trivial] Capture: Start streaming thread
kerberos_1            | 21/04/2020 10:13:04.511 INFO  [trivial] Stream: Configured stream on port 8889 with quality: 75
kerberos_1            | 21/04/2020 10:13:04.511 INFO  [trivial] Starting cloud service: S3
kerberos_1            | 21/04/2020 10:13:04.528 INFO  [trivial] Starting conditions: Enabled
kerberos_1            | 21/04/2020 10:13:04.529 INFO  [trivial] Starting algorithm: DifferentialCollins
kerberos_1            | 21/04/2020 10:13:04.529 INFO  [trivial] Starting expositor: Hull
kerberos_1            | 21/04/2020 10:13:04.923 INFO  [trivial] Starting heuristic: Sequence
kerberos_1            | 21/04/2020 10:13:04.924 INFO  [trivial] Starting io devices: Video,Disk,MQTT
kerberos_1            | 21/04/2020 10:13:05.833 INFO  [trivial] HullExpositor: activity detected from (9,8) to (208,45)
kerberos_1            | 21/04/2020 10:13:09.512 INFO  [trivial] Capture: devices is blocking, and not grabbing any more frames.
kerberos_1            | /container-runner.sh: line 96: 32701 Aborted                 (core dumped) kerberosio

System info:

[2020-04-21 10:16:06][garda] Hardware: Raspberry Pi 3 Model B Plus Rev 1.3
[2020-04-21 10:16:06][garda] Kernel: Linux camback 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux
[2020-04-21 10:16:06][garda] OS: PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
[2020-04-21 10:16:06][garda] Raspberry version for kerberos: 3

Steps to reproduce:

  1. Run kerberos in a container
  2. Update configuration with web interface
cedricve commented 4 years ago

@ChieftainY2k You didnt had this with the 2.6.1?

ChieftainY2k commented 4 years ago

@cedricve Indeed I did have it but I thought it was some system misconfiguration on my part.
After thorough testing I did yesterday it's present on RPI2/3/4 (I was able to test it with RPI4 which I got yesterday , significant upgrade, yay! :-) )

However it does not happen every time (usually after 3rd or 4th configuration update), strange....

@ChieftainY2k You didnt had this with the 2.6.1?

cedricve commented 4 years ago

@cedricve Indeed I did have it but I thought it was some system misconfiguration on my part. After thorough testing I did yesterday it's present on RPI2/3/4 (I was able to test it with RPI4 which I got yesterday , significant upgrade, yay! :-) )

However it does not happen every time, strange....

@ChieftainY2k You didnt had this with the 2.6.1?

Cool ;) Does it occur when you dont use the script function?

ChieftainY2k commented 4 years ago

Cool ;) Does it occur when you dont use the script function?

Unfortunately yes, it does :( here's the config I use:

kerberos.zip

ChieftainY2k commented 4 years ago

@cedricve I can open up ssh via the ngrok tunnel for you to look inside the kerberos container to fiddle around and check it out if you want :)

cedricve commented 4 years ago

thanks, just wondering if you are using latest version of the Raspbian OS?

ChieftainY2k commented 4 years ago

I'm using the newest Raspbian Buster with the latest apt-get updates.

thanks, just wondering if you are using latest version of the Raspbian OS?

ChieftainY2k commented 4 years ago

May be unrelated but sometimes this appears in the /var/log/kern.log (machinery in docker container)

Apr 21 15:15:14 raspberry4 kernel: [ 7739.571559] INFO: task kerberosio:16407 blocked for more than 120 seconds.
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571570] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571575] kerberosio      D    0 16407   2501 0x00000004
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571606] [<c09c95cc>] (schedule) from [<c09cd5b4>] (schedule_timeout+0x200/0x428)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571622] [<c09cbb48>] (__down) from [<c0278cd0>] (down+0x54/0x68)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571643] [<c086d590>] (vchiq_release) from [<c03d66fc>] (__fput+0xa4/0x1e8)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571661] [<c03d68b0>] (____fput) from [<c02426dc>] (task_work_run+0xbc/0xe0)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571679] [<c0226bd0>] (do_exit) from [<c0227434>] (do_group_exit+0x4c/0xe4)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571695] [<c0233780>] (get_signal) from [<c020ca1c>] (do_signal+0x364/0x474)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571709] [<c020cd38>] (do_work_pending) from [<c0201034>] (slow_work_pending+0xc/0x20)
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571718] ffa0:                                     aa434b40 00000000 00000002 00000000
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571724] ffc0: 00000018 00000000 00000000 000000a2 03b7a338 b0680860 aa434ca8 b06808d8
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571729] ffe0: 00000000 aa434b38 00000000 b633b030 80000010 aa434b40
Apr 21 15:15:14 raspberry4 kernel: [ 7739.571736] INFO: task kworker/2:1:14577 blocked for more than 120 seconds.