Closed by grewhit25 4 years ago
Hi! Thanks for your report. This appears to be a crash in https://github.com/stanford-oval/node-pulseaudio/ . I will investigate.
By the way, if you could obtain a backtrace of the crashing node process, that would be really helpful.
To do so, install debug symbols for PulseAudio and related libraries (sudo apt install libpulse-dbg, IIRC), then run Almond inside gdb:
gdb --args node ./main.js
Type run to start the process, and let it run. When it crashes, gdb will return to a prompt, and you can obtain a backtrace of every thread with thread apply all bt full.
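The same steps can also be run non-interactively. The following is only a sketch: the function name and the default log file name are made up for illustration, and it assumes gdb and node are on the PATH and that main.js is in the current directory.

```shell
# Run Almond under gdb in batch mode and save a full backtrace of all threads.
# capture_backtrace and the default log name are illustrative, not part of Almond.
capture_backtrace() {
    log="${1:-almond_backtrace.log}"
    # -batch makes gdb exit after the last -ex command;
    # --args passes everything after it to the program being debugged.
    gdb -batch \
        -ex 'run' \
        -ex 'thread apply all bt full' \
        --args node ./main.js > "$log" 2>&1
}
```

Calling capture_backtrace with no argument writes the output to almond_backtrace.log in the current directory.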
Thank you!
@gcampax Thanks for your reply. The only problem is that I'm running almond in a podman container, which makes setting up debugging a little complicated.
For completeness my environment is:
Ubuntu 19.10 (Eoan Ermine)
almond-server 1.7.2
fuse-overlayfs
podman-composer version 0.1.6dev
podman version 1.6.2
Container image created from the official Dockerfile with Fedora 30.
I was able to modify the container so that I can exec into it and run gdb manually. However, the libpulse-dbg package requested above is not available on Fedora, so I was not able to capture any meaningful debug info for this failure. It does at least show that we can debug from within the container.
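A rough sketch of how gdb can be run inside a podman container follows. It is an assumption about a typical setup, not the exact change made to this container: the container name almond-server is a placeholder, and it assumes the container is already running, that dnf is usable inside it, and that main.js is in the container's working directory.

```shell
# CONTAINER is a placeholder name, not taken from this report.
CONTAINER="${CONTAINER:-almond-server}"

# Install gdb inside the running container (the image is Fedora-based, so dnf is used).
install_gdb() {
    podman exec -it "$CONTAINER" dnf install -y gdb
}

# Start Almond under gdb inside the container.
run_under_gdb() {
    podman exec -it "$CONTAINER" gdb --args node ./main.js
}
```

If gdb reports that ptrace is not permitted, the container may need to be started with --cap-add=SYS_PTRACE; whether that is required depends on the runtime's defaults.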
The gdb output below lists the debuginfo packages we can choose from to gather a more detailed backtrace.
Running gdb --args node ./main.js from within the container produced the following output on failure.
Assertion 're->data || re->memblock' failed at pulsecore/pstream.c:862, function do_read(). Aborting.
Thread 1 "node" received signal SIGABRT, Aborted.
0x0000fffff5fb9bd0 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install alsa-lib-1.2.1.2-3.fc30.aarch64 dbus-libs-1.12.16-1.fc30.aarch64
flac-libs-1.3.2-10.fc30.aarch64 gsm-1.0.18-4.fc30.aarch64 http-parser-2.9.2-1.fc30.aarch64 libICE-1.0.9-15.fc30.aarch64
libSM-1.2.3-2.fc30.aarch64 libX11-1.6.7-1.fc30.aarch64 libX11-xcb-1.6.7-1.fc30.aarch64 libXau-1.0.9-1.fc30.aarch64
libXext-1.3.3-11.fc30.aarch64 libXi-1.7.10-1.fc30.aarch64 libXtst-1.2.3-9.fc30.aarch64 libasyncns-0.8-16.fc30.aarch64
libcanberra-0.30-19.fc30.aarch64 libcap-2.26-5.fc30.aarch64 libgcc-9.2.1-1.fc30.aarch64 libgcrypt-1.8.5-1.fc30.aarch64
libgpg-error-1.33-2.fc30.aarch64 libicu-63.2-2.fc30.aarch64 libnghttp2-1.39.2-1.fc30.aarch64 libogg-1.3.3-2.fc30.aarch64
libsndfile-1.0.28-10.fc30.aarch64 libstdc++-9.2.1-1.fc30.aarch64 libtdb-1.3.18-1.fc30.aarch64 libtool-ltdl-2.4.6-29.fc30.aarch64
libuuid-2.33.2-2.fc30.aarch64 libuv-1.34.0-1.fc30.aarch64 libvorbis-1.3.6-4.fc30.aarch64 libxcb-1.13.1-2.fc30.aarch64
libxcrypt-4.4.10-1.fc30.aarch64 lz4-libs-1.9.1-1.fc30.aarch64 mimic-1.2.0.2-9.fc30.aarch64 openssl-libs-1.1.1d-2.fc30.aarch64
pulseaudio-libs-12.2-9.fc30.aarch64 systemd-libs-241-12.git323cdf4.fc30.aarch64 xz-libs-5.2.4-5.fc30.aarch64 zlib-1.2.11-19.fc30.aarch6
(gdb)
(gdb)
(gdb) set logging file /var/lib/almond-server/pulsecore_backtrace.log
(gdb) set logging on
Copying output to /var/lib/almond-server/pulsecore_backtrace.log.
(gdb) thread apply all bt full
Thread 12 (Thread 0xffffc1bcd1b0 (LWP 101)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff5f00640 in worker () from /lib64/libuv.so.1
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 11 (Thread 0xffffc23ce1b0 (LWP 100)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff5f00640 in worker () from /lib64/libuv.so.1
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 10 (Thread 0xffffdcf7c1b0 (LWP 99)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff5f00640 in worker () from /lib64/libuv.so.1
No symbol table info available.
--Type <RET> for more, q to quit, c to continue without paging--c
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 9 (Thread 0xffffdd77d1b0 (LWP 98)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff5f00640 in worker () from /lib64/libuv.so.1
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 8 (Thread 0xffffddf7e1b0 (LWP 97)):
#0 0x0000fffff604d7c4 in poll () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000ffffee951180 in ?? () from /lib64/libpulse.so.0
No symbol table info available.
#2 0x0000ffffee943470 in pa_mainloop_poll () from /lib64/libpulse.so.0
No symbol table info available.
#3 0x0000ffffee943b3c in pa_mainloop_iterate () from /lib64/libpulse.so.0
No symbol table info available.
#4 0x0000ffffee943c08 in pa_mainloop_run () from /lib64/libpulse.so.0
No symbol table info available.
#5 0x0000ffffee9510d0 in ?? () from /lib64/libpulse.so.0
No symbol table info available.
#6 0x0000ffffee2c4804 in ?? () from /usr/lib64/pulseaudio/libpulsecommon-12.2.so
No symbol table info available.
#7 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 7 (Thread 0xfffff40301b0 (LWP 95)):
#0 0x0000fffff610c0d0 in do_futex_wait.constprop () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff610c200 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
No symbol table info available.
#2 0x0000fffff5f11144 in uv_sem_wait () from /lib64/libuv.so.1
No symbol table info available.
#3 0x0000fffff69266f8 in ?? () from /lib64/libnode.so.64
No symbol table info available.
#4 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 6 (Thread 0xffffef7fe1b0 (LWP 94)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff68c915c in ?? () from /lib64/libnode.so.64
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 5 (Thread 0xffffeffff1b0 (LWP 93)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff68c915c in ?? () from /lib64/libnode.so.64
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 4 (Thread 0xfffff48311b0 (LWP 92)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff68c915c in ?? () from /lib64/libnode.so.64
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 3 (Thread 0xfffff50321b0 (LWP 91)):
#0 0x0000fffff61097f0 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x0000fffff5f110b4 in uv_cond_wait () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff68c915c in ?? () from /lib64/libnode.so.64
No symbol table info available.
#3 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 2 (Thread 0xfffff58331b0 (LWP 90)):
#0 0x0000fffff6056ac0 in epoll_pwait () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000fffff5f1405c in uv__io_poll () from /lib64/libuv.so.1
No symbol table info available.
#2 0x0000fffff5f04b6c in uv_run () from /lib64/libuv.so.1
No symbol table info available.
#3 0x0000fffff68cc6f4 in node::BackgroundTaskRunner::DelayedTaskScheduler::Start()::{lambda(void*)#1}::_FUN(void*) () from /lib64/libnode.so.64
No symbol table info available.
#4 0x0000fffff610379c in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5 0x0000fffff605699c in thread_start () from /lib64/libc.so.6
No symbol table info available.
Thread 1 (Thread 0xfffff5847cb0 (LWP 87)):
#0 0x0000fffff5fb9bd0 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x0000fffff5fa79a8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x0000ffffee2b1ff4 in ?? () from /usr/lib64/pulseaudio/libpulsecommon-12.2.so
No symbol table info available.
#3 0x0000ffffee2b4528 in ?? () from /usr/lib64/pulseaudio/libpulsecommon-12.2.so
No symbol table info available.
#4 0x0000ffffee2b48c0 in ?? () from /usr/lib64/pulseaudio/libpulsecommon-12.2.so
No symbol table info available.
#5 0x0000ffffee2b5124 in ?? () from /usr/lib64/pulseaudio/libpulsecommon-12.2.so
No symbol table info available.
#6 0x0000fffff5f13fe0 in uv__io_poll () from /lib64/libuv.so.1
No symbol table info available.
#7 0x0000fffff5f04b6c in uv_run () from /lib64/libuv.so.1
No symbol table info available.
#8 0x0000fffff6846ae0 in node::Start(v8::Isolate*, node::IsolateData*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&) () from /lib64/libnode.so.64
No symbol table info available.
#9 0x0000fffff6845174 in node::Start(int, char**) () from /lib64/libnode.so.64
No symbol table info available.
#10 0x0000aaaaaaaaac70 in main (argc=2, argv=0xfffffffff5a8) at ../src/node_main.cc:124
envp = <optimized out>
auxv = <optimized out>
(gdb) set logging off
Done logging to /var/lib/almond-server/pulsecore_backtrace.log.
(gdb) quit
A debugging session is active.
Inferior 1 [process 87] will be killed.
Quit anyway? (y or n) y
Traceback (most recent call last):
File "/usr/local/bin/precise-engine", line 11, in <module>
load_entry_point('mycroft-precise==0.3.0', 'console_scripts', 'precise-engine')()
File "/usr/local/lib/python3.7/site-packages/precise/scripts/engine.py", line 58, in main
stdout.buffer.flush()
BrokenPipeError: [Errno 32] Broken pipe
Please let me know which additional debuginfo you would like me to install from the above list.
Thanks for the backtrace. If you can, it would be nice to retake the backtrace with debug symbols. Ideally, you can install all the debuginfo packages that gdb suggests. If not, at least libcanberra and pulseaudio-libs would be nice. If you use dnf debuginfo-install inside the container, that will also bring in the deps.
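As a sketch, installing just those two packages could look like this. It assumes a root shell inside the Fedora-based container and that the debuginfo-install plugin (shipped in dnf-plugins-core) is available; the function name is only for illustration.

```shell
# Install debug symbols for the two libraries named above.
# install_debuginfo is an illustrative wrapper, not an existing tool.
install_debuginfo() {
    dnf debuginfo-install -y libcanberra pulseaudio-libs
}
```

After this, rerunning gdb and repeating thread apply all bt full should resolve more of the ?? frames in libpulse and libpulsecommon.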
I couldn't get almond to start cleanly with all the debug symbols installed, so I reverted to just libcanberra and pulseaudio-libs. I will report back when it fails.
Here you go:
Thanks! The crash reason is unfortunately still not obvious from the backtrace, but now I have enough to investigate. I'll let you know if I find a fix.
Thanks for your time on this issue. I have enabled core dumps in the meantime, in case any further info can be gleaned from them.
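For anyone following along, a minimal sketch of this kind of setup is shown below. The function names are illustrative, and where the core file ends up is an assumption that depends on the host's /proc/sys/kernel/core_pattern setting, so the path has to be supplied by hand.

```shell
# Allow core files of any size for processes started from this shell, then start Almond.
run_with_core_dumps() {
    ulimit -c unlimited
    node ./main.js
}

# Print a full backtrace of all threads from a core file.
# $1 is the path to the core file; its location is not known in advance.
backtrace_from_core() {
    gdb -batch -ex 'thread apply all bt full' node "$1"
}
```

With a core file in hand, the backtrace can be taken after the fact, without keeping the process under gdb for hours.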
I have not seen this issue since upgrading to the 1.8.0 beta, and I am now running the released version. So I think we can close this one.
Almond Server v1.7.2 - locally hosted
Home Assistant 0.103.5
Ubuntu 19.10 - headless server
Almond fails at random times after startup and never stays up for longer than about 7 hours. You then have to restart the server manually, which is clearly not practical.
The failure is always the same, as shown in the logs below, and points to pulseaudio, but my research has not turned up a workable solution.
I start pulseaudio with the following command:
almond logs:
system journal logs: