ucsb-eri / surfcam

campus point surfcam
GNU General Public License v3.0

any chance to make this reliable? #2

Closed tve closed 1 year ago

tve commented 1 year ago

It seems to me that most of the time I look at the YouTube page (https://www.youtube.com/watch?v=dmxSFIMXY30) the stream is not working. Is there any chance of increasing the reliability? Is ffmpeg quitting? Or freezing? Or what happens?

slapplebags commented 1 year ago

here's the dump from the latest failure:

Stream #0:0: Audio: pcm_u8, 44100 Hz, stereo, u8, 705 kb/s
Input #1, flv, from 'rtmp://128.111.28.194/bcs/channel0_main.bcs?channel=0&stream=0&user=&password=:
  Metadata:
    |RtmpSampleAccess: true
    displayWidth : 2560
    displayHeight : 1440
  Duration: 00:00:00.00, start: 1094265.234000, bitrate: N/A
  Stream #1:0: Video: h264 (High), yuv420p(progressive), 2560x1440, 24 fps, 25 tbr, 1k tbn
  Stream #1:1: Audio: aac (LC), 16000 Hz, mono, fltp
Stream mapping:
  Stream #1:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #0:1 (pcm_u8 (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, flv, to 'rtmp://a.rtmp.youtube.com/live2/':
  Metadata:
    encoder : Lavf58.76.100
  Stream #0:0: Video: h264 (Main) ([7][0][0][0] / 0x0007), yuv420p(progressive), 2560x1440, q=2-31, 9500 kb/s, 25 fps, 1k tbn
    Metadata:
      encoder : Lavc58.134.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 9500000/0/9500000 buffer size: 9500000 vbv_delay: N/A
  Stream #0:1: Audio: mp3 ([2][0][0][0] / 0x0002), 44100 Hz, stereo, s16p
    Metadata:
      encoder : Lavc58.134.100 libmp3lame
frame= 1 fps=0.0 q=0.0 size= 0kB time=00:00:00.00 bitrate=N/A speed=
frame= 54 fps=0.0 q=43.0 size= 1707kB time=00:00:01.44 bitrate=9706.8kbits/
frame= 83 fps= 55 q=40.0 size= 2899kB time=00:00:02.60 bitrate=9131.0kbits/
frame= 96 fps= 47 q=42.0 size= 3552kB time=00:00:03.12 bitrate=9322.3kbits/
frame= 109 fps= 43 q=45.0 size= 4122kB time=00:00:03.64 bitrate=9274.4kbits/
frame= 122 fps= 40 q=42.0 size= 4632kB time=00:00:04.16 bitrate=9119.1kbits/
frame= 135 fps= 38 q=39.0 size= 5166kB time=00:00:04.68 bitrate=9040.5kbits/
frame= 147 fps= 36 q=40.0 size= 5745kB time=00:00:05.16 bitrate=9119.0kbits/
frame= 161 fps= 35 q=45.0 size= 6412kB time=00:00:05.72 bitrate=9180.4kbits/
frame= 174 fps= 34 q=45.0 size= 6973kB time=00:00:06.24 bitrate=9149.0kbits/
frame= 187 fps= 33 q=39.0 size= 7721kB time=00:00:06.76 bitrate=9347.3kbits/
frame= 200 fps= 32 q=42.0 size= 8241kB time=00:00:07.28 bitrate=9261.7kbits/
frame= 213 fps= 32 q=45.0 size= 8795kB time=00:00:07.81 bitrate=9222.7kbits/
frame= 226 fps= 31 q=42.0 size= 9301kB time=00:00:08.32 bitrate=9156.4kbits/
frame= 238 fps= 31 q=45.0 size= 9857kB time=00:00:08.80 bitrate=9171.9kbits/
frame= 251 fps= 30 q=39.0 size= 10604kB time=00:00:09.32 bitrate=9313.3kbits/
frame= 264 fps= 30 q=40.0 size= 11180kB time=00:00:09.84 bitrate=9298.7kbits/
frame= 277 fps= 30 q=45.0 size= 11689kB time=00:00:10.37 bitrate=9232.2kbits/
frame= 290 fps= 30 q=45.0 size= 12249kB time=00:00:10.88 bitrate=9222.0kbits/
frame= 303 fps= 29 q=39.0 size= 12784kB time=00:00:11.40 bitrate=9185.6kbits/
frame= 316 fps= 29 q=39.0 size= 13511kB time=00:00:11.92 bitrate=9284.8kbits/
frame= 328 fps= 29 q=39.0 size= 14063kB time=00:00:12.40 bitrate=9283.8kbits/
frame= 342 fps= 29 q=42.0 size= 14597kB time=00:00:12.96 bitrate=9225.9kbits/
frame= 355 fps= 29 q=39.0 size= 15181kB time=00:00:13.48 bitrate=9225.0kbits/
frame= 368 fps= 28 q=42.0 size= 15902kB time=00:00:14.00 bitrate=9303.2kbits/
frame= 381 fps= 28 q=45.0 size= 16409kB time=00:00:14.52 bitrate=9257.0kbits/
frame= 394 fps= 28 q=45.0 size= 16965kB time=00:00:15.04 bitrate=9235.5kbits/
frame= 407 fps= 28 q=39.0 size= 17516kB time=00:00:15.56 bitrate=9221.1kbits/
frame= 420 fps= 28 q=42.0 size= 18216kB time=00:00:16.08 bitrate=9279.4kbits/
frame= 433 fps= 28 q=45.0 size= 18782kB time=00:00:16.60 bitrate=9268.2kbits/
frame= 446 fps= 28 q=45.0 size= 19335kB time=00:00:17.12 bitrate=9251.3kbits/
frame= 459 fps= 28 q=39.0 size= 19885kB time=00:00:17.64 bitrate=9234.1kbits/
frame= 472 fps= 28 q=42.0 size= 20613kB time=00:00:18.16 bitrate=9297.9kbits/
frame= 485 fps= 28 q=45.0 size= 21126kB time=00:00:18.68 bitrate=9264.0kbits/
frame= 497 fps= 27 q=45.0 size= 21704kB time=00:00:19.16 bitrate=9279.4kbits/
frame= 512 fps= 27 q=39.0 size= 22454kB time=00:00:19.76 bitrate=9308.4kbits/
frame= 525 fps= 27 q=45.0 size= 23008kB time=00:00:20.28 bitrate=9293.5kbits/
frame= 538 fps= 27 q=42.0 size= 23535kB time=00:00:20.80 bitrate=9268.6kbits/
frame= 551 fps= 27 q=39.0 size= 24089kB time=00:00:21.32 bitrate=9255.7kbits/
frame= 564 fps= 27 q=42.0 size= 24783kB time=00:00:21.84 bitrate=9295
frame= 590 fps= 27 qBus error (core dumped)23.0 size=14217332kB time=03:27:45.92 bitrate=9342.9kbits/s dup=0 drop=77 speed= 1x
root@surfcam-backup:~#

The system is a VM running Ubuntu 22.04 with 8GB of RAM, 4 Xeon cores, and PCI passthrough of a Quadro P400 GPU.

tve commented 1 year ago

Interesting, so ffmpeg dumps core... Are you using an up-to-date version? (It normally prints a whole header with version info when it starts.)

In the systemd service file you have Restart=on-failure, which should kick in on a core dump and restart ffmpeg. Does it not? The trace you posted has ffmpeg run for only one minute, that's rather uncool...

BTW, does the following mean that you're not using the GPU for the decoding of the incoming video?

Stream #1:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))

slapplebags commented 1 year ago

ffmpeg -version
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100

The Restart=on-failure setting doesn't seem to kick in.

my understanding is that h264_nvenc specifies using the nvidia hardware decoder and nvidia-smi shows that ffmpeg is using the GPU for something:

nvidia-smi
Mon Apr 3 10:18:57 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:01:01.0 Off |                  N/A |
| 34%   42C    P0    N/A /  N/A |    218MiB /  2048MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   1745169      C   ffmpeg                            214MiB |
+-----------------------------------------------------------------------------+

tve commented 1 year ago

h264_nvenc is the encoder, but GPU decoding may not matter much. ffmpeg --version on my (Arch Linux) system says: ffmpeg version n5.1.2 Copyright (c) 2000-2022 the FFmpeg developers. Hmm...

You could try changing on-failure to always. Also systemctl status surfcam after a failure to restart might give you some hints about why it's not restarting.
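
For anyone following along, here is a minimal sketch of how the restart policy and restart count could be read back from systemd instead of eyeballing the status output. It assumes the unit is named surfcam.service (as in the thread) and uses `systemctl show`, which prints KEY=VALUE lines; the function names are made up for illustration and nothing here is from the repository.

```python
import subprocess
from typing import Dict


def parse_show_output(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip anything that is not a KEY=VALUE line
            props[key.strip()] = value.strip()
    return props


def restart_settings(unit: str = "surfcam.service") -> Dict[str, str]:
    """Ask systemd for the restart policy and restart counter of a unit.

    The unit name is an assumption taken from the thread.
    """
    result = subprocess.run(
        ["systemctl", "show", unit,
         "--property=Restart,RestartUSec,NRestarts,ActiveState,SubState"],
        capture_output=True, text=True, check=True,
    )
    return parse_show_output(result.stdout)
```

Calling `restart_settings()` after an outage should show whether `Restart` really is `always` and whether `NRestarts` went up, which would tell apart "systemd never restarted it" from "it restarted but YouTube didn't pick the stream back up".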

slapplebags commented 1 year ago

updated the service restart option to always. I'll dump the output of systemctl status surfcam after the next failure. Mean time to failure is about two weeks.

slapplebags commented 1 year ago

well that didn't take long, here is the output of systemctl status surfcam:

● surfcam.service
     Loaded: loaded (/etc/systemd/system/surfcam.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-04-04 10:20:02 PDT; 4min 39s ago
   Main PID: 1771774 (surfcam.sh)
      Tasks: 24 (limit: 9396)
     Memory: 204.7M
        CPU: 3min 59.359s
     CGroup: /system.slice/surfcam.service
             ├─1771774 /bin/bash /surf/surfcam.sh
             └─1771775 ffmpeg -hwaccel cuda -hwaccel_output_format cuda -f lavfi -i anullsrc -thread_queue_size 4096 -err_detect ignore_err -i "rtmp:///bcs/channel0_main.bcs?channel=0&stream=0&user=&password=" -vf "drawtext=fontfile=/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf:textfile=/surf/data.txt:reload=1:fontcolor=white:fontsize=44:box=1:boxcolor=black@0.5:boxborderw=5:x=(w-text_w)/2:y=1300" -c:v h264_nvenc -pix_fmt yuv420p -b:v 9500k -maxrate 9500k -bufsize 9500k -f flv -g 4 rtmp://a.rtmp.youtube.com/live2/

Apr 04 10:20:02 surfcam-backup systemd[1]: Started surfcam.service.

tve commented 1 year ago

It says Active: active (running) since Tue 2023-04-04 10:20:02 PDT; 4min 39s ago, so it restarted (successfully?) a few minutes ago. The YouTube channel is not streaming as far as I can tell. How do you get it to work again? I wonder whether it starts working if you restart manually now (sudo systemctl restart surfcam). That may indicate that the restart timeout of 1 second you have is too short (YouTube may still think the old connection is working and refuse a second one?). Your setup "ought to work" but it looks like there are some rough edges to discover and fiddle with... Thanks for keeping at it!

slapplebags commented 1 year ago

I have a cron job set to restart the script every so often, as that seemed to help with uptime. I ran this not too long after that script fired, hence the short uptime. The stream should be back up now. When this happens I have to stop the surfcam service, log into YouTube, go to the start-a-stream page, let the old stream time out, and then restart the script. Your timeout theory may be worth pursuing.

I've also turned off the data for a bit to see if that has an impact on things.

tve commented 1 year ago

Without the data you wouldn't need to decode and re-encode the stream... But the stream will break sooner or later regardless so it's prob worth it to figure out how to make the restart reliable. I would try a 5 or 10 minute restart timeout. If that does the trick you can tighten it down.
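
A rough sketch of the "start long, then tighten" idea, written as a small supervisor loop rather than a systemd setting. The 1 second base and the 10 minute cap come from the numbers mentioned in this thread; the doubling schedule, the `healthy_after` threshold, and the function names are assumptions for illustration only, not what the repo does.

```python
import subprocess
import time
from typing import List


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 600.0) -> float:
    """Seconds to wait before restarting; doubles per failed attempt, up to `cap`."""
    return min(cap, base * (2 ** attempt))


def supervise(command: List[str], healthy_after: float = 300.0) -> None:
    """Run `command` forever, waiting longer after each quick failure."""
    attempt = 0
    while True:
        started = time.monotonic()
        subprocess.run(command)  # blocks until the command exits or crashes
        ran_for = time.monotonic() - started
        # A long run counts as healthy, so the next restart is fast again.
        attempt = 0 if ran_for >= healthy_after else attempt + 1
        time.sleep(backoff_delay(attempt))
```

With these defaults a run that dies right away is retried after 2 s, then 4 s, 8 s and so on, topping out at 10 minutes, while a run that lasted at least 5 minutes is restarted after 1 s. Whether YouTube accepts the reconnect at any of these delays is exactly the open question here.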

tve commented 1 year ago

Hmm, I'm reading that after some time of inactivity youtube closes the stream and then one has to manually create a new one. So starting with a large restart-timeout may not work either...

slapplebags commented 1 year ago

i enabled some better logging so hopefully we can catch something there. There is a youtube API that may be helpful for auto restarting the stream but i've not gotten too far into that yet.

tve commented 1 year ago

I was just looking at the API too. Would be nice if there are some simple tools for it....

slapplebags commented 1 year ago

was tailing the log file and didn't see a single issue with that last outage:

cat /surf/log-2023-04-04.txt
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
Input #0, lavfi, from 'anullsrc':
  Duration: N/A, start: 0.000000, bitrate: 705 kb/s
  Stream #0:0: Audio: pcm_u8, 44100 Hz, stereo, u8, 705 kb/s
Input #1, flv, from 'rtmp://128.111.28.194/bcs/channel0_main.bcs?channel=0&stream=0&user=&password=':
  Metadata:
    |RtmpSampleAccess: true
    displayWidth : 2560
    displayHeight : 1440
  Duration: 00:00:00.00, start: 1244556.521000, bitrate: N/A
  Stream #1:0: Video: h264 (High), yuv420p(progressive), 2560x1440, 24 fps, 25 tbr, 1k tbn
  Stream #1:1: Audio: aac (LC), 16000 Hz, mono, fltp
Stream mapping:
  Stream #1:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
  Stream #0:0 -> #0:1 (pcm_u8 (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Output #0, flv, to 'rtmp://a.rtmp.youtube.com/live2/':
  Metadata:
    encoder : Lavf58.76.100
  Stream #0:0: Video: h264 (Main) ([7][0][0][0] / 0x0007), yuv420p(progressive), 2560x1440, q=2-31, 9500 kb/s, 25 fps, 1k tbn
    Metadata:
      encoder : Lavc58.134.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 9500000/0/9500000 buffer size: 9500000 vbv_delay: N/A
  Stream #0:1: Audio: mp3 ([2][0][0][0] / 0x0002), 44100 Hz, stereo, s16p
    Metadata:
      encoder : Lavc58.134.100 libmp3lame

slapplebags commented 1 year ago

latest dump:

av_interleaved_write_frame(): Immediate exit requested
    Last message repeated 1 times
[flv @ 0x55f1f5900480] Failed to update header with correct duration.
[flv @ 0x55f1f5900480] Failed to update header with correct filesize.
Error writing trailer of rtmp://a.rtmp.youtube.com/live2/key: Immediate exit requested
frame=113198 fps= 25 q=43.0 Lsize= 4980551kB time=01:15:27.20 bitrate=9012.3kbits/s dup=0 drop=28 speed= 1x
video:4904896kB audio:70738kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.098818%
Exiting normally, received signal 15.

slapplebags commented 1 year ago

Going to try simplifying things until I find something more stable. The currently running command is:

ffmpeg -hwaccel cuda -hwaccel_output_format cuda -f lavfi -i anullsrc -thread_queue_size 4096 -err_detect ignore_err -i "rtmp://ip/bcs/channel0_main.bcs?channel=0&stream=0&user=user&password=password" -c:v h264_nvenc -pix_fmt yuv420p -f flv -g 4 rtmp://a.rtmp.youtube.com/live2/key 2> /surf/log-$(date +%F).txt

slapplebags commented 1 year ago

couldn't handle the compression, so added back the -b:v 9500k

tve commented 1 year ago

Error writing trailer of rtmp://a.rtmp.youtube.com/live2/key: Immediate exit requested

That sounds mostly like a network error or something closed/aborted at the youtube end? I have the feeling you're fighting two issues: making ffmpeg more reliable and restarting a youtube live stream. If you manage to get the second to work well then ffmpeg crashes are not so important (as long as they're not too frequent). If you manage to fix the first you will still need to deal with the second issue 'cause even if ffmpeg doesn't crash I'm sure there will be other hiccups that interrupt the live stream.
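
To keep the two issues apart, the tail of each ffmpeg log could be sorted into rough buckets. Below is a minimal sketch that only matches strings that actually appear in the dumps posted in this thread; the bucket names and the function are made up for illustration. One thing worth noting: signal 15 is SIGTERM, which is what `kill` and `systemctl stop`/`restart` send by default, so the "received signal 15" dump may simply be the cron restart rather than YouTube hanging up. That is a guess, not something confirmed here.

```python
def classify_exit(log_tail: str) -> str:
    """Guess why ffmpeg stopped from the last lines of its log.

    Only looks for messages seen in the dumps posted in this thread.
    """
    if "core dumped" in log_tail or "Bus error" in log_tail:
        return "crash"        # ffmpeg itself died, e.g. the Bus error dump
    if "received signal 15" in log_tail:
        return "terminated"   # SIGTERM from outside: systemctl, kill, cron job
    if "Immediate exit requested" in log_tail:
        return "interrupted"  # exit was requested but no signal line was logged
    return "unknown"
```

Counting these per day would show how many outages are real ffmpeg crashes versus restarts that were triggered from outside.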

slapplebags commented 1 year ago

outage this morning was a campus wide network issue

slapplebags commented 1 year ago

another campus network issue took it down again today

slapplebags commented 1 year ago

same today, campus network took down the stream. i do now have a script i'm testing for active monitoring so at least i'll know when it's down.

slapplebags commented 1 year ago

healthcheck.py is done and tested. It's called by cron every 5 minutes, and if the stream is dead it will restart it automatically. This, coupled with the improved uptime, makes me comfortable with calling this ticket closed.
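
For readers who land here later: the thread doesn't show what healthcheck.py does, so the following is NOT the repository's script, just a minimal sketch of one way a cron-driven check could work. It assumes ffmpeg's stderr goes to a log file (as in the command posted above) and treats the stream as dead when the `frame=` counter hasn't advanced since the previous run. The state file, the function names, and the `surfcam` unit default are assumptions. It only sees the ffmpeg side and can't tell whether YouTube is actually showing the stream.

```python
import re
import subprocess
from pathlib import Path
from typing import Optional

FRAME_RE = re.compile(r"frame=\s*(\d+)")


def last_frame(log_text: str) -> Optional[int]:
    """Return the most recent frame counter in ffmpeg's progress output, if any."""
    matches = FRAME_RE.findall(log_text)
    return int(matches[-1]) if matches else None


def is_stalled(previous: Optional[int], current: Optional[int]) -> bool:
    """Stalled when there is no counter at all or it hasn't advanced since last check."""
    if current is None:
        return True
    return previous is not None and current <= previous


def check(log_path: Path, state_path: Path, unit: str = "surfcam") -> bool:
    """Compare with the counter saved by the previous run; restart the unit on a stall."""
    current = last_frame(log_path.read_text(errors="replace"))
    previous = int(state_path.read_text()) if state_path.exists() else None
    stalled = is_stalled(previous, current)
    if stalled:
        subprocess.run(["systemctl", "restart", unit], check=False)
        # Forget the old counter: the restarted ffmpeg counts from zero again.
        state_path.unlink(missing_ok=True)
    else:
        state_path.write_text(str(current))
    return stalled
```

Run from cron every 5 minutes, `check()` would restart the service at most once per interval. Reading the whole log each time is wasteful for multi-day logs; reading only the last few kilobytes would be the obvious refinement.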