obsproject / obs-studio

OBS Studio - Free and open source software for live streaming and screen recording
https://obsproject.com
GNU General Public License v2.0

Linux: JACK audio source not routed to program out #3413

Closed nettings closed 4 years ago

nettings commented 4 years ago

Platform

Operating system and version: openSUSE Tumbleweed 20200819
OBS Studio version: git master ace0faebd804a1d2c70e9cc6509df3bc61f79591
jack2 git of Aug 14

Expected Behavior

JACK audio sources should be audible on the recording output.

Current Behavior

They are not.

Steps to Reproduce

  1. Starting with a clean new session (.config/obs-studio deleted), I add a JACK audio source to a scene and switch it to Program Out.
  2. I connect a signal on the JACK side. The OBS audio mixer shows the signal is there.

No audio on the output.

Additional information

If I activate monitoring of the JACK input source and bring up the "Desktop Audio" fader, the signal is heard in the mix. But this confuses monitor settings and should not be necessary. It might however explain why previous problem reports re JACK sources have been so inconclusive, with users and testers rarely being able to reproduce the problem, since this monitor signal path might be activated without the user noticing it (oh the smartness of pulseaudio...)

This issue might be related to forum post https://obsproject.com/forum/threads/jack-obs-no-sound.123527/#post-474222 and its corresponding issue ticket #3021. The reporting user's setup is more complex, but is consistent with the behaviour I'm seeing.

nettings commented 4 years ago

Using the same OBS session as above, I quickly cross-checked with an ALSA input (of the same device). It behaves as expected, i.e. as long as the source fader is up, the audio is heard in the recording output.

nettings commented 4 years ago
02:04:22 PM.031: CPU Name: AMD Ryzen 9 3950X 16-Core Processor
02:04:22 PM.032: CPU Speed: 3499.771MHz
02:04:22 PM.032: Physical Cores: 16, Logical Cores: 32
02:04:22 PM.032: Physical Memory: 64230MB Total, 56736MB Free
02:04:22 PM.032: Kernel Version: Linux 5.8.0-1-default
02:04:22 PM.032: Distribution: "openSUSE Tumbleweed" "20200819"
02:04:22 PM.033: Window System: X11.0, Vendor: The X.Org Foundation, Version: 1.20.8
02:04:22 PM.034: Portable mode: false
02:04:22 PM.115: OBS 26.0.0-rc1-36-gace0faeb-modified (linux)
02:04:22 PM.115: ---------------------------------
02:04:22 PM.335: ---------------------------------
02:04:22 PM.335: audio settings reset:
02:04:22 PM.335:    samples per sec: 48000
02:04:22 PM.335:    speakers:        2
02:04:22 PM.338: ---------------------------------
02:04:22 PM.338: Initializing OpenGL...
02:04:22 PM.534: Loading up OpenGL on adapter X.Org Radeon RX 560 Series (POLARIS11, DRM 3.38.0, 5.8.0-1-default, LLVM 10.0.0)
02:04:22 PM.534: OpenGL loaded successfully, version 4.6 (Core Profile) Mesa 20.1.4, shading language 4.60
02:04:22 PM.549: ---------------------------------
02:04:22 PM.549: video settings reset:
02:04:22 PM.549:    base resolution:   1920x1080
02:04:22 PM.549:    output resolution: 1280x720
02:04:22 PM.549:    downscale filter:  Bicubic
02:04:22 PM.549:    fps:               30/1
02:04:22 PM.549:    format:            NV12
02:04:22 PM.549:    YUV mode:          sRGB/Partial
02:04:22 PM.549: NV12 texture support not available
02:04:22 PM.551: Audio monitoring device:
02:04:22 PM.551:    name: Default
02:04:22 PM.551:    id: default
02:04:22 PM.551: ---------------------------------
02:04:22 PM.552: Failed to load 'en-US' text for module: 'decklink-ouput-ui.so'
02:04:22 PM.652: Decklink API Compiled version 10.11.4
02:04:22 PM.652: Decklink API Installed version 11.6
02:04:22 PM.655: [obs-browser]: Version 2.8.6
02:04:22 PM.658: [obs-ndi] hello ! (version 4.9.1)
02:04:22 PM.658: [obs-ndi] Trying ''
02:04:22 PM.658: [obs-ndi] Trying '/usr/lib'
02:04:22 PM.658: [obs-ndi] Trying '/usr/local/lib'
02:04:22 PM.658: [obs-ndi] Found NDI library at '/usr/local/lib/libndi.so.4'
02:04:22 PM.659: [obs-ndi] NDI runtime loaded successfully
02:04:22 PM.659: [obs-ndi] NDI library initialized successfully (NDI SDK LINUX 00:05:02 Apr  1 2020 4.5.1)
02:04:22 PM.746: VLC found, VLC video source enabled
02:04:22 PM.746: ---------------------------------
02:04:22 PM.746:   Loaded Modules:
02:04:22 PM.746:     vlc-video.so
02:04:22 PM.746:     v4l2sink.so
02:04:22 PM.746:     text-freetype2.so
02:04:22 PM.746:     rtmp-services.so
02:04:22 PM.746:     obs-x264.so
02:04:22 PM.746:     obs-vst.so
02:04:22 PM.746:     obs-transitions.so
02:04:22 PM.746:     obs-outputs.so
02:04:22 PM.746:     obs-ndi.so
02:04:22 PM.747:     obs-libfdk.so
02:04:22 PM.747:     obs-filters.so
02:04:22 PM.747:     obs-ffmpeg.so
02:04:22 PM.747:     obs-browser.so
02:04:22 PM.747:     linux-v4l2.so
02:04:22 PM.747:     linux-pulseaudio.so
02:04:22 PM.747:     linux-jack.so
02:04:22 PM.747:     linux-decklink.so
02:04:22 PM.747:     linux-capture.so
02:04:22 PM.747:     linux-alsa.so
02:04:22 PM.747:     image-source.so
02:04:22 PM.747:     frontend-tools.so
02:04:22 PM.747:     decklink-ouput-ui.so
02:04:22 PM.747: ---------------------------------
02:04:22 PM.747: os_dlopen(../obs-plugins/obs-browser->../obs-plugins/obs-browser.so): ../obs-plugins/obs-browser.so: cannot open shared object file: No such file or directory
02:04:22 PM.747: 
02:04:22 PM.747: ==== Startup complete ===============================================
02:04:22 PM.756: No scene file found, creating default scene
02:04:22 PM.756: All scene data cleared
02:04:22 PM.756: ------------------------------------------------
02:04:22 PM.759: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:04:22 PM.760: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:04:22 PM.760: pulse-input: Started recording from 'alsa_output.pci-0000_34_00.4.analog-stereo.monitor'
02:04:22 PM.760: Switched to scene 'Scene'
02:04:22 PM.760: Failed to glob scene collections
02:04:23 PM.662: adding 42 milliseconds of audio buffering, total audio buffering is now 42 milliseconds (source: Desktop Audio)
02:04:23 PM.662: 
02:04:23 PM.686: [ffmpeg] [AVIOContext @ 0x7f5c1c009900] Statistics: 386 bytes read, 0 seeks
02:06:45 PM.689: Settings changed (outputs)
02:06:45 PM.689: ------------------------------------------------
02:06:54 PM.667: Using muxer settings: 
02:06:54 PM.667:    content=video/ts
02:06:54 PM.711: Invalid muxer settings: 
02:06:54 PM.711:    content=video/ts
02:06:54 PM.974: ==== Recording Start ===============================================
02:07:44 PM.662: v4l2-input: Start capture from 
02:07:44 PM.662: v4l2-input: Unable to open device
02:07:44 PM.662: v4l2-input: Initialization failed
02:07:44 PM.663: User added source 'usbcam' (v4l2_input) to scene 'Scene'
02:07:44 PM.665: v4l2-input: /dev/video1 seems to not support video capture
02:07:44 PM.665: v4l2-input: Found device 'Trust Full HD Webcam: Trust Ful' at /dev/video0
02:07:44 PM.667: v4l2-input: /dev/video1 seems to not support video capture
02:07:44 PM.667: v4l2-input: Found device 'Trust Full HD Webcam: Trust Ful' at /dev/video0
02:07:44 PM.667: v4l2-input: Found input 'Camera 1' (Index 0)
02:07:44 PM.685: v4l2-input: Start capture from /dev/video0
02:07:44 PM.690: v4l2-controls: setting default for Power Line Frequency to 1
02:07:44 PM.690: v4l2-input: Input: 0
02:07:44 PM.690: v4l2-input: Selected video format not supported
02:07:44 PM.690: v4l2-input: Initialization failed
02:07:44 PM.708: v4l2-controls: setting default for Exposure, Auto to 3
02:07:44 PM.732: v4l2-input: Pixelformat: Motion-JPEG (unavailable)
02:07:44 PM.732: v4l2-input: Pixelformat: YUYV 4:2:2 (available)
02:07:44 PM.732: v4l2-input: Pixelformat: RGB3 (Emulated) (unavailable)
02:07:44 PM.732: v4l2-input: Pixelformat: BGR3 (Emulated) (available)
02:07:44 PM.732: v4l2-input: Pixelformat: YU12 (Emulated) (available)
02:07:44 PM.732: v4l2-input: Pixelformat: YV12 (Emulated) (available)
02:07:44 PM.732: v4l2-input: Stepwise and Continuous framesizes are currently hardcoded
02:07:44 PM.733: v4l2-input: Stepwise and Continuous framerates are currently hardcoded
02:07:44 PM.734: v4l2-input: Pixelformat: Motion-JPEG (unavailable)
02:07:44 PM.734: v4l2-input: Pixelformat: YUYV 4:2:2 (available)
02:07:44 PM.734: v4l2-input: Pixelformat: RGB3 (Emulated) (unavailable)
02:07:44 PM.734: v4l2-input: Pixelformat: BGR3 (Emulated) (available)
02:07:44 PM.734: v4l2-input: Pixelformat: YU12 (Emulated) (available)
02:07:44 PM.734: v4l2-input: Pixelformat: YV12 (Emulated) (available)
02:07:44 PM.734: v4l2-input: Stepwise and Continuous framesizes are currently hardcoded
02:07:44 PM.734: v4l2-input: Stepwise and Continuous framerates are currently hardcoded
02:07:44 PM.736: v4l2-input: Stepwise and Continuous framerates are currently hardcoded
02:07:44 PM.804: v4l2-input: Start capture from /dev/video0
02:07:44 PM.805: v4l2-input: Input: 0
02:07:44 PM.822: v4l2-input: Resolution: 1920x1080
02:07:44 PM.822: v4l2-input: Pixelformat: VYUY
02:07:44 PM.822: v4l2-input: Linesize: 3840 Bytes
02:07:44 PM.822: v4l2-input: Framerate: 5.00 fps
02:07:49 PM.318: v4l2-input: Stepwise and Continuous framerates are currently hardcoded
02:07:49 PM.471: v4l2-input: Stopped capture after 21 frames
02:07:49 PM.493: v4l2-input: Start capture from /dev/video0
02:07:49 PM.494: v4l2-input: Input: 0
02:07:49 PM.502: v4l2-input: Resolution: 1920x1080
02:07:49 PM.502: v4l2-input: Pixelformat: 21UY
02:07:49 PM.502: v4l2-input: Linesize: 1920 Bytes
02:07:49 PM.502: v4l2-input: Framerate: 30.00 fps
02:07:54 PM.397: v4l2-input: Stopped capture after 138 frames
02:07:54 PM.422: v4l2-input: Start capture from /dev/video0
02:07:54 PM.422: v4l2-input: Input: 0
02:07:54 PM.430: v4l2-input: Resolution: 1920x1080
02:07:54 PM.430: v4l2-input: Pixelformat: 21UY
02:07:54 PM.430: v4l2-input: Linesize: 1920 Bytes
02:07:54 PM.439: v4l2-input: Framerate: 25.00 fps
02:07:56 PM.765: v4l2-input: Stopped capture after 56 frames
02:07:56 PM.789: v4l2-input: Start capture from /dev/video0
02:07:56 PM.790: v4l2-input: Input: 0
02:07:56 PM.798: v4l2-input: Resolution: 1920x1080
02:07:56 PM.798: v4l2-input: Pixelformat: 21UY
02:07:56 PM.798: v4l2-input: Linesize: 1920 Bytes
02:07:56 PM.807: v4l2-input: Framerate: 25.00 fps
02:07:58 PM.640: v4l2-input: /dev/video1 seems to not support video capture
02:07:58 PM.640: v4l2-input: Found device 'Trust Full HD Webcam: Trust Ful' at /dev/video0
02:07:58 PM.640: v4l2-input: Found input 'Camera 1' (Index 0)
02:07:58 PM.640: v4l2-controls: setting default for Power Line Frequency to 1
02:07:58 PM.640: v4l2-controls: setting default for Exposure, Auto to 3
02:07:58 PM.640: v4l2-input: Pixelformat: Motion-JPEG (unavailable)
02:07:58 PM.640: v4l2-input: Pixelformat: YUYV 4:2:2 (available)
02:07:58 PM.640: v4l2-input: Pixelformat: RGB3 (Emulated) (unavailable)
02:07:58 PM.640: v4l2-input: Pixelformat: BGR3 (Emulated) (available)
02:07:58 PM.640: v4l2-input: Pixelformat: YU12 (Emulated) (available)
02:07:58 PM.640: v4l2-input: Pixelformat: YV12 (Emulated) (available)
02:07:58 PM.641: v4l2-input: Pixelformat: Motion-JPEG (unavailable)
02:07:58 PM.641: v4l2-input: Pixelformat: YUYV 4:2:2 (available)
02:07:58 PM.641: v4l2-input: Pixelformat: RGB3 (Emulated) (unavailable)
02:07:58 PM.641: v4l2-input: Pixelformat: BGR3 (Emulated) (available)
02:07:58 PM.641: v4l2-input: Pixelformat: YU12 (Emulated) (available)
02:07:58 PM.641: v4l2-input: Pixelformat: YV12 (Emulated) (available)
02:08:23 PM.224: Output 'adv_ffmpeg_output': stopping
02:08:23 PM.224: Output 'adv_ffmpeg_output': Total frames output: 2655
02:08:23 PM.224: Output 'adv_ffmpeg_output': Total drawn frames: 2646 (2657 attempted)
02:08:23 PM.224: Output 'adv_ffmpeg_output': Number of lagged frames due to rendering lag/stalls: 11 (0.4%)
02:08:23 PM.224: ==== Recording Stop ================================================
02:08:36 PM.552: ---------------------------------
02:08:36 PM.552: video settings reset:
02:08:36 PM.552:    base resolution:   1920x1080
02:08:36 PM.552:    output resolution: 1920x1080
02:08:36 PM.552:    downscale filter:  Bicubic
02:08:36 PM.552:    fps:               25/1
02:08:36 PM.552:    format:            NV12
02:08:36 PM.552:    YUV mode:          sRGB/Partial
02:08:36 PM.552: NV12 texture support not available
02:08:36 PM.784: Settings changed (video)
02:08:36 PM.784: ------------------------------------------------
02:08:39 PM.414: Using muxer settings: 
02:08:39 PM.414:    content=video/ts
02:08:39 PM.453: Invalid muxer settings: 
02:08:39 PM.453:    content=video/ts
02:08:39 PM.561: ==== Recording Start ===============================================
02:09:40 PM.628: Settings changed (outputs)
02:09:40 PM.628: ------------------------------------------------
02:09:43 PM.296: Output 'adv_ffmpeg_output': stopping
02:09:43 PM.296: Output 'adv_ffmpeg_output': Total frames output: 1595
02:09:43 PM.296: Output 'adv_ffmpeg_output': Total drawn frames: 1597
02:09:43 PM.296: ==== Recording Stop ================================================
02:09:44 PM.877: Using muxer settings: 
02:09:44 PM.877:    content="video/ts"
02:09:44 PM.916: Invalid muxer settings: 
02:09:44 PM.916:    content="video/ts"
02:09:45 PM.005: ==== Recording Start ===============================================
02:10:44 PM.764: Output 'adv_ffmpeg_output': stopping
02:10:44 PM.765: Output 'adv_ffmpeg_output': Total frames output: 1495
02:10:44 PM.765: Output 'adv_ffmpeg_output': Total drawn frames: 1498
02:10:44 PM.765: ==== Recording Stop ================================================
02:10:56 PM.040: Settings changed (outputs)
02:10:56 PM.040: ------------------------------------------------
02:10:58 PM.266: Using muxer settings: 
02:10:58 PM.266:    content_type=video/m2ts
02:10:58 PM.266:    ice_genre=Live event
02:10:58 PM.266:    ice_name=vdt Live!
02:10:58 PM.266:    ice_description=A test stream for the upcoming vdt Live! event starting Oct. 7
02:10:58 PM.395: ==== Recording Start ===============================================
02:11:30 PM.497: User added source 'JACK Input Client' (jack_output_capture) to scene 'Scene'
02:11:30 PM.521: Max audio buffering reached!
02:11:30 PM.521: adding 917 milliseconds of audio buffering, total audio buffering is now 960 milliseconds (source: JACK Input Client)
02:11:30 PM.521: 
02:11:42 PM.505: Switched to Preview/Program mode
02:11:42 PM.505: ------------------------------------------------
02:14:30 PM.652: pulse-am: Server name: 'pulseaudio 13.0-rebootstrapped'
02:14:30 PM.652: pulse-am: Audio format: s16le, 44100 Hz, 2 channels
02:14:30 PM.652: pulse-am: Started Monitoring in 'alsa_output.pci-0000_34_00.4.analog-stereo.monitor'
02:14:30 PM.652: User changed audio monitoring for source 'JACK Input Client' to: monitor only
02:19:36 PM.566: pulse-am: Stopped Monitoring in 'alsa_output.pci-0000_34_00.4.analog-stereo.monitor'
02:19:36 PM.566: pulse-am: Got 114713 packets with 13490233 frames
02:19:36 PM.567: User changed audio monitoring for source 'JACK Input Client' to: none
02:21:23 PM.821: User Removed source 'JACK Input Client' (jack_output_capture) from scene 'Scene'
02:22:03 PM.799: alsa-input: Failed to open 'default': Invalid argument
02:22:03 PM.802: User added source 'Audio Capture Device (ALSA)' (alsa_input_capture) to scene 'Scene'
02:22:04 PM.799: alsa-input: Failed to open 'default': Invalid argument
02:22:06 PM.399: alsa-input: Failed to open 'default': Invalid argument
02:22:07 PM.399: alsa-input: Failed to open 'default': Invalid argument
02:22:08 PM.672: alsa-input: PCM 'front:CARD=Webcam,DEV=0' rate set to 48000
02:22:08 PM.672: alsa-input: PCM 'front:CARD=Webcam,DEV=0' channels set to 2
02:22:44 PM.147: User Removed source 'Audio Capture Device (ALSA)' (alsa_input_capture) from scene 'Scene'
02:22:55 PM.983: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:22:55 PM.983: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:22:55 PM.983: pulse-input: Started recording from 'auto_null.monitor'
02:22:55 PM.986: User added source 'Audio Input Capture (PulseAudio)' (pulse_input_capture) to scene 'Scene'
02:22:55 PM.989: pulse-input: Stopped recording from 'default'
02:22:55 PM.989: pulse-input: Got 0 packets with 0 frames
02:22:55 PM.989: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:22:55 PM.989: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:22:55 PM.989: pulse-input: Started recording from 'auto_null.monitor'
02:23:00 PM.014: pulse-input: Stopped recording from 'default'
02:23:00 PM.014: pulse-input: Got 162 packets with 178362 frames
02:23:00 PM.015: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:23:00 PM.015: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:23:00 PM.015: pulse-input: Started recording from 'auto_null.monitor'
02:23:28 PM.381: pulse-input: Stopped recording from 'default'
02:23:28 PM.381: pulse-input: Got 1136 packets with 1250736 frames
02:23:28 PM.381: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:23:28 PM.381: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:23:28 PM.381: pulse-input: Started recording from 'auto_null.monitor'
02:23:31 PM.412: pulse-input: Stopped recording from 'default'
02:23:31 PM.413: pulse-input: Got 121 packets with 133221 frames
02:23:31 PM.413: pulse-input: Server name: 'pulseaudio 13.0-rebootstrapped'
02:23:31 PM.413: pulse-input: Audio format: s16le, 44100 Hz, 2 channels
02:23:31 PM.413: pulse-input: Started recording from 'auto_null.monitor'
02:23:34 PM.295: User Removed source 'Audio Input Capture (PulseAudio)' (pulse_input_capture) from scene 'Scene'
02:24:02 PM.452: User added source 'Audio Input Capture (PulseAudio)' (pulse_input_capture) to scene 'Scene'
02:24:14 PM.230: User Removed source 'Audio Input Capture (PulseAudio)' (pulse_input_capture) from scene 'Scene'
VennStone commented 4 years ago

This has been an issue for years. I've reported it and so have others. It will be closed because...

02:11:30 PM.497: User added source 'JACK Input Client' (jack_output_capture) to scene 'Scene'
02:11:30 PM.521: Max audio buffering reached!

It's an OBS issue but whatcha gonna do? Stick with Pulseaudio-Jack bridge for interfacing with OBS. OBS is the only Jack aware application we don't use, well, with Jack.

At the very least it shouldn't fail silently.

kkartaltepe commented 4 years ago

As stated

02:11:30 PM.497: User added source 'JACK Input Client' (jack_output_capture) to scene 'Scene'
02:11:30 PM.521: Max audio buffering reached!

Your audio chain is out of sync, so OBS will drop the desynced audio once it reaches its max buffering. It's recommended you figure out why your audio chain is out of sync.
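
The behaviour described here can be pictured with a small model. This is only a sketch under stated assumptions, not OBS's implementation: the function name `add_buffering` is hypothetical, and the 960 ms cap is simply the value the log above reports ("total audio buffering is now 960 milliseconds").

```python
MAX_BUFFERING_MS = 960  # assumption: the cap read off the log above


def add_buffering(total_ms, needed_ms, max_ms=MAX_BUFFERING_MS):
    """Grow the mixer's buffering to absorb a late source.

    Returns (new_total_ms, reached_max). Once the cap is reached, audio
    that is later than the cap allows can no longer be lined up with the
    other sources and is dropped.
    """
    wanted = total_ms + needed_ms
    if wanted >= max_ms:
        return max_ms, True
    return wanted, False


# Desktop Audio at startup: 42 ms requested, well under the cap.
print(add_buffering(0, 42))     # (42, False)
# A source whose timestamps are seconds off pushes buffering to the cap.
print(add_buffering(42, 5000))  # (960, True)
```

In this model a source that is only slightly late costs a little extra latency, while a source that is far off hits the cap at once, which matches the log where the JACK source triggers "Max audio buffering reached!" within milliseconds of being added.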

VennStone commented 4 years ago

Any idea what pulseaudio-module-jack does to put the Jack stream "in sync"?

Genuinely curious since I run jackd in synchronous mode between 5 computers in the studio.

nettings commented 4 years ago

@kkartaltepe, first of all, can you point me to any documentation (or source files) where I can learn how OBS deals with audio streams running on different sample clocks?

The root of the problem (I guess) is that you cannot configure OBS' audio outputs (if you can, please educate me; for now I haven't even figured out how to use native ALSA, as only pulseaudio seems to work), and that I'm basically running two non-synchronized audio clocks: the 48000 Hz oscillator from my mainboard soundcard (which I'm using to monitor via pulseaudio), and the 48000 Hz oscillator of my RME HDSP MADI card (the JACK universe that all my audio gear is wired to). Unfortunately, there is no way to sync those. But that also means that the JACK inputs should work for nobody, unless they are very lucky (i.e. their clocks are magically the same). But the sync error cannot completely explain what I'm seeing, because while I don't get JACK inputs into the recording chain, I can turn on JACK input monitoring, and then, when I also bring up the default "Desktop Audio" fader, I magically do hear JACK audio on my recording output (at the cost of rendering monitoring unusable for its real purpose). Is this pulseaudio taking care of timing mismatches? Or just using huuuge buffers? Can you explain what's going on here?
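
To put a rough number on how fast two free-running 48000 Hz clocks diverge, here is a small arithmetic sketch. The 50 ppm offset is an assumed, illustrative figure (nothing in this thread states the actual tolerance of either card), and the function names are hypothetical.

```python
def drift_seconds(ppm_offset, elapsed_seconds):
    """Time offset accumulated between two clocks that differ by
    ppm_offset parts per million, after elapsed_seconds."""
    return elapsed_seconds * ppm_offset / 1_000_000


def seconds_until_drift(ppm_offset, limit_seconds):
    """How long it takes for the accumulated offset to reach limit_seconds."""
    return limit_seconds * 1_000_000 / abs(ppm_offset)


# Two nominal 48000 Hz cards that differ by 50 ppm (assumed figure):
print(drift_seconds(50, 3600))       # 0.18 s after one hour
print(seconds_until_drift(50, 1.0))  # 20000.0 s, roughly 5.5 hours
```

Under that assumption, clock drift alone needs hours to build up a one-second offset, so it would not by itself account for a source that fails within moments of being added.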

Ideally, to make OBS suitable for professional audio, we'd want to be able to

Given that OBS can run on CoreAudio on the Mac, I would guess that it is already callback-based, i.e. it is the audio daemon telling OBS when to read or write incoming or outgoing audio buffers rather than OBS telling some hardware to flush buffers. So it should already be compatible with JACK on a fundamental level. Correct me if I'm wrong.

I have failed to find information on how OBS deals with different source and sink timings, and I think this is crucial for users to understand. I'd be willing to write some user-level documentation on this as I go, if you can point me in the right direction. The same issue will come up with NDI btw. Is there some audio-specific developer forum that I might join to get answers to those questions (and maybe point out some general underlying sync issues that might need to be addressed)?

kkartaltepe commented 4 years ago

https://obsproject.com/docs/backend-design.html#general-audio-pipeline-overview is the documentation on the audio pipeline.

If you are interested in the implementation of the jack plugin, you can review its source in plugins/linux-jack.

nettings commented 4 years ago

Thank you very much, that is extremely helpful. At first glance, I see that the jack plugin is not realtime safe (mutex locking is not guaranteed to finish within bounded time)... a year or so ago, I fixed this problem for shairport-sync, but that whole thing was a lot simpler, I guess. Will look at it. There is also a lot of locking going on in do_audio_output and friends. libjack comes with a really nice implementation of a lock-free ringbuffer, which might be useful for core OBS as well.
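
For readers unfamiliar with the idea, the following is a minimal sketch of a single-producer/single-consumer ring buffer. It only illustrates the index discipline that makes such a buffer usable without a mutex; it is not libjack's ringbuffer API, the class name `SpscRingBuffer` is hypothetical, and Python itself offers no realtime guarantees.

```python
class SpscRingBuffer:
    """Single-producer/single-consumer ring buffer.

    The producer only ever advances `_write` and the consumer only ever
    advances `_read`, so neither side has to take a lock. One slot is
    kept empty to tell "full" apart from "empty".
    """

    def __init__(self, capacity):
        self._size = capacity + 1
        self._buf = [0.0] * self._size
        self._read = 0
        self._write = 0

    def write(self, samples):
        """Producer side (e.g. the audio callback). Never waits;
        returns how many samples were actually stored."""
        written = 0
        for s in samples:
            nxt = (self._write + 1) % self._size
            if nxt == self._read:  # full: drop the rest instead of blocking
                break
            self._buf[self._write] = s
            self._write = nxt
            written += 1
        return written

    def read(self, max_count):
        """Consumer side. Never waits; returns up to max_count samples."""
        out = []
        while len(out) < max_count and self._read != self._write:
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % self._size
        return out


rb = SpscRingBuffer(4)
print(rb.write([0.1, 0.2, 0.3, 0.4, 0.5]))  # 4: the fifth sample did not fit
print(rb.read(2))                           # [0.1, 0.2]
print(rb.write([0.6, 0.7, 0.8]))            # 2: only two slots were free
print(rb.read(10))                          # [0.3, 0.4, 0.6, 0.7]
```

The design choice that matters for a realtime thread is that `write` drops data when the buffer is full rather than waiting for the consumer.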

kkartaltepe commented 4 years ago

OBS is not intended to be run with realtime scheduling, and the jack plugin is probably the least of your worries if you were trying to make it "realtime safe". Realtime scheduling is also not needed to maintain the very coarse 1s sync required by OBS.

While I'm sure there are improvements that could be made to OBS's performance, I find it unlikely that a little bit of locking in OBS is the reason your audio is more than 1s out of sync. If I were you, I would look at why the timestamps for your samples are so far out of sync with what OBS expects.

nettings commented 4 years ago

The jack thread will run with realtime scheduling (effected by the jack server, not by OBS code), and so it is important that it never block:

     -  TS  19   0  5170 -      /usr/bin/xfce4-terminal -x obs
     -  TS  19   0  5170 -      /usr/bin/xfce4-terminal -x obs
     -  TS  19   0  5170 -      /usr/bin/xfce4-terminal -x obs
     -  TS  19   0  5170 -      /usr/bin/xfce4-terminal -x obs
     -  TS  19   0  5175 -      obs
<...>
    40  FF  80   -  5175 -      obs

The last one is the jack thread, as you can see it runs in the SCHED_FIFO scheduling class with a realtime priority set by the jack daemon (40 in my case, i.e. below Linux hardware handlers, but it might well be higher). So the thread even has the potential to starve out kernel drivers, and that spells potential deadlock.

As to your comment of my audio being "more than 1s out of sync", the issue is not with my system. PCM devices have no notion of presentation time or any other sort of timestamp. OBS probably gets the notion of presentation time from pulseaudio, but that is basically a made-up value.

JACK devices will not have that, and that means that if the JACK plugin (or the ALSA input, same thing there) works for anyone, it's probably by coincidence.
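
Since a raw PCM buffer carries no timestamp, an input plugin has to synthesize one. The sketch below shows one plausible way to do that: read a monotonic clock when the buffer arrives and subtract the buffer's duration. This is an assumption for illustration, not a description of what the linux-jack plugin does, and both function names are hypothetical.

```python
import time

NS_PER_SEC = 1_000_000_000


def frames_to_ns(frames, sample_rate):
    """Duration of `frames` samples at `sample_rate`, in nanoseconds."""
    return frames * NS_PER_SEC // sample_rate


def capture_timestamp_ns(now_ns, frames, sample_rate):
    """Estimate when the first sample of a just-delivered buffer was captured."""
    return now_ns - frames_to_ns(frames, sample_rate)


now = time.monotonic_ns()
ts = capture_timestamp_ns(now, 1024, 48000)
print(now - ts)  # 21333333 ns, about 21.3 ms for 1024 frames at 48 kHz
```

Any such value is an estimate derived from the receiving machine's clock, which is the point being made above: the timestamp is constructed by software, not supplied by the device.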

It would work if all devices were controlled by pulseaudio, but that is not an option for studio environments. The main reason is latency, but also that pulse trades ease of use and cleverness for a lack of determinism that can be quite disruptive in complex settings, so pro users tend to disable pulse altogether or bridge pulseaudio into jack, with jack retaining ultimate control of timing (or rather, the hardware dictating that).

Please don't take my comments the wrong way, I'm totally in awe of OBS and I would like to contribute in a constructive way. But I guess the problem is that the audio and video worlds literally run at different clocks with completely different precision requirements. Also, while you guys are basically saving my ass right now by making OBS cross-platform (which enables me to do a complex project next week that couldn't be done on Linux alone), it also means that you have to deal with three fundamentally different audio architectures...

Please reopen this bug and let's gather more information on the underlying issues. I will also file a bug about the jack plugin eventually, once I've fully understood the issues, and hopefully suggest a patch to fix it.

kkartaltepe commented 4 years ago

Bugs are only left open if they are confirmed to be a bug in OBS. You are free to research and update this bug if you want, but it still doesn't seem to be a bug in OBS in my experience. If you have compelling evidence or a patch that fixes it, we can of course reopen the bug.

kkartaltepe commented 4 years ago

Another great way to try and get this bug reopened would be to provide replication steps in software. If it really isn't an issue with your hardware, it should be possible to construct such a setup.

daniel-j commented 3 years ago

Ran into this exact issue yesterday (Max audio buffering reached!) for a livestreamed concert. The sound meters were flashing like they should, but no audio in stream. If I knew the JACK source was broken/unstable I wouldn’t have used it. We did a test stream before the real thing and it had sound.

We had a prestream scene without the jack source, playing a looping video. Then when I switched to the main source with cameras, no sound, but JACK source showed input in the mixer. Restarted obs, reconnected the jack plugs, still no sound. Muted jack source and added an audio device globally (Settings > Audio), which fixed it.

The reason I wanted to use JACK was because I recorded a multitrack recording in the background. JACK reported zero xruns.

kkartaltepe commented 3 years ago

again, please provide replication steps instead of saying "me too".

VennStone commented 3 years ago

again, please provide replication steps instead of saying "me too".

Alright.

  1. Launch jack: jackd -S -R -P 70 -t 1000 -d alsa -r 48000 -p 128 -n 3
  2. Open OBS and create 5 Jack channels to record & set them appropriately in the advanced mixer.
  3. Connect the outputs to the corresponding inputs with Catia.
  4. Tap that record button, fam.
  5. Get 5 tracks without audio.
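
For reference, the `-r`, `-p` and `-n` values in step 1 (sample rate, frames per period, number of periods) determine the size of the JACK buffer. A short sketch of the arithmetic, with hypothetical function names:

```python
def period_ms(frames_per_period, sample_rate):
    """Length of one JACK period in milliseconds."""
    return 1000.0 * frames_per_period / sample_rate


def buffer_latency_ms(frames_per_period, periods, sample_rate):
    """Total buffer length (periods x frames per period) in milliseconds."""
    return 1000.0 * periods * frames_per_period / sample_rate


# jackd ... -d alsa -r 48000 -p 128 -n 3
print(round(period_ms(128, 48000), 3))    # 2.667
print(buffer_latency_ms(128, 3, 48000))   # 8.0
```

So the audio callback in this setup fires roughly every 2.7 ms, with about 8 ms of buffering in total.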

I suggest trying this with something more complex than a stereo pair in / out when trying to replicate.

I made a video showing this issue in another report some time back. Out of all the Jack aware applications I run in the studio, OBS is the only one exhibiting this issue.

The hack is to use pulseaudio-module-jack and route it that way.

It would be nice if it did not fail silently since it gives every indication that it is working until you check the audio after the recording. People have lost data due to that.

kkartaltepe commented 3 years ago

I cannot start jack with the options specified. -S seems invalid though it's listed in the man page, and I removed the priority related options for my environment.

Your video shows (fails to show, but it is implied) that everything has overrun already on all channels before you even began recording, which suggests device issues, as I have been saying. You are also recording directly from the device, something I don't have access to either.

I have configured OBS to a similar setup by recording from Ardour and cannot replicate this. Again, please provide replication steps that do not involve a potentially poorly behaved device.

daniel-j commented 3 years ago

Jack settings (Cadence): Screenshot from 2020-11-15 16-26-17

Tried to replicate the issue today with the same setup and hardware, but was unable to. Here are the three logfiles (stream started at 18:54:53) from the time of the concert: https://gist.github.com/daniel-j/7255d429e805c7388336166c0965f5cb (3 log files, in the end I fixed it by adding Jack Source as a global audio device in OBS thru pulseaudio)

Log from today, no audio issues: https://gist.github.com/daniel-j/3817f2146153529021ecb1b308457506

I can't replicate the issue with @LGCW's steps.

Hardware used: Behringer X32 Compact, running in 32x32 mode (JACK uses this, first two input channels were the main stream mix, connected to OBS Jack source). Blackmagic Design ATEM Mini Pro for video input. I had an NDI source in a scene, but the iPad I had planned to use as a wireless camera was never used during the concert.

kkartaltepe commented 3 years ago

https://github.com/obsproject/obs-studio/pull/3856 consider applying this PR and seeing if their alternative timestamp computation helps with your hardware. If it does please comment on the PR.

marcan commented 3 years ago

To clear a few things up:

  1. The JACK OBS plug-in is not realtime safe currently, and needs an overhaul (nonblocking buffer added) to be made so. It locks a mutex in the audio thread. So using OBS with the JACK input you currently need to accept that this might cause xruns if you're unlucky. I would never recommend using OBS JACK input on the same machine where you are doing a master multitrack recording. Ideally use a separate machine, but if you must use the same one, indirect it through pulseaudio or something.
  2. The current version of the JACK OBS plug-in has a horrible bug that computes timestamps (completely and utterly) wrong. Depending on the time of day (seriously) this can cause the "max audio buffering" problem, and audio to completely fail after a bad enough xrun. See #3856 for a fix to that, and other JACK problems.
  3. "Max audio buffering" shouldn't actually break your audio fully, or break sources forever, it only does so right now because OBS is buggy. This has silently broken many, many streams over the years, basically afflicting every long-running DJ stream ever with RTMP mixing at some point, among other things. It doesn't just affect JACK. I got tired of this (and I have a DJ stream to run in ~12 hours), so I fixed it; see #3860 for one underlying root cause that can easily trigger this that was already fixed and merged, and #3863 for a proper fix that makes the audio code robust in these situations. I have no idea why this has been a problem for years and the devs have developed this folklore that if you get that message all bets are off and this is an unfixable problem and nothing can be done and it's probably the user's or their hardware's fault somehow. It's a fixable problem, as I fixed it. It also affected other sources, such as laggy RTMP sources and the like. With this fix, even without the JACK timestamp fix, JACK sources continue working and recover in the common case of xruns etc, as well as don't break other sources when something goes wrong.
  4. You probably also want to patch in #3857 to avoid random deadlocks when using pulseaudio monitoring.
  5. The reason why monitoring works and stream output does not is that monitoring takes a completely separate codepath that ignores timestamps, and relies on pulseaudio to mix all streams together, so OBS just feeds audio from point A to point B and if timestamps are off then it doesn't matter, and if the OBS mixing code goes bonkers it doesn't matter, as long as sources are actively feeding audio output of some sort pulseaudio will take it and play it.
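
The difference between the two code paths in point 5 can be illustrated with a toy model. This is a deliberately simplified sketch, not OBS's mixing code; the function names and the buffer dictionaries are hypothetical.

```python
def mix_by_timestamp(buffers, window_start_ns, window_end_ns):
    """Timestamp-based path: only buffers whose timestamp falls inside
    the current mix window are used; everything else is left out."""
    return [
        b["samples"]
        for b in buffers
        if window_start_ns <= b["timestamp_ns"] < window_end_ns
    ]


def monitor_passthrough(buffers):
    """Monitoring path: forward everything in arrival order and ignore
    timestamps entirely."""
    return [b["samples"] for b in buffers]


good = {"timestamp_ns": 1_000, "samples": "A"}
bad = {"timestamp_ns": 9_000_000_000, "samples": "B"}  # timestamp far off

print(mix_by_timestamp([good, bad], 0, 2_000))  # ['A']
print(monitor_passthrough([good, bad]))         # ['A', 'B']
```

In this model a source with wrong timestamps is silent in the mixed output yet perfectly audible on the monitor path, which is the symptom reported at the top of this issue.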

I am, quite honestly, disappointed that the OBS devs have chosen to close or not take seriously the (multiple) reports over the years of problems along the lines of "max audio buffering" -> audio completely breaks, asking for replication steps and ignoring anything without them, instead of working with users to root cause and fix what has for a long time been a massive pain point and reliability issue (without obvious repro steps unless you try really hard - took me a while) affecting many, many users over several years, which turned out to largely be attributed to bugs and poor audio source sync handling in the core OBS code.

But hey, it's fixed now, so apply my patches and enjoy while the OBS devs get around to reviewing them :-)

(disclaimer: I won't make the claim that audio in OBS is perfect now... the code is, honestly, quite complicated and confusing, I just went as far as root causing the issue that I mentioned and fixed it in the way that made sense to me, but I wouldn't be surprised if more weird bugs lurk behind the surface)

kkartaltepe commented 3 years ago

Thanks, I appreciate the patches. I'm sorry you or anyone else feels that trying to understand their symptoms and setups and find a reproduction step on our own machines is not taking a problem seriously. Though I have trouble finding what more you could want, perhaps you would only be satisfied with a solution.

As I understand it, the only reason you were able to write these patches is that you could reliably reproduce the issue on your end via custom scripts and a very delicate test environment, which depends on when JACK and your system are started. I'm sure you wouldn't wish trying to debug that upon someone who couldn't even reproduce the issue in the first place.

kkartaltepe commented 3 years ago

We are of course always looking for contributors to maintain the Jack plugin, given it was mostly written 5 years ago as someone's only contribution.

VennStone commented 3 years ago

@marcan Thank you for tackling this long standing bug. I'll be testing it next week in the studio.

marcan commented 3 years ago

@kkartaltepe FWIW, sorry not sorry is not a helpful response.

The only thing needed to reproduce the JACK issue is to try using JACK at all within 30 minutes of booting the computer, or to have a JACK hiccup (which can be easily simulated by killall -STOP jackd; sleep 1; killall -CONT jackd). No custom scripts required, no delicate test environment. But this isn't just about the JACK issue, it's about other problems with OBS too.

The reason why I was able to fix the audio death bug is because I am an experienced developer and I am particularly good at jumping into foreign codebases and sending drive-by PRs fixing random bugs that have eluded the developers. This isn't my first rodeo. This one has made the rounds on the usual tech websites a few times, but there are others. It's a thing I do. But the difference between OBS and every other project I've contributed in this way to, is that other projects always had open bugs where the developers were working with users to try to reproduce or isolate the problem in order to fix it. They just hadn't quite found the root cause yet, or the right solution, and that is where I could help out and contribute.

Instead, in OBS we get this:

Your audio chain is out of sync so OBS will drop the desynced audio once it reaches its max buffering. Its recommended you figure out why your audio chain is out of sync.

"Our software is broken (#3879), and that causes our software to break even more (#3863). It is recommended you figure out why our software is broken. \<close bug>" is a completely unhelpful way of handling bugs.

And yet prior (#3021):

Tested and working fine on my end, I will reply to the forum thread. Support requests should not be submitted to Github,

If things work fine on your end, I would recommend offering to ship your computer and entire setup to your users, since that will guarantee they won't run into any problems you yourself don't run into. Otherwise, perhaps consider that just because you can't reproduce a problem doesn't mean it's not a problem.

To add insult to injury, this was originally reported on mantis two years ago, and the reporter had identified the buggy line of code from the get-go, with a detailed explanation of how it is buggy. The OBS developers completely failed to identify the problem (even though the submitter had explained the issue in detail) and offer the correct fix. Any developer, regardless of skill level, should be able to look at that line of code and the provided explanation, and unambiguously conclude that indeed, yes, that code is absolutely broken, no question - that this didn't happen means the developer who replied failed at their job. Then, a year later, the reporter contributed the correct fix in a comment, and there was no response from OBS. This is a complete and utter failure of the bug handling process for a project. There was absolutely no reason why I, nor the people on this thread, nor the people on #3021, should have ever had to run into this bug. It should've been fixed two years ago when it was first identified, reported, and root caused.

Moving on to the more general audio death issue (that this bug triggers, but is a deeper issue), we've seen this happen many times. There is a forum thread full of people running into it. When this many people run into a problem with your software, it is only natural (and polite) to accept that a problem exists, and begin tracking it, even when you do not have a reproducer for it. This is so prevalent that it has crashed many DJ streams, and become infamous within the restreaming DJ community. People have developed various workarounds for it, from externalizing all audio processing by using monitoring feeds into an external mixing app and back into a single OBS source, to just not using OBS. I know at least two developers who are working on writing an OBS replacement of some sort, just because OBS is so unstable that it is not reliable enough to use for serious long-running streamed productions.

And so I tried filing #3069, to begin tracking this seriously - and yet again it was instantaneously closed, with no acknowledgement from the developers that this was obviously a real problem given the forum thread. Then after I commented again, you pointed me at Discord. Yet Discord is a terrible medium for tracking bugs like this. It's fine for discussion, but completely useless for tracking existing issues. That is what bug trackers are for.

Then when I went on Discord, the first thing that happened is you asked for a log, then someone saw the "max buffering" line, and again I got the same "that is your problem right there" non-answer. Again, "our software logged an error and this means it's not our fault" makes zero sense whatsoever. I pointed out how the timing of that line didn't line up with when the actual stream issue started anyway, and then there was some discussion, but again no serious engagement from the devs. No tracking the bug as a real bug. Nobody volunteered to help root cause it. I was on my own and there was still no bug open acknowledging this as a real issue.

When audio sources have different clocks or go out of sync, there are four things that audio mixing software can do:

  1. Use a DLL to dynamically track clock skew, and resample sources to align them in time with the minimum amount of delay. This is what any high-quality mixing software does.
  2. Drop some audio when things go too far out of sync, to put them back in sync. This is ugly, but the bare minimum any software should do.
  3. Cease processing audio for the lagging source altogether. This is unacceptable, but it's better than
  4. Break completely, killing audio output globally.

OBS has spent years with its developers knowing it is doing (3), accepting it as normal, rejecting any suggestion that that behavior is suboptimal or should be improved and pretending it is largely users' or their hardware's fault, while actually being at (4) (inconsistently), and ignoring all user reports that suggested that fact, because they didn't come with perfect repro steps. On top of that, OBS had multiple bugs that are actually triggering the sync issue in the first place, unrelated to users' hardware or software setup.
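
Option (2) in the list above can be sketched in a few lines. This is a minimal illustration only, not how OBS or any patch in this thread implements it: `SourceBuffer`, `max_lag_ms`, and the chunk layout are hypothetical names chosen for the example. The idea is that a source that has fallen too far behind loses its stale audio and then lines up with the mixer again, instead of being silenced for good.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SourceBuffer:
    """Hypothetical per-source queue of (timestamp_ms, samples) chunks."""
    max_lag_ms: int
    chunks: deque = field(default_factory=deque)
    dropped_chunks: int = 0

    def push(self, timestamp_ms: int, samples: list) -> None:
        """Queue one chunk of audio stamped with its capture time."""
        self.chunks.append((timestamp_ms, samples))

    def pop_for_mix(self, mix_time_ms: int):
        """Return the next chunk usable at mix_time_ms, or None if there is none.

        Chunks older than mix_time_ms - max_lag_ms are discarded, so a source
        that fell behind loses some audio but catches up afterwards.
        """
        cutoff = mix_time_ms - self.max_lag_ms
        while self.chunks and self.chunks[0][0] < cutoff:
            self.chunks.popleft()
            self.dropped_chunks += 1
        if self.chunks and self.chunks[0][0] <= mix_time_ms:
            return self.chunks.popleft()
        return None


if __name__ == "__main__":
    buf = SourceBuffer(max_lag_ms=100)
    for ts in (0, 20, 40, 500):
        buf.push(ts, [0.0])
    print(buf.pop_for_mix(mix_time_ms=500), buf.dropped_chunks)
```

The lagging source glitches once (the dropped chunks) and keeps playing, which is the "ugly, but bare minimum" behaviour described in (2); option (1) would resample instead of dropping.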

The bug fixing process starts with root causing and reproduction. You don't get to plug your ears and pretend bugs don't exist until someone serves you a reproducer on a silver platter. Sorry, that's just not how it works, not on any other project. If nobody can reproduce it (no other users), if it only ever happened once, if there's really no way to tickle it again to figure out what went wrong, then you can close the bug with a "sorry, we really have no clue what happened and if it never happens again there is nothing for us to work off of" style comment after some time passes. But closing bugs immediately because they don't come with perfect repro steps that work for you, without even beginning to engage with the user (what if the user submitted repro steps but missed a crucial detail? How are they supposed to know what is different between your environment and theirs so they know what details to include?) is just extremely user-hostile, and makes your project look like it doesn't care about its users in the slightest.

You can do better.

alinsavix commented 3 years ago

I went on Discord, the first thing that happened is you asked for a log, then someone saw the "max buffering" line, and again I got the same "that is your problem right there" non-answer. No tracking the bug as a real bug. Nobody volunteered to help root cause it. I was on my own and there was still no bug open acknowledging this as a real issue.

I mostly agree with your commentary (and indeed there's several "enough users have run across this that it's obvious we have an actual bug" issues that I never could get traction on), but one thing to note: The number of users that come through the discord going "there's a bug, I'm doing everything right, my system is super-powerful and perfect, my config is perfect, this has to be your bug because there's no way its my bug", when that's not at all the case, is pretty staggering. 99% of the people who come through the discord claiming there's an OBS bug... are wrong. (and along the same lines, OBS is complex enough that frequently "I fixed this bug" contributions from devs unfamiliar with the codebase would end up breaking things in some other way, if accepted)

It's not that the OBS team doesn't care, it's that if the (tiny number of) devs did a deep dive into every "bug" that someone was claiming in the discord, they wouldn't have time to actually make changes to OBS. (Doubly so when you consider how few people actually have a complete understanding of the innards of OBS). This is a pretty common situation for any successful open source project (probably any successful commercial one, too). Do things fall through the cracks? Absolutely. Is that a sign of malice or incompetence? No, not really. Could OBS use more than a single full-time developer, so that more things could be more thoroughly investigated? Absolutely. (Feel free to volunteer)

One thing I can tell you, though: In almost 30 years of software development, I've yet to meet a single developer that, after being essentially berated for not managing to find additional hours in days that are of finite length, has been motivated to "do better". If you want to make a case that things have been overlooked or that things need more attention, great, do that. Berating developers that you don't know and haven't worked closely with, whose task list you know nothing of... yeah, don't. I understand the frustration, but... yeah, don't. If you've been in software development as long as you imply, you probably know how demoralizing it is to work your ass off on something, and still have people showing up bitching that you're not doing enough.

I'm no longer involved with the OBS project, and there's certainly no love lost between me and [some of] them, but I can tell you that the people involved -- from frontline support volunteers, all the way up to the single full-time developer -- work their asses off for very little recognition, and very little personal benefit beyond the feel-goods one gets from helping the community. Both the software and the people have their flaws, but 'not caring' or 'not working hard enough' are not in that list.

odinho commented 3 years ago

In almost 30 years of software development, I've yet to meet a single developer that, after being essentially berated for not managing to find additional hours in days that are of finite length, has been motivated to "do better".

This! Even though I am very happy you've fixed this @marcan (actually amaze! :heart: :rocket: :tada:), what @alinsavix is saying above needed to be said. These things having fallen through the cracks is super unfortunate, I've actually had one particularly important stream where the audio dropped out. But I still :heart: all the work that's put in to OBS.

Thank you for saying what I was thinking reading this but not able to say @alinsavix. And thank you sooo much for finally fixing this @marcan.

I also think your point has come across well. Without knowing the devs or volunteers, I've been in a similar-ish situation, and you coming in helping fix this is likely to inspire just by its action. Hopefully they'll read your posts in the best light.

kkartaltepe commented 3 years ago

The only thing needed to reproduce the JACK issue is to try using JACK at all within 30 minutes of booting the computer, or to have a JACK hiccup (which can be easily simulated by killall -STOP jackd; sleep 1; killall -CONT jackd)

And what user has confirmed that your fix actually fixes their issue? It is an assumption that your fix actually fixes their issue because, like you said, your replication is completely different from their replication steps. Solving similar symptoms doesn't mean solving the same problem. Though I do hope given the nature of the bug that your fix will also fix these issues.

I do appreciate that users may feel that their issues are not being taken seriously, that we don't acknowledge issues, that we don't track issues, and we accept there are bugs in OBS, that we don't volunteer our time to root cause issues, that we reject any possibility our software can be improved, pretend our users are at fault, ignore user reports, are just extremely user-hostile, and look like we don't care about our users in the slightest. Obviously the people in this thread agree with you, as you said I'm sure we can do better. Though I wish you would consider things from the other perspective.

It's important to consider who these mysterious "OBS Developers" you desire so much of are... As mentioned, the JACK plugin is basically one commit from 2015; you can complain to that developer for their lack of maintenance and response to bug reports. Or feel free to ask us to remove JACK support due to lack of stewardship, or as your acquaintances have done you can always write something yourself. Though personally I'm glad you decided to contribute back instead.

If things work fine on your end, I would recommend offering to ship your computer and entire setup to your users, since that will guarantee they won't run into any problems you yourself don't run into.

I'm personally happy to take shipment of a replica setup if I am unable to reproduce issues with my hardware and you are unable to debug your own hardware. Though it seems no one was offering.

marcan commented 3 years ago

@alinsavix

The number of users that come through the discord going "there's a bug, I'm doing everything right, my system is super-powerful and perfect, my config is perfect, this has to be your bug because there's no way its my bug", when that's not at all the case, is pretty staggering. 99% of the people who come through the discord claiming there's an OBS bug... are wrong.

I've been the public face for free-as-in-beer hobby software with ~10 million users (as well as a big part of associated open source libraries), so I'm well aware of this situation. The thing is, if you let this get to you and support for users who do know what they're doing suffers, you've lost. The failure here was that my ticket was closed when I linked a forum thread with a bunch of people experiencing the same issue. That's the point where it should be fully evident that this isn't a single random, but most likely a bug, and deserves more than an insta-close.

OBS has this rule that github bugs must have repro steps, and I understand the motivation for this (to be able to get those users off your chest easily), but bug trackers aren't olympic competitions. Rules are meant to be broken. Nobody expects all bugs to be treated identically (and if someone actually comes along saying "boohoo why did my bug get closed and not that other one", and they insist after a one-line explanation, just block them - you don't need to waste your time with people like that). Thus, there is no reason to apply the close-hammer to situations where it is clear there is a problem, and not just a half-baked non-actionable report. Regardless of whether we have a difference of opinion on how to deal with the "chaff" of user bug reports, it is this point where I think OBS unambiguously made the wrong call.

Ultimately, for large open source projects, the only way to avoid developer burnout is to have a few non-developers exclusively dedicate themselves to community management, collating common issues (not just from bugs, but also from forum threads and such), and filtering and organizing things for the developers to focus on. Obviously, as a volunteer-driven project, this isn't something you can just make happen, but it's something you can try to recruit people for.

Also, just as a statistical note, Linux users tend to on average know what they're doing better, and submit higher quality bug reports (this is the experience of most multiplatform projects as far as I'm aware). Some of that is clearly evident in this thread, where the affected users all had decent bug reports and comments (nobody noticed the "first 30 minutes from boot" part of the story, but that is hard to realize), and some clearly had better knowledge of specific audio concepts than the OBS developers themselves (e.g. @nettings on how realtime scheduling works, or myself on what a DLL is in this context). So that variable might be worth taking into account when deciding how to give bugs attention given limited developer resources.

(and along the same lines, OBS is complex enough that frequently "I fixed this bug" contributions from devs unfamiliar with the codebase would end up breaking things in some other way, if accepted)

Oh, absolutely. For the JACK and PA bugs the fixes are obvious enough, but for the audio death bug, I'm confident I found the root cause of the bug, and I'm fairly confident my approach to fixing it is the right one (one way or another intermediate mixing must be insulated from lagged source timestamp inputs or else you're asking for trouble), but I certainly can't say the precise implementation in my patch is bug-free, especially since I'm not familiar with the codebase. All I can say is it makes sense to me, and fixed it for me, and worked stably during a 5+ hour stream. Other people still need to review and test it.

That said, and this is personal opinion: I think the current audio pipeline is overcomplicated for what it does, hard to follow, and rather bug-prone with evidence of technical debt and piled on fixes and hacks. I'm not saying this needs to happen, but if I were tasked with improving it, I'd probably do a near full rewrite of all the timestamp/buffering/mixing/etc code (and along the way introduce concepts like clocksources that users can control and select, as this is required for frame- and sample- perfection in serious mixing settings). This is a decently large challenge, but not a herculean effort.

Could OBS use more than a single full-time developer, so that more things could be more thoroughly investigated? Absolutely. (Feel free to volunteer)

I already have a full-time open source project lined up for next year; but if, say, such a serious audio revamp would be appreciated, that would be good to know, in case I find myself in a position to dedicate that amount of time to it at some point.

That said, given the amount of money being made from streams these days... you all need to move your donations more. OBS is way more critical than the $2600 that Jim is getting on Patreon.

If you've been in software development as long as you imply, you probably know how demoralizing it is to work your ass off on something, and still have people showing up bitching that you're not doing enough.

Look, I get what you're saying, but... I think I have a point here that something went wrong. There are always users bitching about nonsense, and I myself have been burned out from communities by that. But I also have never had a situation where my code had long-standing issues that I failed to acknowledge and which were causing significant user pain. In fact, back when I was managing the aforementioned 10M+ user community, part of my ethos was having extreme care for specific classes of problems (in that case it was about potentially bricking user hardware, but I would put OBS crashing streams on a similar level, since it can have a huge audience and significant impact for streamers). And as a result, I can say that I have never had a single report of that happening in the wild that could be traced to a bug in my code. I had one close call, and my extremely defensive coding around it saved the day. It is something that I am proud of to this day.

@kkartaltepe

And what user has confirmed that your fix actually fixes their issue? It is an assumption that your fix actually fixes their issue because, like you said, your replication is completely different from their replication steps. Solving similar symptoms doesn't mean solving the same problem. Though I do hope given the nature of the bug that your fix will also fix these issues.

I'm not sure what you're trying to accomplish with this comment. Given that the code is obviously broken, and that that breakage is fully consistent with all of the reported situations and explains them perfectly, and that my fix is most definitely at least sane in comparison, and that another user had already contributed a near-identical fix a year ago that fixed his problem, I would be surprised if it didn't fix the issue. Sure, it might not on some weird off-chance, and we won't know for sure until people test it, but making this kind of comment just makes it sound like you're trying to grasp at straws to devalue my contribution. Occam's razor says I fixed the bug, no need to throw defensive what-ifs at me. If I didn't then maybe we can find some other bug and fix it too.

It's important to consider who these mysterious "OBS Developers" you desire so much of are... As mentioned, the JACK plugin is basically one commit from 2015; you can complain to that developer for their lack of maintenance and response to bug reports.

The thing is, if something is unmaintained... the right thing to do is leave the bug reports open until someone looks at them. Tag them "unmaintained" or something. Whatever you need to make sure they don't clog up your stats. But don't close them, because that says there is no bug. And if you at least leave them open, you can dupe-close any repeats (like this one would've been) and thus collect statistics on how many users are hitting the same bug.

Now, as I said, there's also the case that Ochi submitted a fix for this a year ago, and it wasn't merged... so even though the plug-in was unmaintained, a user tried to maintain it for you, and you failed to accept the contribution. That's a problem. If nobody is around to review the patch... and the code is unmaintained... perhaps you should just accept it blindly, as one user testing a fix is better than none. And then if it accidentally breaks things for someone else, you can find out later and revert it - at least you're doing your best to move forward in that case, and I would find that situation much more understandable.

The JACK thing may have been unmaintained code, but the general audio problem wasn't. And really, to put things into perspective on that one: even if the "global audio death" bug weren't there, and we were at my (3) in the list above, the entire idea that desynced audio sources are a user problem is just all kinds of wrong. That's not broken hardware or glitches, that's physics. Commodity quartz crystal oscillators have about a 50ppm accuracy. That means that for any two audio sources with different clocks, which are "perfectly" configured to the same sample rate, you can expect OBS's sync buffer to max out after 5.5 hours on average. OBS developers have been effectively saying "our software is not designed to operate continuously for more than 5.5 hours in a standard environment". And since AIUI OBS's clock is system time, nothing to do with sample clocks, I may be wrong here, but I think this would hit people with a single audio source with sample-linked timestamps too.
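
The 5.5 hour figure can be reproduced with a rough back-of-the-envelope calculation. The sketch below assumes a maximum sync buffer of about one second and treats 50 ppm as the relative rate difference between the two clocks; neither the buffer size nor that interpretation is stated in the thread, so both are labeled as assumptions in the code.

```python
# ASSUMPTION: the sync buffer tops out at roughly one second. The thread does
# not give the exact limit, so this value is illustrative only.
MAX_SYNC_BUFFER_SECONDS = 1.0

# ASSUMPTION: 50 ppm is taken as the relative rate difference between the two
# clocks (two parts at +/-50 ppm each could differ by more or less than this).
RELATIVE_DRIFT = 50e-6


def hours_until_buffer_full(buffer_seconds: float, relative_drift: float) -> float:
    """Hours for two free-running clocks to drift apart by buffer_seconds."""
    if relative_drift <= 0:
        raise ValueError("relative_drift must be positive")
    seconds = buffer_seconds / relative_drift
    return seconds / 3600.0


if __name__ == "__main__":
    hours = hours_until_buffer_full(MAX_SYNC_BUFFER_SECONDS, RELATIVE_DRIFT)
    print(f"{hours:.2f} hours")
```

Under these assumptions the drift reaches one second after 20,000 seconds, a little over five and a half hours, which is in line with the number quoted above. A smaller buffer or a larger clock offset shortens that time proportionally.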

Fenrirthviti commented 3 years ago

Since I'm the one who made the call to close the original issue here, let me provide a bit more context on some of this.

The failure here was that my ticket was closed when I linked a forum thread with a bunch of people experiencing the same issue. That's the point where it should be fully evident that this isn't a single random, but most likely a bug, and deserves more than an insta-close.

There's an assumption that we dismissed the user reports and haven't looked into this in the past, which is not true. We have. We just missed the bug. We made a mistake. We're all human here. Pretty much every time we had gone down this path with a user, it turned out to be an issue with their system and not a bug in OBS. Could the bug in OBS have triggered these conditions more often? Given what we know now, probably. Don't make assumptions that we're dismissing bugs and that we haven't looked into these issues when you aren't involved in our support and triage process. Yes, I will admit that after seeing the same thing reported a hundred times, and after 99.9 times out of 100 the issue being caused by a faulty device or configuration on the user's system, I started to dismiss reports like this because it had already been (we thought) investigated to death and determined to not be an issue on our end. I mean, christ, in the example thread you linked, half those users solved their issue by fixing their own configuration mistakes. Show a little understanding here.

OBS has this rule that github bugs must have repro steps, and I understand the motivation for this (to be able to get those users off your chest easily), but bug trackers aren't olympic competitions. Rules are meant to be broken. Nobody expects all bugs to be treated identically (and if someone actually comes along saying "boohoo why did my bug get closed and not that other one", and they insist after a one-line explanation, just block them - you don't need to waste your time with people like that). Thus, there is no reason to apply the close-hammer to situations where it is clear there is a problem, and not just a half-baked non-actionable report. Regardless of whether we have a difference of opinion on how to deal with the "chaff" of user bug reports, it is this point where I think OBS unambiguously made the wrong call.

Again, this is a matter of perspective. Hindsight is 20/20 and it's clear to you that there is a bug. It's clear to us that this has been investigated to death and we didn't find anything, and this is another system configuration issue. We missed it. It happens. So no, we didn't "unambiguously" make the wrong call here. Only from your perspective.

Ultimately, for large open source projects, the only way to avoid developer burnout is to have a few non-developers exclusively dedicate themselves to community management, collating common issues (not just from bugs, but also from forum threads and such), and filtering and organizing things for the developers to focus on. Obviously, as a volunteer-driven project, this isn't something you can just make happen, but it's something you can try to recruit people for.

Yes, that is one of my primary responsibilities, as well as a handful of others. We get hundreds, if not thousands, of "bug reports" every day that we have to sift through, and when we're dealing with that much volume, an issue we have already determined to not be something on our end is going to be dismissed in favor of something more pressing. Again, we're human here, and we made a mistake.

Also, just as a statistical note, Linux users tend to on average know what they're doing better, and submit higher quality bug reports (this is the experience of most multiplatform projects as far as I'm aware). Some of that is clearly evident in this thread, where the affected users all had decent bug reports and comments (nobody noticed the "first 30 minutes from boot" part of the story, but that is hard to realize), and some clearly had better knowledge of specific audio concepts than the OBS developers themselves (e.g. @nettings on how realtime scheduling works, or myself on what a DLL is in this context). So that variable might be worth taking into account when deciding how to give bugs attention given limited developer resources.

From your perspective. Not in my experience. :)

Look dude, nobody is arguing that we didn't make a mistake here, but I'm really struggling to understand what you want out of us at this point. We've admitted we missed it, we're going to review your code, and we're probably going to merge it because it fixes a problem. What exactly are you trying to accomplish by continuing to berate us for making a mistake here, on a situation you clearly did not have full insight into?

All in all here, we thank you for your time and efforts spent on it, as it likely wouldn't have been fixed otherwise. That's both the beauty and tragedy of open source, eh?

marcan commented 3 years ago

Pretty much every time we had gone down this path with a user, it turned out to be an issue with their system and not a bug in OBS.

No, pretty much every time you have gone down this path with a user, it turned out the issue could be worked around by improving their configuration.

This is where you've been making a mistake, the whole time, and how this has gone undiscovered for three years. You assumed that users being able to fix the problem themselves means there is no deeper issue in OBS. That is a fallacy.

Could the bug in OBS have triggered these conditions more often? Given what we know now, probably.

That's the entire nature of the bug. The bug isn't that OBS randomly decides to stop processing audio. The bug is that if anything else goes wrong, then OBS falls over. It's a robustness problem.

Yes, I will admit that after seeing the same thing reported a hundred times, and after 99.9 times out of 100 the issue being caused by a faulty device or configuration on the user's system, I started to dismiss reports like this because it had already been (we thought) investigated to death and determined to not be an issue on our end.

But it hadn't been, had it? You just thought it had, because you'd jumped to the wrong conclusions after seeing this a million times, and never stopped to wonder whether the idea of source issues having such catastrophic effects made sense or not, and whether that could be indicative of a deeper-seated issue :-)

I mean, christ, in the example thread you linked, half those users solved their issue by fixing their own configuration mistakes.

You mean half those users solved their issue by accidentally stumbling onto a workaround ;-)

Everyone has been doing that. Flying blind, because nobody knew what the real root cause was. But I've seen many theories floating around the DJ restreaming crowd too. That if you get all your sources on the same sample rate it's fine. Whether toggling source visibility fixes it or not. Etc etc. Users will flail around until the problem goes away. Sometimes they'll get lucky and it will for real. Sometimes it won't, but they'll think it did because they just got lucky the next time, since it's a fairly random bug.

I just re-read the forum thread, and not a single one of those replies sounds to me like they fixed an actual configuration issue. Every single one of them, in my head, I can explain as them changing their setup from one that tickles the OBS bug, to one that does not. I cannot conclude a single one of them did anything wrong. All of those stories sound like exactly what I'd expect, knowing what I know from the bug. What you are seeing is not a bunch of users fixing their broken configuration, it's a bunch of users flailing around trying to accidentally discover a workaround for the OBS bug, some succeeding, some not.

Please stop calling them broken configurations. Those are your poor users, who did nothing wrong, working around a software bug that is in no way, shape, or form their fault.

The problem here was falling into the trap of assuming that, just because the user could make changes that make the problem not happen, there is no problem. I can make this problem never happen too - all I have to do is keep my streams to <5 or so hours, so natural drift isn't a problem, and make sure all my sources are squeaky clean and never have the slightest dropout and I don't toggle monitoring too many times. Now, of course, that isn't actually practical... but hey, it'll fix the problem!

You saw the max buffering message. You saw audio falling over when that triggers. You saw that this could be avoided if sources just... never lagged. And so, you concluded, this is a problem with sources - so any time this happened with a user, it was either a user config error or, at worst, a source bug.

This is not good engineering. Engineering is about robustness. Well engineered systems are not those that work when things go right, they're those that keep working when things go wrong.

So first I found a repro - toggling monitoring many times. Yay. Okay. What is this triggering? Increasing buffering. Why? Well, that was a fun debugging stream, where I found that silly return statement typo. Fine. I found a bug. Great.

But I didn't stop there. After that stream, I kept looking. Sure, that bug caused timestamps to slip and buffering to increase. But why does the buffer hitting max cause all audio to fall over? That doesn't make any sense. I knew you guys accepted it as a fact of life. But that just seemed fundamentally wrong to me. And so eventually, after adding a lot of logging, I figured out that the top-level mix function was dropping all audio. Because the timestamps were bad. Which were coming from a top-level transition source. Where they'd trickled down from a scene source. Because it turned out a single bad timestamp left over in a source poisons the whole tree. So let's fix that.
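
As a toy illustration of that failure mode (this is not the actual OBS mixing code; `toy_source`, `min_ts_naive`, `min_ts_robust` and `mix_usable` are made-up names, and the "valid window" is an assumed stand-in for whatever range the mixer can still buffer), here is how one stale timestamp can propagate up through a scene and a transition and make the top-level mix discard everything, and how ignoring out-of-window timestamps avoids that:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified audio node; NOT the real OBS source struct. */
struct toy_source {
    uint64_t ts;                  /* timestamp of pending audio, 0 = none */
    struct toy_source **children; /* scene/transition children, or NULL */
    size_t num_children;
};

/* Naive propagation: the smallest non-zero timestamp in the subtree wins,
 * so one stale leaf decides the timestamp of every parent above it. */
static uint64_t min_ts_naive(const struct toy_source *s)
{
    uint64_t min = s->ts;
    for (size_t i = 0; i < s->num_children; i++) {
        uint64_t child = min_ts_naive(s->children[i]);
        if (child && (!min || child < min))
            min = child;
    }
    return min;
}

/* Robust propagation (assumed fix for this sketch): timestamps outside
 * the window [lo, hi] are treated as "no audio" instead of being passed
 * up the tree. */
static uint64_t min_ts_robust(const struct toy_source *s, uint64_t lo,
                              uint64_t hi)
{
    uint64_t min = (s->ts >= lo && s->ts <= hi) ? s->ts : 0;
    for (size_t i = 0; i < s->num_children; i++) {
        uint64_t child = min_ts_robust(s->children[i], lo, hi);
        if (child && (!min || child < min))
            min = child;
    }
    return min;
}

/* The top-level mix can only use audio whose timestamp is in the window. */
static bool mix_usable(uint64_t ts, uint64_t lo, uint64_t hi)
{
    return ts >= lo && ts <= hi;
}
```

With a transition containing a scene that has one healthy source (timestamp 1000) and one stale source (timestamp 5), and a window of 900 to 1100, the naive version reports 5 for the whole tree and the mix is unusable, while the robust version reports 1000 and only the stale source goes silent.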

And now we're getting somewhere. Now we're fixing the real root causes. Now we're making OBS a better application :-)

But I didn't stop there. Then there was the JACK bug. Okay, so I fixed that too. That's never going to happen again the same way after that patch. But still... with the old buggy code, things worked (if you started things >30m after boot), until a glitch meant they didn't. The bug meant that timestamps from the JACK source were offset, and that they went insane for a few seconds after a glitch. But then they returned to another (different offset) steady state. Surely we can work with that, right? I mean, it's not going to be pretty when the glitch happens... but why not try to recover? With my original version of the audio mixing fix, the JACK source would not poison the rest of the audio tree, but it still itself died an ugly death if the timestamps glitched.

So I reverted the JACK patch and set out to figure out how to make it recover, even with that insane timestamp code. And I figured out I could reset the timestamp offset compensation machinery when I hit the buffer-full status in the patch. And now, even if you ignore all my other PRs and just apply the broken audio one, and try to use JACK, it works. And if JACK glitches, you get a few seconds of insanity and crackling and audio cutting in and out while the old timestamp calculation code goes completely haywire as the JACK DLL overshoots and oscillates... and then it slowly settles back. And it fixes itself. And a few seconds later audio is back to normal. Maybe with some delay or something, the timestamps are still insane... but it works.
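
A minimal sketch of the recovery idea, under loud assumptions: `toy_timing`, `toy_map_ts` and `toy_on_buffer_full` are hypothetical names, and this is not the code from the PR. The source keeps an offset that maps its own timestamps onto the mixer's clock; when the mixer reports that buffering has hit its limit, the offset is thrown away and re-learned from the next packet, so a source whose timestamps jumped to a new steady state gets re-aligned instead of staying dead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-source timing state; not OBS's real fields. */
struct toy_timing {
    bool timing_set; /* has an offset been learned yet? */
    uint64_t adjust; /* mixer_time - source_time, modulo 2^64 */
};

/* Map a source timestamp onto the mixer clock, learning the offset from
 * the first packet seen after a reset. Unsigned wraparound makes this
 * work for negative offsets too. */
static uint64_t toy_map_ts(struct toy_timing *t, uint64_t src_ts,
                           uint64_t mixer_now)
{
    if (!t->timing_set) {
        t->adjust = mixer_now - src_ts;
        t->timing_set = true;
    }
    return src_ts + t->adjust;
}

/* Called when the mixer hits its maximum buffering for this source:
 * forget the old offset so the next packet re-aligns the source. */
static void toy_on_buffer_full(struct toy_timing *t)
{
    t->timing_set = false;
}
```

For example, if the source's clock jumps from around 100 to around 5000 while the mixer is at about 1020, the mapped timestamp lands far in the future (5900) until `toy_on_buffer_full()` is called; the next packet is then mapped back onto the current mixer time.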

And now, if any other source does something insane like that with timestamps, but recovers, there's a much better chance it'll keep working. Regardless of whether it's an OBS bug in the source, or a bug in timestamps coming from the source itself (like remote RTMP stuff).

That's reliability. And that's what I strive for in software. Not coincidentally, that was also my job at a certain company a few years ago. Not just fixing problems, but making sure they never happen again, even under adverse circumstances.

I hope that this whole story might change your perspective on how to approach bugs, not just from the user's perspective, but in how to root-cause them and make the whole app more robust, instead of fixing only the proximate cause. If you can take this lesson to heart, then I'm sure you will be able to make OBS a much more reliable application in the future :-)

dodgepong commented 3 years ago

I know emotions are a bit high, but I appreciate the work you've done to identify and debug the issue @marcan, and I'm hopeful that the fixes will be put to good use. I appreciate the feedback on how we can do better as a project to respond to people's bug reports. As has been stated already, a big reason this issue has been around for as long as it has is a combination of lack of manpower and lack of expertise in this specific area, so when someone with expertise comes along, it's appreciated when they are able to lend a hand and point out things that we haven't had the ability or time to see.

given the amount of money being made from streams these days... you all need to move your donations more

If you have advice on how to better raise funds to reliably pay people to work on the project, I'm all ears.

marcan commented 3 years ago

@dodgepong Thank you for your feedback :)

It might be worth pointing out that, since #3879 is not merged yet (or rather was merged and reverted), this bug is still applicable... so reopening it might be in order, until that is merged.

And it would probably be helpful if the reporters on this bug can test that PR (and possibly my other open ones) and confirm that it fixes the problem for them too.

If you have advice on how to better raise funds to reliably pay people to work on the project, I'm all ears.

Well... I may have just been thrust into this, er, industry, so I'll let you know if I learn something interesting :-)