Closed dapetcu21 closed 4 years ago
I tried running the leak checker in Instruments, and it didn't catch anything in this area. It's possible #2264 might help magically, but I doubt it. If this only started with Catalina, it's possible Apple introduced a regression.
We'll go to Metal one day, and this won't be a problem at that point.
Can someone test this again on 25.0.7 and validate if the bug is still present? I understand it's a bit of an involved test, but from @jpark37's comments it might not be something we can fix.
I'm running a stream with looped footage right now on 25.0.7. It's been 2 days and it didn't crash. I'll see how long this will last.
For information: I'm a developer of MadMapper, and we've also been doing OpenGL on macOS for years. We also use Qt for UI and OpenGL contexts (Qt 5.12.3 in current versions). Since February 2020 I have received 5 crash reports like yours, all of them on macOS 10.15.1 or later. Same thing: users who don't set an auto restart and let the software run for weeks. It's not a leak; the refcount of an object keeps being incremented until it reaches the maximum value (MAX_INT? or maybe it's 16 bits...). Since Qt doesn't use ARC (automatic reference counting), it might, for instance, call "retain" each time we make a context current but never call "release". I've been through the qcocoa files, though, and couldn't find anything obvious. And why would that only happen from macOS 10.15 on? Were they just accepting "unbalanced retains" before and then became more strict?
I added this check in qcocoaglcontext.mm and it breaks after a very short time:
if (m_context.view) {Q_ASSERT(CFGetRetainCount(m_context.view) < 1000);}
The problem also happens with Qt 5.12.8. I'll check whether it persists in 5.14 or 5.15; if it does, I'll have to find where the ref count is being increased. But in Instruments / Allocations, if I look at the NSView objects, I see that "Ref Count" is 1. I'm not used to the Allocations tool, so maybe I'm not looking in the right place...
OBS 25 is built with 5.14.
I fixed the issue in Qt: https://github.com/mattbeghin/qtbase/commit/6e95d37de88bbad406b40bfc476b2108a4601e2d Each call to makeCurrent on a QOpenGLContext that is not the current one for the calling thread increases by one the ref count of the NSView attached to the Cocoa context. I just replaced "m_context.view" with "[m_context view]". You can vote for the issue I posted on the Qt bug tracker: https://bugreports.qt.io/browse/QTBUG-84762
Could you please follow Qt's contribution process for submitting your patch to Qt? They should be able to help you with that if you have any questions.
Commenting here too, in case it might help others:
Most likely this is due to not having an auto-release pool in the thread that you're calling makeCurrent on. The main thread has a pool that is drained as part of each runloop. The retains done inside Qt and AppKit are often paired with autorelease calls, which will not result in releases unless you also drain the pool by letting it go out of scope. Qt can mitigate this somewhat by adding local auto-release pools inside Qt to catch any autoreleases down-stack, but there are likely to be corner cases we don't catch. See also: https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmAutoreleasePools.html#//apple_ref/doc/uid/20000047-1041876-CJBFEIEG
From what I can tell Qt isn't involved in the thread that crashed either:
Thread 3 Crashed:: libobs: graphics thread
0 com.apple.AppKit 0x00007fff3722d20a -[NSResponder _tryRetain] + 92
1 libobjc.A.dylib 0x00007fff6fa9fe1d objc_loadWeakRetained + 351
2 libobjc.A.dylib 0x00007fff6faa1adc objc_loadWeak + 15
3 com.apple.AppKit 0x00007fff36e9b494 -[NSOpenGLContext makeCurrentContext] + 271
4 libobs.0.dylib 0x0000000108384d20 gs_enter_context + 112
5 libobs.0.dylib 0x00000001083f6bcf obs_graphics_thread + 1855
6 libsystem_pthread.dylib 0x00007fff71014d36 _pthread_start + 125
7 libsystem_pthread.dylib 0x00007fff7101158f thread_start + 15
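To make the auto-release pool explanation above a bit more concrete, here is a toy model in plain C++. It is not AppKit, Qt, or OBS code: ToyView, ToyAutoreleasePool, and toyMakeCurrent are made-up names, and the assumption that each make-current call does exactly one retain paired with one autorelease is only for illustration. It just shows why a retain paired with an autorelease keeps the count growing on a thread whose pool is never drained, and why a pool scoped to each loop iteration keeps it flat.

```cpp
#include <vector>

// Toy stand-in for a reference-counted view (hypothetical, not NSView).
struct ToyView {
    long retainCount = 1;
    void retain() { ++retainCount; }
    void release() { --retainCount; }
};

// Toy stand-in for an autorelease pool: releases are deferred until the
// pool object is destroyed, i.e. until it goes out of scope.
class ToyAutoreleasePool {
public:
    void autorelease(ToyView* view) { pending_.push_back(view); }
    ~ToyAutoreleasePool() {
        for (ToyView* view : pending_) view->release();
    }
private:
    std::vector<ToyView*> pending_;
};

// Assumption for illustration: making a context current retains the view
// once and pairs that retain with an autorelease.
void toyMakeCurrent(ToyView& view, ToyAutoreleasePool& pool) {
    view.retain();
    pool.autorelease(&view);
}

// A long-running thread loop whose only pool is never drained while the
// loop is running: the count grows by one per iteration.
long runWithoutPerIterationPool(int iterations) {
    ToyView view;
    ToyAutoreleasePool threadPool;
    for (int i = 0; i < iterations; ++i) {
        toyMakeCurrent(view, threadPool);
    }
    return view.retainCount;  // read before threadPool is destroyed
}

// The same loop with a pool scoped to each iteration: every deferred
// release runs at the end of the iteration, so the count stays at 1.
long runWithPerIterationPool(int iterations) {
    ToyView view;
    for (int i = 0; i < iterations; ++i) {
        ToyAutoreleasePool framePool;
        toyMakeCurrent(view, framePool);
    }
    return view.retainCount;
}
```

In this model, runWithoutPerIterationPool(1000) returns 1001 and runWithPerIterationPool(1000) returns 1. In real Objective-C code the per-iteration pool would be an @autoreleasepool block around the body of the thread's loop.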
I threw up a quick prototype if anyone wants to try it. I'll test it myself when I get off work today.
I made a minimal app reproducing the issue and attached it to the Qt bug report: https://bugreports.qt.io/browse/QTBUG-84762 I can reproduce it with any Qt version. Each makeCurrent done on a context that is not current in the calling thread increases the ref count of the NSView by one. In my example, I just swap two contexts in the main thread. My patch in Qt solves that.
I ended up testing the autoreleasepool PR over lunch instead. Seems solid, but I'm not 100% sure it will fix the original issue.
The issue in Qt has been solved, but I don't think that's relevant for the crashes seen in this issue, and @jpark37's patch is likely needed.
Expected Behavior
I should be able to leave my computer stream indefinitely without a crash.
Current Behavior
After about 4-5 days of continuous streaming, it crashes in -[NSOpenGLContext makeCurrentContext] with the following message: "Refcount overflow in NSResponder or subclass. Too many unbalanced -retains!". Sounds like a missing release call somewhere.
Steps to Reproduce
Additional information
I'm running a Hackintosh with a Radeon RX 580. The computer is set to not go to sleep, but the displays are going to sleep.
Here's a crashlog: