livekit / client-sdk-swift

LiveKit Swift Client SDK. Easily build live audio or video experiences into your mobile app, game or website.
https://livekit.io
Apache License 2.0

Stack overflow crash caused by recursive deallocation in `livekit.multicast` thread #299

Closed maxbrunsfeld closed 1 month ago

maxbrunsfeld commented 6 months ago

Describe the bug

When leaving a Room with several participants, our application occasionally crashes. It looks like the process is being killed with SIGILL. Whenever this happens, the livekit.multicast thread has a very deep call stack that looks like this:

Thread 38::  Dispatch queue: livekit.multicast
0   CoreFoundation                         0x18a3d7d0c _CFRelease + 8
1   zed                                    0x102d07914 @objc InboundRtpStreamStatistics.__ivar_destroyer + 40
2   libobjc.A.dylib                        0x189e6b628 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
3   libobjc.A.dylib                        0x189e62e84 objc_destructInstance + 80
4   libobjc.A.dylib                        0x189e62e28 _objc_rootDealloc + 80
5   libobjc.A.dylib                        0x189e6b628 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
6   libobjc.A.dylib                        0x189e62e84 objc_destructInstance + 80
7   libobjc.A.dylib                        0x189e62e28 _objc_rootDealloc + 80
8   libobjc.A.dylib                        0x189e6b628 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
9   libobjc.A.dylib                        0x189e62e84 objc_destructInstance + 80
10  libobjc.A.dylib                        0x189e62e28 _objc_rootDealloc + 80

... 500 more stack frames, continuing this pattern...

505 libobjc.A.dylib                        0x189e62e28 _objc_rootDealloc + 80
506 libobjc.A.dylib                        0x189e6b628 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
507 libobjc.A.dylib                        0x189e62e84 objc_destructInstance + 80
508 libobjc.A.dylib                        0x189e62e28 _objc_rootDealloc + 80
509 libobjc.A.dylib                        0x189e6b628 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
510 libobjc.A.dylib                        0x189e62e84 objc_destructInstance + 80

Does anyone know if there is ever some linked-list-like structure that would be deallocated on the livekit.multicast thread, such that many deeply nested destructors would be called?
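To illustrate the kind of structure the question describes: in Swift, a class that holds a strong reference to another instance of itself is destroyed recursively by default, because releasing the head releases its successor from inside the head's destructor, and so on down the chain. The following is a minimal sketch of that shape and of an iterative `deinit` that avoids the nesting. `ChainNode` and `makeChain` are hypothetical names made up for illustration; they are not LiveKit SDK types, and nothing here is confirmed to be the cause of this crash.

```swift
// Hypothetical node type used only for illustration; not part of the LiveKit SDK.
final class ChainNode {
    let value: Int
    var next: ChainNode?

    init(value: Int, next: ChainNode? = nil) {
        self.value = value
        self.next = next
    }

    deinit {
        // Without this loop, destroying a node destroys `next` from inside
        // this destructor, which nests one set of stack frames per node.
        var cursor = next
        next = nil
        // Detach one node at a time so that each node is destroyed with an
        // empty `next` and therefore does not recurse. Stop as soon as a node
        // is shared with another owner (or the end of the chain is reached).
        while isKnownUniquelyReferenced(&cursor) {
            let tail = cursor?.next
            cursor?.next = nil
            cursor = tail
        }
    }
}

/// Builds a chain of `count` nodes and returns its head.
/// The head carries the highest value; the last node carries 0.
func makeChain(count: Int) -> ChainNode? {
    var head: ChainNode? = nil
    for value in 0..<count {
        head = ChainNode(value: value, next: head)
    }
    return head
}
```

With this `deinit`, releasing the head of a long chain unwinds in a loop on a single stack frame instead of one nested destructor call per node. The `isKnownUniquelyReferenced` check leaves any node that is still referenced elsewhere, and everything after it, intact.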

SDK Version

1.1.4

iOS/macOS Version

Various macOS versions. One example occurred on macOS 14.1.2. Another occurred on macOS 13.x.

Steps to Reproduce

Unfortunately, I don't know how to reliably reproduce this. It happens when leaving a Room. I think it happens more often when leaving a Room with several participants, so there are several audio tracks.

Expected behavior

No crash.

Screenshots

N/A

Logs

N/A

maxbrunsfeld commented 6 months ago

Note that the stack trace on the livekit.multicast thread is similar to the one in this other open issue: https://github.com/livekit/client-sdk-swift/issues/257. I decided to open a separate issue because the Zombie profiler (and multiple class implementations) are not relevant to my case.

hiroshihorie commented 6 months ago

This might be related to the statistics timer; maybe some kind of retain cycle is occurring. I will investigate. Thanks for the report.
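For readers unfamiliar with the pattern being suspected here, a timer-driven retain cycle typically arises when an object owns a timer whose handler captures that object strongly. The sketch below shows the usual way to avoid it with a weak capture. `StatsPoller` and `tickCount` are hypothetical names chosen for illustration; this is not the SDK's `DispatchQueueTimer` and is not known to reflect its implementation.

```swift
import Dispatch

// Hypothetical poller used only for illustration; not LiveKit SDK API.
final class StatsPoller {
    private let timer: DispatchSourceTimer
    private(set) var tickCount = 0

    init(queue: DispatchQueue, interval: DispatchTimeInterval) {
        timer = DispatchSource.makeTimerSource(queue: queue)
        timer.schedule(deadline: .now() + interval, repeating: interval)
        // Capturing `self` strongly here would form a cycle:
        // poller -> timer -> handler -> poller.
        // The weak capture lets the poller be released normally.
        timer.setEventHandler { [weak self] in
            self?.tickCount += 1
        }
        timer.resume()
    }

    deinit {
        // Stop the timer so the handler is not invoked again.
        timer.cancel()
    }
}
```

Because the handler only holds a weak reference, dropping the last strong reference to the poller runs `deinit`, which cancels the timer.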

davidzhao commented 6 months ago

Hi @maxbrunsfeld! We are big fans of your counterfeiter lib.

hiroshihorie commented 5 months ago

My guess is that it's related to DispatchQueueTimer, which I already patched in v1, or to some issue when the TrackStatistics gets swapped with the new value.

Do you read the Track.statistics property at all?

hiroshihorie commented 5 months ago

I've released v1.1.5. Can you try it out?

maxbrunsfeld commented 5 months ago

Do you read the Track.statistics property at all?

We don't read that property.

We can try out v1.1.5 in a couple of weeks. For the moment, we have downgraded to 1.0.12, as that's the version we were using before upgrading to 1.1.4.

hiroshihorie commented 1 month ago

Hello, v2.0.8 has a different lock mechanism now, so I will close this. Please re-open if the issue persists.