pusher / pusher-websocket-swift

Pusher Channels websocket library for Swift
https://pusher.com/channels
MIT License
273 stars 167 forks source link

Starscream crash #115

Closed hamchapman closed 7 years ago

hamchapman commented 7 years ago

See #113.

Summary:

Crashed: com.vluxe.starscream.websocket
0  libobjc.A.dylib                0x18a4faf3c objc_msgSend + 28
1  CoreFoundation                 0x18ba4d5ec _inputStreamCallbackFunc + 56
2  CoreFoundation                 0x18b9f8258 _signalEventSync + 212
3  CoreFoundation                 0x18ba6963c ___signalEventQueue_block_invoke + 24
4  libdispatch.dylib              0x18a939200 _dispatch_call_block_and_release + 24
5  libdispatch.dylib              0x18a9391c0 _dispatch_client_callout + 16
6  libdispatch.dylib              0x18a947444 _dispatch_queue_serial_drain + 928
7  libdispatch.dylib              0x18a93c9a8 _dispatch_queue_invoke + 652
8  libdispatch.dylib              0x18a947940 _dispatch_queue_override_invoke + 360
9  libdispatch.dylib              0x18a94938c _dispatch_root_queue_drain + 572
10 libdispatch.dylib              0x18a9490ec _dispatch_worker_thread3 + 124
11 libsystem_pthread.dylib        0x18ab422b8 _pthread_wqthread + 1288
12 libsystem_pthread.dylib        0x18ab41da4 start_wqthread + 4

Also need to rename the embedded Starscream queue, for example, to com.starscream.websocket.pusher.

zkyronex commented 7 years ago

@hamchapman Really need to know if this issue has been looked at. This bug will affect anyone's confidence to use the library in their app as a reliable communication option. I am currently looking into this as well.

My experience is it crashes from com.apple.root.default-qos.overcommit and not on the main thread. Investigating according to this lead: http://stackoverflow.com/questions/27948618/consistent-dispatch-queue-com-apple-root-default-qos-overcommit-crash

Any follow up on this would be greatly appreciated. Cheers!

hamchapman commented 7 years ago

@zkyronex have you experienced the same issue?

I've tried to recreate it but have been unable to do so and so I'm really just waiting for more information.

I'm going to make a branch that updates to the latest version of Starscream on master and also change the queue name that the Pusher library uses for the underlying Starscream dispatch queue.

zkyronex commented 7 years ago

@hamchapman I have. It happened when I moved back and forth rapidly between 2 to 3 ViewControllers which have their own Pusher instances and subscriptions. However, I noticed the crash occurred when I was not using the app directly. After that I tried to replicate the bug in both my app and the iOS example sample project but no luck.

Hope we can shine some light on this issue sooner than later so everyone can start using it confidently. Thanks.

QuimeraKoke commented 7 years ago

Hi, I tried to debug the code, and I found that Reachability class looks like https://github.com/ashleymills/Reachability.swift library, so I added a DispatchQueue.main.async inside the whenReachable and whenReachable blocks. By other hand reading the traceback and the documentation about [unowned self], so inside an asynchronous method we want to retain the self to use it as in the next lines, so without it there's no bad access to reachability variable in the block. After those changes , everythink looks work fine. I sent a pull request anyway Thanks.

hamchapman commented 7 years ago

I'm not sure I understand why adding the DispatchQueue.main.async means that you can remove [unowned self] without creating a memory leak? Doesn't the PusherConnection object reference Reachability object and the Reachability object reference the PusherConnection object?

And how did you you debug that this is related to this Starscream-related crash?

Thanks for the work - I'm just trying to understand it before merging anything!

QuimeraKoke commented 7 years ago

Hi, By adding DispatchQueue.main.async the code inside the block it will be run async and I think is not a good idea use [unowned self] in async blocks because it could loose the reference to self inside the block. I don't know if it will create a memory leak if you use the self inside a block instead of assignament into a property, I will try to meassuere it. You are right is not the solution for this one is for the #113, sorry.

ichibod commented 7 years ago

With the latest code, we're getting a lot of Starscream.swift crashes. It looks like that cleanupStream method is having a hard time with the streams being available, despite the checks. I can send you the Fabric reports if you're interested in investigating. Thanks!

hamchapman commented 7 years ago

@ichibod that would be great if you can send me the reports!

Email might be the best bet.

ichibod commented 7 years ago

@hamchapman Sent! Let me know if I can provide any other info!

eoinoconnell commented 7 years ago

@hamchapman Hey, any update on this?

hamchapman commented 7 years ago

Not really. I took a look through the crashes that @ichibod kindly sent over, but there was nothing that stood out there.

@eoinoconnell are you seeing lots of these crashes? Are there any patterns / specific circumstances that lead to one of these crashes?

eoinoconnell commented 7 years ago

Hi @hamchapman I had a look there a little more, so we had the crash previously but very infrequently. In the latest release we had a much larger number ~ 10% of user base, looking at what changed we introduced an SDK that has its own dependency on starscream, I wonder is it to do with both sharing the same queue com.vluxe.starscream.websocket

hamchapman commented 7 years ago

That definitely sounds like it's a potential culprit. Can I ask what the other SDK was that you added that has a dependency on Starscream? It might help making reproducing the issue easier!

eoinoconnell commented 7 years ago

So we added https://github.com/sjrmanning/Birdsong which looks like uses starscream 2.0.1

hamchapman commented 7 years ago

Great - thanks for the info @eoinoconnell. I'll try and find some time to test out using both SDKs simultaneously to see if I can recreate a crash. Thanks!

eoinoconnell commented 7 years ago

No problemo. Let me know if you need anything else from my end

eoinoconnell commented 7 years ago

@hamchapman I actually had to do a release so I tried changing that queue and didn't have an effect btw. Interesting metric though, fabric is reporting only 6% of the crashes are when 'App is in Focus'

hamchapman commented 7 years ago

Hmmm interesting. Sounds like it could mainly be occurring when disconnects are involved. That matches up with what some of the other commenters on this thread were saying (e.g. mentioning cleanupStream).

dtochetto commented 7 years ago

Similar here, from last updates we start to see an increased in crashes from pusher swift caused by Starscream.swift

Here two traces:

Crashed: com.vluxe.starscream.websocket
0  CoreFoundation                 0x182957404 CFHash + 352
1  CoreFoundation                 0x182953eec CFBasicHashFindBucket + 164
2  CoreFoundation                 0x182953df4 CFDictionaryGetValue + 164
3  CoreFoundation                 0x1829ca2d8 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x182a3a120 _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x1015c3e48 WebSocket.cleanupStream() -> () (Starscream.swift:732)
6  PusherSwift                    0x1015cf9ec specialized WebSocket.stream(Stream, handle : Stream.Event) -> () (Starscream.swift:713)
7  PusherSwift                    0x1015c3d88 @objc WebSocket.stream(Stream, handle : Stream.Event) -> () (Starscream.swift)

Crashed: com.apple.main-thread
0  CoreFoundation                 0x191d53404 CFHash + 352
1  CoreFoundation                 0x191d4feec CFBasicHashFindBucket + 164
2  CoreFoundation                 0x191d4fdf4 CFDictionaryGetValue + 164
3  CoreFoundation                 0x191dc62d8 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x191e36120 _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x1015b3e14 WebSocket.cleanupStream() -> () (Starscream.swift:728)
6  PusherSwift                    0x1015bba30 specialized WebSocket.initStreamsWithData(Data, Int) -> () (Starscream.swift:713)
7  PusherSwift                    0x1015b3bc4 WebSocket.createHTTPRequest() -> () (Starscream.swift:561)
8  PusherSwift                    0x10158f07c PusherConnection.connect() -> () (PusherConnection.swift)
9  PusherSwift                    0x10158f200 @objc PusherConnection.connect() -> () (PusherConnection.swift)
hamchapman commented 7 years ago

@dtochetto thanks for those! I think they confirm that the fix for this has been added to Starsream as of the 2.1.1 release (https://github.com/daltoniam/Starscream/issues/365).

This is included in the swift4 branch of pusher-websocket-swift: #149

Are you able to use the swift4 branch to confirm it is a fix or shall I backport the dependency update to a Swift 3.x branch?

ichibod commented 7 years ago

@hamchapman this would definitely be helpful for pre 4.0, as I don't think we'll be able to upgrade to 9 or Swift 4 until after launch 😢

dtochetto commented 7 years ago

@hamchapman like ichibod will be great if you could implement a backport dependency to swift 3. we are planning a swift 4 migration but not in the short time.

hamchapman commented 7 years ago

This got auto closed by the commit I made earlier, but just to be explicit: I released 4.2.1 today which includes the fix and works with Swift 3.

Let me know if any problems remain and thanks everyone for all the debug help.

artemtkachenko commented 7 years ago

@hamchapman

I have updated pusher dependency in the project: pod 'PusherSwift', :git => "https://github.com/pusher/pusher-websocket-swift", :branch => "swift-4", :commit => 'e64bf6f9b97cba41e7c50b98a6a843e5b5c16eb3'

and this issues happened again:

Crashed: com.vluxe.starscream.websocket
0  CoreFoundation                 0x19193b404 CFHash + 352
1  CoreFoundation                 0x191937eec CFBasicHashFindBucket + 164
2  CoreFoundation                 0x191937df4 CFDictionaryGetValue + 164
3  CoreFoundation                 0x1919ae2d8 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x191a1e120 _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x100a17018 (Missing)
6  PusherSwift                    0x100a24e28 (Missing)
7  PusherSwift                    0x100a16f90 (Missing)
8  CoreFoundation                 0x1919ae01c _signalEventSync + 212
9  CoreFoundation                 0x191a1e5c4 ___signalEventQueue_block_invoke + 24
10 libdispatch.dylib              0x19091a9e0 _dispatch_call_block_and_release + 24
11 libdispatch.dylib              0x19091a9a0 _dispatch_client_callout + 16
12 libdispatch.dylib              0x190928ad4 _dispatch_queue_serial_drain + 928
13 libdispatch.dylib              0x19091e2cc _dispatch_queue_invoke + 884
14 libdispatch.dylib              0x190928fa8 _dispatch_queue_override_invoke + 344
15 libdispatch.dylib              0x19092aa50 _dispatch_root_queue_drain + 540
16 libdispatch.dylib              0x19092a7d0 _dispatch_worker_thread3 + 124
17 libsystem_pthread.dylib        0x190b23100 _pthread_wqthread + 1096
18 libsystem_pthread.dylib        0x190b22cac start_wqthread + 4

and

Crashed: com.apple.main-thread
0  CoreFoundation                 0x181e2a940 CFHash + 360
1  CoreFoundation                 0x181e277e0 CFBasicHashFindBucket + 164
2  CoreFoundation                 0x181e27720 CFDictionaryGetValue + 224
3  CoreFoundation                 0x181ea08d4 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x181f1e89c _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x101a4e918 _T011PusherSwift9WebSocketC13cleanupStream33_4D4805C375144E851D839A8306929418LLyyF + 108
6  PusherSwift                    0x101a584a8 _T011PusherSwift9WebSocketC19initStreamsWithData33_4D4805C375144E851D839A8306929418LLy10Foundation0H0V_SitFTf4gXnn_n + 80
7  PusherSwift                    0x101a4e7d8 _T011PusherSwift9WebSocketC17createHTTPRequest33_4D4805C375144E851D839A8306929418LLyyF + 2536
8  PusherSwift                    0x101a28d18 _T011PusherSwift0A10ConnectionC7connectyyF + 524
9  PusherSwift                    0x101a28ea0 _T011PusherSwift0A10ConnectionC7connectyyFTo + 28
10 Foundation                     0x182958d24 __NSFireTimer + 88
11 CoreFoundation                 0x181f1099c __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 28
12 CoreFoundation                 0x181f106c0 __CFRunLoopDoTimer + 864
13 CoreFoundation                 0x181f0fec0 __CFRunLoopDoTimers + 248
14 CoreFoundation                 0x181f0daa8 __CFRunLoopRun + 1928
15 CoreFoundation                 0x181e2e2d8 CFRunLoopRunSpecific + 436
16 GraphicsServices               0x183cbff84 GSEventRunModal + 100
17 UIKit                          0x18b3db880 UIApplicationMain + 208
18 ****                       0x1010b7c70 main (AppDelegate.swift:21)
19 libdyld.dylib                  0x18195256c start + 4

Maybe it my bad, because of using :git=> '' :branch=> '' :commit=> '' ?

hamchapman commented 7 years ago

I assume having both branch and commit in your Podfile leads to Cocoapods taking the branch as the first choice.

The swift4 branch is now lagging behind master and should be removed. It doesn't have the commit with SHA e64bf6 on it. If you write the line in your Podfile as:

pod 'PusherSwift', '~> 5.0.1'

then it should work fine.

artemtkachenko commented 7 years ago

Thank you, @hamchapman

dtochetto commented 7 years ago

@hamchapman I released a build some weeks ago and I'm seeing again this crash for version 5.0.0

Crashed: com.vluxe.starscream.websocket
0  CoreFoundation                 0x1866e4620 CFHash + 360
1  CoreFoundation                 0x1866e14c0 CFBasicHashFindBucket + 164
2  CoreFoundation                 0x1866e1400 CFDictionaryGetValue + 224
3  CoreFoundation                 0x18675a8a4 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x1867d882c _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x1057c704c WebSocket.cleanupStream() -> () (Starscream.swift:743)
6  PusherSwift                    0x1057d4e28 specialized WebSocket.stream(Stream, handle :   Stream.Event) -> () (Starscream.swift:723)
7  PusherSwift                    0x1057c6f90 @objc WebSocket.stream(Stream, handle : Stream.Event) -> () (Starscream.swift)
Crashed: com.vluxe.starscream.websocket
0  CoreFoundation                 0x183bc8620 CFHash + 360
1  CoreFoundation                 0x183bc54c0 CFBasicHashFindBucket + 164
2  CoreFoundation                 0x183bc5400 CFDictionaryGetValue + 224
3  CoreFoundation                 0x183c3e8a4 _CFStreamUnscheduleFromRunLoop + 184
4  CoreFoundation                 0x183cbc82c _CFStreamSetDispatchQueue + 168
5  PusherSwift                    0x105793018 WebSocket.cleanupStream() -> () (Starscream.swift:739)
6  PusherSwift                    0x1057a0e28 specialized WebSocket.stream(Stream, handle : Stream.Event) -> () (Starscream.swift:723)
7  PusherSwift                    0x105792f90 @objc WebSocket.stream(Stream, handle : Stream.Event) -> () (Starscream.swift)
hamchapman commented 7 years ago

It does indeed look like the problem remains: https://github.com/daltoniam/Starscream/issues/422