In the crash report of an application using PocketSocket, the stack trace
points to line 341, which is
if(_inputStream.streamStatus == NSStreamStatusNotOpen) {
It is strange that this line would call CFStreamSetDispatchQueue, so the line
number is probably incorrect. The crash can be reproduced by foregrounding
and backgrounding the app quickly (within ~1 second) a few times. When
reproduced, it happened on line 338, which seems more correct:
CFWriteStreamSetDispatchQueue((__bridge CFWriteStreamRef)_outputStream, _workQueue);
Disassembling _CFStreamScheduleWithRunLoop showed the instruction
0x1fbc01e58 <+60>: ldr x28, [x19, #0x30], and the exception was
Thread 35: EXC_BAD_ACCESS (code=1, address=0x30). The instruction loads
from address x19 + 0x30 and the faulting address is exactly 0x30, so the
x19 register must be zero. x19 is set from x0 a few instructions above,
and x0 holds the first argument (according to the "Procedure Call Standard
for the ARM 64-bit Architecture"). This confirms the problem:
CFWriteStreamSetDispatchQueue crashes when the stream parameter is nil.
So when the socket is disconnected in the middle of -connect, _inputStream
and/or _outputStream become nil and cause the crash.
The fix is to check whether the streams are still valid and, if not, stop.
To avoid a race condition in which the streams become nil right after
the check, they are retained strongly for the rest of the -connect
method.
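The check-and-retain pattern described above could look roughly like the following. This is a minimal sketch, not the actual patch: the SketchSocket class, its initializer, the queue label, and the BOOL return value of -connect are all hypothetical and exist only to make the example self-contained. Only the ivar names _inputStream, _outputStream and _workQueue and the CFStream calls come from the description.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the socket class; only the stream handling is shown.
@interface SketchSocket : NSObject
- (instancetype)initWithInputStream:(NSInputStream *)inputStream
                       outputStream:(NSOutputStream *)outputStream;
- (BOOL)connect;  // assumption: returns NO if the streams were already gone
@end

@implementation SketchSocket {
    NSInputStream *_inputStream;    // may be set to nil by a concurrent disconnect
    NSOutputStream *_outputStream;  // may be set to nil by a concurrent disconnect
    dispatch_queue_t _workQueue;
}

- (instancetype)initWithInputStream:(NSInputStream *)inputStream
                       outputStream:(NSOutputStream *)outputStream {
    if ((self = [super init])) {
        _inputStream = inputStream;
        _outputStream = outputStream;
        _workQueue = dispatch_queue_create("sketch.work", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (BOOL)connect {
    // Under ARC these locals are strong references: the stream objects stay
    // alive until the method returns, even if the ivars are set to nil meanwhile.
    NSInputStream *inputStream = _inputStream;
    NSOutputStream *outputStream = _outputStream;

    // Stop if a disconnect already released either stream.
    if (inputStream == nil || outputStream == nil) {
        return NO;
    }

    // From here on, use only the locals, never the ivars.
    CFReadStreamSetDispatchQueue((__bridge CFReadStreamRef)inputStream, _workQueue);
    CFWriteStreamSetDispatchQueue((__bridge CFWriteStreamRef)outputStream, _workQueue);

    if (inputStream.streamStatus == NSStreamStatusNotOpen) {
        [inputStream open];
    }
    if (outputStream.streamStatus == NSStreamStatusNotOpen) {
        [outputStream open];
    }
    return YES;
}

@end
```

The point of the local variables is that the nil check and the later CFStream calls see the same objects, so a disconnect that clears the ivars in between can no longer hand nil to CFWriteStreamSetDispatchQueue.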
A unit test to reproduce the crash is possible (set up a server socket,
then connect to and disconnect from it without a delay ~500 times in a
loop), but it is not included here because:
- it is very unreliable and ugly: it reproduced this crash in ~5% of runs
  and required as much CPU usage as possible to slow down the socket
  processing and disturb the concurrency;
- it caused at least three other types of crashes during the stress
  test;
- it required changes in the -connect method to add artificial delays
  to force the crash.