Bouke / HAP

Swift implementation of the Homekit Accessory Protocol
https://boukehaarsma.nl/HAP/
MIT License
364 stars 50 forks source link

Workaround for thread sync violation in disconnectPairings #83

Closed gbrooker closed 5 years ago

gbrooker commented 5 years ago

Workaround for thread sync violation in disconnectPairings.

In disconnectParings, close() is called from a synchronised block on the channelsSyncQueue, which propagates down to a call to channelInactive(). channelInactive then attempts to run a block synchronously on the same queue, which causes a hard crash.

During testing I found that channelInactive() was called multiple times from other NIO queues, but was only called from the channelsSyncQueue when removing pairings.

I don't know enough about NIO to know if the call to close() is required to be run on the channelsSyncQueue. I suspect that it should not, and therefore the correct fix would be to make that call on a different queue.

The workaround uses an undocumented trick to determine whether channelInactive is being called on the channelsSyncQueue and if it is, then the block is executed directly. If it is being called from a different queue then the block is executed synchronously as originally intended.

The trick works in macOS and iOS, but I do not know if it works with the linux implementation of GCD.

gbrooker commented 5 years ago

This is indeed not building on linux. I do not know the equivalent call for determining the queue label on linux's GCD implementation. Nevertheless, potentially this fix is not needed if the root cause, calling close() on the channelsSyncQueue is corrected.

gbrooker commented 5 years ago

I've changed the test for a queue label which didn't work under linux to testing a queue attribute. The queue attribute mechanism is pretty flawed in GCD; it checks only pointer addresses which is highly dangerous, but it appears to work for this issue.

@Bouke Could you take a look at disconnectPairings to see if it is feasible to call close() from another NIO queue than channelsSyncQueue ? If so, this workaround is no longer required.

Bouke commented 5 years ago

You could try something like this:

            let channel = channels[channelIdentifier]!
            channel.pipeline.eventLoop.execute {
                channel.close(mode: CloseMode.all)
            }

I haven't tried yet, as I'm not sure what triggers this condition. Reading from the dictionary needs to happen on the synchronisation queue, but the actual closing probably not.

gbrooker commented 5 years ago

I simply moved the call to close() out of the synchronous block. This has resolved the crash on removing pairings, and does not appear to cause any threading issues.

This is no longer a workaround but a terminating fix for the problem.