emqx / CocoaMQTT

MQTT 5.0 client library for iOS and macOS written in Swift
https://www.emqx.com/en
Other
1.57k stars 411 forks source link

Fixes data race crash when write result is reported. #515

Closed przemyslaw-szurmak closed 1 year ago

przemyslaw-szurmak commented 1 year ago

Even though queue is assigned to FoundationConnection it's never used for write results like it's used for reads.

In app I work on we've observed quite significant number of crashes coming from WriteItem.scheduledWrites.remove() which is builtin method of Set.

Screenshot 2023-02-01 at 16 59 11

After bit of analysis (and usage of Thread Sanitiser) I discovered that these crashes were caused by concurrent modification of this Set inside of CocoaMQTTWebSocket.write() method.

public func write(_ data: Data, withTimeout timeout: TimeInterval, tag: Int) {
        internalQueue.async {
            let newWrite = WriteItem(tag: tag, timeout: (timeout > 0.0) ? .now() + timeout : .distantFuture)
            self.scheduledWrites.insert(newWrite) // MODIFICATION
            self.checkScheduledWrites()
            self.connection?.write(data: data) { possibleError in
                if let error = possibleError {
                    self.closeConnection(withError: error)
                } else {
                    guard self.scheduledWrites.remove(newWrite) != nil else { return } // MODIFICATION
                    guard let delegate = self.delegate else { return }
                    delegate.socket(self, didWriteDataWithTag: tag)
                }
            }
        }
    }

First part, where newWrite is inserted to Set is executed on internalQueue which is fine. However, when connection.write gives results back it does that on URLSession internal queue -> it is possible that remove will be called at the same time as insert if there's multiple writes scheduled.

important I've never tested this case but I have a feeling that if someone won't use FoundationConnection but instead StarscreamConnection it can happen as well because provided queue isn't used for write too.

leeway1208 commented 1 year ago

Hi - Great idea! Lets try.