blockset-corp / walletkit

MIT License
80 stars 46 forks source link

Fix crash when BRBitcoinPeer disconnects fast #410

Open cryptodev100 opened 2 years ago

cryptodev100 commented 2 years ago

Dear Blockset maintainers,

I was playing around with the great functionality of walletkit, but then I stumbled upon a strange crash. The crash happened especially often when trying to connect to a BitcoinPeer in iPhone-airplane-mode, but it also sometimes happened when my iPhone was not in airplane-mode. My investigations made it clear that I was dealing with a race-condition and a potential use-after-free.

Therefore, this PR fixes a suspected race-condition when a BRBitcoinPeer disconnects within a very short time. I don't have a minimum reproducible sample, but I will try to explain how I discovered this crash:

It starts with connecting to BitCoin-peers in peer-2-peer-only-mode, when we reach btcPeerConnect within BRBitcoinPeer.c. btcPeerConnect spawns a new thread that executes _peerThreadRoutine. _peerThreadRoutine tries to establish a socket, when it fails to establish a socket, then it will eventually call _peerDisconnected and exit the previously created thread.

An "info-struct" gets passed to _peerDisconnected. _peerDisconnected frees members of the info-struct and does some cleanup-stuff. However, I suspect that _peerDisconnected might be called twice for the same info-struct. Let me show you the two callsites of _peerDisconnected. The first callsite is the following code-snippet, which has a suspicious locking because manager->lock gets unlocked and then a few lines later locked again within _peerDisconnected.

                if (btcPeerConnectStatus(info->peer) == BRPeerStatusDisconnected) {
                    pthread_mutex_unlock(&manager->lock);
                    _peerDisconnected(info, ENOTCONN);
                    pthread_mutex_lock(&manager->lock);
                    manager->peerThreadCount--;
                }

The second callsite is a threadCleanup-function of _peerThreadRoutine, as shown in the following code-snippet:

static void *_peerThreadRoutine(void *arg)
{
    BRBitcoinPeerContext *ctx = arg;

    pthread_cleanup_push(ctx->threadCleanup, ctx->info);

    // Do stuff, attempt to establish socket...

    if (ctx->disconnected) ctx->disconnected(ctx->info, error);
    pthread_cleanup_pop(1); // -> seems to invoke _peerDisconnected via the function-pointer ctx->threadCleanup
    return NULL;
}

Now my suspection is that those callsites are racing against each other. In particular, I observed that _peerDisconnected got invoked with a garbage-info-struct that led to segmentation-faults within btcPeerFree.

Can you confirm that there are troubles with the multithreaded cleanup of BRBitcoinPeers?

cryptodev100 commented 2 years ago

This PR might be related to https://github.com/breadwallet/breadwallet-core/pull/295, but https://github.com/breadwallet/breadwallet-core/pull/295 is already outdated and I only tested with new code.