I was playing around with the great functionality of walletkit, but then I stumbled upon a strange crash.
The crash happened especially often when trying to connect to a BitcoinPeer in iPhone-airplane-mode, but it also sometimes happened when my iPhone was not in airplane-mode.
My investigations made it clear that I was dealing with a race-condition and a potential use-after-free.
Therefore, this PR fixes a suspected race-condition when a BRBitcoinPeer disconnects within a very short time.
I don't have a minimum reproducible sample, but I will try to explain how I discovered this crash:
It starts with connecting to BitCoin-peers in peer-2-peer-only-mode, when we reach btcPeerConnect within BRBitcoinPeer.c.
btcPeerConnect spawns a new thread that executes _peerThreadRoutine.
_peerThreadRoutine tries to establish a socket, when it fails to establish a socket, then it will eventually call _peerDisconnected and exit the previously created thread.
An "info-struct" gets passed to _peerDisconnected.
_peerDisconnected frees members of the info-struct and does some cleanup-stuff.
However, I suspect that _peerDisconnected might be called twice for the same info-struct.
Let me show you the two callsites of _peerDisconnected.
The first callsite is the following code-snippet, which has a suspicious locking because manager->lock gets unlocked and then a few lines later locked again within _peerDisconnected.
The second callsite is a threadCleanup-function of _peerThreadRoutine, as shown in the following code-snippet:
static void *_peerThreadRoutine(void *arg)
{
BRBitcoinPeerContext *ctx = arg;
pthread_cleanup_push(ctx->threadCleanup, ctx->info);
// Do stuff, attempt to establish socket...
if (ctx->disconnected) ctx->disconnected(ctx->info, error);
pthread_cleanup_pop(1); // -> seems to invoke _peerDisconnected via the function-pointer ctx->threadCleanup
return NULL;
}
Now my suspection is that those callsites are racing against each other.
In particular, I observed that _peerDisconnected got invoked with a garbage-info-struct that led to segmentation-faults within btcPeerFree.
Can you confirm that there are troubles with the multithreaded cleanup of BRBitcoinPeers?
Dear Blockset maintainers,
I was playing around with the great functionality of walletkit, but then I stumbled upon a strange crash. The crash happened especially often when trying to connect to a BitcoinPeer in iPhone-airplane-mode, but it also sometimes happened when my iPhone was not in airplane-mode. My investigations made it clear that I was dealing with a race-condition and a potential use-after-free.
Therefore, this PR fixes a suspected race-condition when a BRBitcoinPeer disconnects within a very short time. I don't have a minimum reproducible sample, but I will try to explain how I discovered this crash:
It starts with connecting to BitCoin-peers in peer-2-peer-only-mode, when we reach
btcPeerConnect
withinBRBitcoinPeer.c
.btcPeerConnect
spawns a new thread that executes_peerThreadRoutine
._peerThreadRoutine
tries to establish a socket, when it fails to establish a socket, then it will eventually call_peerDisconnected
and exit the previously created thread.An "info-struct" gets passed to
_peerDisconnected
._peerDisconnected
frees members of the info-struct and does some cleanup-stuff. However, I suspect that_peerDisconnected
might be called twice for the same info-struct. Let me show you the two callsites of_peerDisconnected
. The first callsite is the following code-snippet, which has a suspicious locking becausemanager->lock
gets unlocked and then a few lines later locked again within_peerDisconnected
.The second callsite is a threadCleanup-function of
_peerThreadRoutine
, as shown in the following code-snippet:Now my suspection is that those callsites are racing against each other. In particular, I observed that
_peerDisconnected
got invoked with a garbage-info-struct that led to segmentation-faults withinbtcPeerFree
.Can you confirm that there are troubles with the multithreaded cleanup of BRBitcoinPeers?