Closed shesek closed 1 month ago
Overall this MR seems to greatly improve the initial sync and steady state of electrs alongside a slower bitcoind.
some issues found during testing
blockchain.transaction.get
will panic on the handling thread if the transaction is not known by electrs (panic starts at https://github.com/shesek/electrs/blob/202405-mempool-sync/src/electrum/server.rs#L375 rpcthreads
parameter in bitcoind does not seem to change this. Electrs itself seems to recover. These recv disconnects happen with different rpcs (usually getrawtransaction
, but also getbestblockhash
)* some requests from electrs to bitcoind will disconnect waiting for recv, at https://github.com/shesek/electrs/blob/202405-mempool-sync/src/daemon.rs#L210 Increasing the `rpcthreads` parameter in bitcoind does not seem to change this. Electrs itself seems to recover. These recv disconnects happen with different rpcs (usually `getrawtransaction`, but also `getbestblockhash`)
While tracing this with wireshark, I observed that electrs sends a request (such as getbestblockhash), and waits for a response. After 30 seconds (and a few TCP Keep-Alive interactions), bitcoind sends a FIN,ACK without any text response. Seems to be some behaviour in bitcoind.
the electrum method
blockchain.transaction.get
will panic on the handling thread if the transaction is not known by electrs
I was unable to reproduce this, the "missing transaction" message appears to be returned as a JSONRPC error without panicking:
# Client
$ echo '{"id": 1, "method": "blockchain.transaction.get", "params": ["0000000000000000000000000000000000000000000000000000000000000000"]}' | nc 172.17.0.3 60401
{"error":"missing transaction","id":1,"jsonrpc":"2.0"}
# Server logs
TRACE RPC Request("{\"id\": 1, \"method\": \"blockchain.transaction.get\", \"params\": [\"0000000000000000000000000000000000000000000000000000000000000000\"]}\n")
WARN rpc #1 blockchain.transaction.get [String("0000000000000000000000000000000000000000000000000000000000000000")] failed: Error: missing transaction
TRACE RPC Done
# ... keeps running
Seems to be some behaviour in bitcoind.
I'm seeing that behaviour on my local regtest env too.
utACK a6c181e2f12433860a100c4b44b8dc19818f4afb
in analyzing pcap trace between local electrs and local. bitcoin rpc, we see electrs asking for getrawmempool on 3 times in 16 seconds, in a single socket connection. MR needs more analysis/testing.
2 issues
repeated requests for full mempool on one thread (correct?)
disconnects are actually coming from electrs, not bitcoind (screenshot of pcap)
2024-08-16T09:26:16.938-07:00 - INFO NetworkInfo { version: 270000, subversion: "/Satoshi:27.0.0/", relayfee: 1e-5 }
2024-08-16T09:26:16.938-07:00 - INFO BlockchainInfo { chain: "main", blocks: 857053, headers: 857053, bestblockhash: "00000000000000000003066eeec0d73da4cf50ee6bf7e31f9032145c17ad3d54", pruned: false, verificationprogress: 0.9999971, initialblockdownload: Some(false) }
2024-08-16T09:26:34.141-07:00 - INFO hash=00000000000000000003066eeec0d73da4cf50ee6bf7e31f9032145c17ad3d54 height=857053 @ 2024-08-16T16:18:23Z (2 left to index)
2024-08-16T09:27:14.773-07:00 - INFO Electrum RPC server running on 0.0.0.0:50001
2024-08-16T09:27:14.777-07:00 - INFO REST server running on 0.0.0.0:3001
2024-08-16T09:27:19.773-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.264-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.264-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.265-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.266-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.267-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:23.267-07:00 - WARN reconnecting to bitcoind: disconnected from daemon while receiving
2024-08-16T09:27:26.691-07:00 - WARN chain tip moved while updating mempool
2024-08-16T09:27:26.691-07:00 - WARN skipped failed mempool update, trying again in 5 seconds
2024-08-16T09:27:31.692-07:00 - INFO hash=000000000000000000008e575ea9a5b9bcd0591f6b749f975fd679fc8ca23e62 height=857054 @ 2024-08-16T16:26:43Z (1 left to index)
following one of the "disconnects":
electrs should be responding to the FIN,ACK and active, and reopen
repeated requests for full mempool on one thread ... we see electrs asking for
getrawmempool
on 3 times in 16 seconds, in a single socket connection
getrawmempool
is used to get the list of mempool txids (not txs), and should be sent from the main (sync loop) thread using the same socket (unless it disconnects). It should be sent at least once every 5 seconds, possibly more if additional roundtrips are needed to get a full mempool view.
The 500 error is used by bitcoin rpc to indicate that the getrawtransaction call requested a non existent tx ... Electrs should consider this case of error by inspecting the details of the json-rpc response, and not wait 5 seconds.
This should already be the case. In fact only errors explicitly constructed as an ErrorKind::Connection
would trigger the automatic reconnection behavior, other errors will bubble up. RPC errors for "tx not found" are identified and ignored, while other unexpected RPC errors terminate execution (without reconnecting).
disconnects are actually coming from electrs, not bitcoind (screenshot of pcap)
Looking into this.
electrs sits on connection for 30 seconds .. bitcoin sends FIN,ACK but electrs does not close
bitcoind appears to close the connection after a period of inactivity, which can be adjusted with the (hidden) config option rpcservertimeout
(defaults to 30 seconds).
Adjusting it upwards made the disconnection issues I was experiencing in my local env go away. I also tested with electrs and bitcoind connected over the internet (local electrs connecting to a regtest bitcoind on a vps) and was able to retain open TCP connections with no disconnections for several hours (used every ~2 minutes, with rpcservertimeout
set to 5 minutes).
I also tried investigating what's causing electrs not to close immediately in reply to bitcoind's FIN, but do not have an answer yet.
Rebased and added some more verbose logging for mempool syncing.
testing latest commit using local electrs and bitcoind, I set parameters in bitcoin.conf
rpcthreads=16
rpcservertimeout=600
rpcworkqueue=50
I have not seen any "disconnected from daemon while receiving" messages.
utACK 7a068bf0e0dc3c063b0f661bc178fd8b4c3e2290
Optimizations:
Send RPC requests in parallel over multiple TCP connections, without batching (38e73d9a2808586e202b16d619645e671817c055)
The number of parallel requests to send can be controlled with a new
--daemon-parallelism
CLI options (defaults to 4).Continuously attempt to fetch a full mempool snapshot, keeping whatever transactions we were able to get each time (6a14f40d61317b11b6b3f0eeae1199d39585dda9)
Fixups (not strictly necessary for the other changes, but touches on related areas and the previous behavior was wrong):
Don't add transactions with unknown parents to ensure mempool consistency (c60207ad6c81f1ec05982babf7941be53ddcdb26)
Make sure the chain tip doesn't move while fetching the mempool (a6a796c69e5ea935ff67434633139f0dd214476e)
This will require adjusting bitcoind's
rpcthreads
and possiblyrpcworkqueue
configuration options upwards. The defaultrpcthreads
is 4, which would be exhausted by mempool fetching alone (electrs
keeps at least one other open connection, and we probably want to allow some room for other clients too).Based on top of #76. (now merged)