0xarvin opened 2 years ago
I don't think this was a known issue, and I'm guessing it may be related to the specific server you are running, because I've never seen this happen with electrs.
Looking at the code it's also very weird that this is happening: we do have code to re-order the replies from the server for batch requests:
Either there's an issue there, or the server is replying with the wrong request id. Can you try running your tests with trace log level for this crate and then attaching the log here? From that log we should be able to see whether the server's replies are correct.
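The re-ordering the maintainer refers to can be sketched roughly as follows. This is illustrative only, not the crate's actual code; `reorder_replies` and the `String` payloads are hypothetical stand-ins for the real reply types:

```rust
use std::collections::HashMap;

// Illustrative sketch: re-order batch replies by their JSON-RPC request id.
// Each reply is keyed by the id it claims to answer; we then pull them out
// in the order the ids were originally sent. Returns None if any id is missing.
fn reorder_replies(request_ids: &[u64], mut replies: HashMap<u64, String>) -> Option<Vec<String>> {
    request_ids.iter().map(|id| replies.remove(id)).collect()
}

fn main() {
    let ids = [9u64, 10, 11];
    // Replies arrived in arbitrary order:
    let mut replies = HashMap::new();
    replies.insert(11, "tx_11".to_string());
    replies.insert(9, "tx_9".to_string());
    replies.insert(10, "tx_10".to_string());
    let ordered = reorder_replies(&ids, replies).unwrap();
    assert_eq!(ordered, vec!["tx_9".to_string(), "tx_10".to_string(), "tx_11".to_string()]);
    println!("{:?}", ordered);
}
```

Note that this only helps if the server attaches the correct payload to each id, which is exactly what is in question in this thread.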
That's what I'm starting to think, but here's the trace. I replaced the electrum server but let me know if I should include that.
https://gist.github.com/0xarvin/9baee245a45e6e409d9be65e4a447101
Which electrum server software are you using where you see this issue?
@notmandatory We spun up a set of signet nodes:
url: ssl://electrum.nodes.wallet.build:51002 electrum server: fulcrum
+1 for this issue. It happens with many servers, not just one; most servers I tested exhibit this bug, including:
ssl://fulcrum.grey.pw:51002
ssl://electrum.jochen-hoenicke.de:50099
tcp://v7gtzf7nua6hdmb2wtqaqioqmesdb4xrlly4zwr7bvayxv2bpg665pqd.onion:50001
tcp://venmrle3xuwkgkd42wg7f735l6cghst3sdfa3w3ryib2rochfhld6lid.onion:50001
ssl://blackie.c3-soft.com:57002
ssl://e.keff.org:50002
tcp://udfpzbte2hommnvag5f3qlouqkhvp3xybhlus2yvfeqdwlhjroe4bbyd.onion:60001
tcp://kittycp2gatrqhlwpmbczk5rblw62enrpo2rzwtkfrrr27hq435d4vid.onion:50001
ssl://horsey.cryptocowboys.net:50002
I can confirm the cause does NOT originate within this library.
The culprit electrum servers are not respecting JSON-RPC request IDs correctly. If I ask for a series of transactions with a given set of request IDs, the server will return responses for those request IDs, but the actual response data may not be mapped to the correct request IDs. E.g. the request with ID 9 asks for TX abcd..., but the response with ID 9 instead contains TX beef..., which was requested elsewhere in the batch under a different request ID.
I imagine this is a bug somewhere in the upstream servers' async request handling code. We could easily fix this inside this library by sorting the response TXs by their true TXID. However, if the servers aren't respecting JSON-RPC request IDs, it's hard to know what other properties of JSON-RPC they might be breaking. I didn't test whether other batch methods suffer from the same problem as batch_transaction_get. I think more testing is required here.
The fact that so many servers are apparently broken in this respect is concerning and suggests that maybe a client-side hack is needed, otherwise callers will inevitably reopen issues like this in the future.
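The client-side workaround suggested above (matching responses by their true TXID instead of trusting reply order) could look roughly like this. This is a sketch under stated assumptions, not the library's code: `txid_of` is a stub standing in for actually hashing the raw transaction, and the `String` types are placeholders:

```rust
use std::collections::HashMap;

// Stand-in for computing a transaction's real txid (double-SHA256 of the
// raw tx in practice); here it is just a deterministic toy transformation.
fn txid_of(raw_tx: &str) -> String {
    raw_tx.to_uppercase()
}

// Ignore the order (and ids) of the server's replies entirely: re-key each
// returned transaction by its computed txid, then emit them in the order
// of the txids we originally requested. Returns None if any txid is missing.
fn match_by_txid(requested: &[String], replies: Vec<String>) -> Option<Vec<String>> {
    let mut by_id: HashMap<String, String> =
        replies.into_iter().map(|tx| (txid_of(&tx), tx)).collect();
    requested.iter().map(|id| by_id.remove(id)).collect()
}

fn main() {
    let requested = vec!["TX_A".to_string(), "TX_B".to_string()];
    // Server returned the two transactions swapped:
    let replies = vec!["tx_b".to_string(), "tx_a".to_string()];
    let ordered = match_by_txid(&requested, replies).unwrap();
    assert_eq!(ordered, vec!["tx_a".to_string(), "tx_b".to_string()]);
    println!("{:?}", ordered);
}
```

This approach only works for methods whose responses are self-identifying (like raw transactions); it cannot rescue methods such as batch_script_get_history, where the reply carries no fingerprint of the request.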
I can confirm that this problem is not limited to batched transaction fetching; it also affects batch_script_get_history. If I had to guess, I suspect afflicted servers just don't respect JSON-RPC request IDs for any method.
Hi!
I also stumbled upon this problem, and after some tests I think I can confirm that this is not a server issue but must happen somewhere on the Rust side of things. Unfortunately my knowledge of Rust is still too limited to spot the error. 🙈
Here is what I tested:
I wrote a simple test program to reproduce this error. It runs in a loop, and every now and then the results are out of order:
extern crate electrum_client;

use bitcoin::block::Header;
use electrum_client::{Client, ElectrumApi};

fn main() {
    // random block heights and the expected timestamps of those blocks
    let heights: Vec<u32> = vec![1, 4, 8, 12, 222, 6666];
    let expected_times: [u32; 6] = [
        1231469665, // block 1
        1231470988, // block 4
        1231472743, // block 8
        1231474888, // block 12
        1231770653, // block 222
        1236456633, // block 6666
    ];
    loop {
        let client = Client::new("tcp://bitcoin.aranguren.org:50001").unwrap();
        let headers: Vec<Header> = client.batch_block_header(heights.clone()).unwrap();
        let mut err_counter = 0;
        for (i, (header, expected)) in headers.iter().zip(expected_times.iter()).enumerate() {
            if header.time != *expected {
                err_counter += 1;
                println!("error {}", i);
            }
        }
        println!("Result: {} headers were not at expected index in result vector", err_counter);
        println!("==========================================");
    }
}
I also added some debug println! calls here in this block: https://github.com/bitcoindevkit/rust-electrum-client/blob/0b97659b383554357fa55df1e9770e61b1d84ba4/src/raw_client.rs#L66-L68
for x in resp {
println!("************************");
println!("{:#?}", x);
answer.push(serde_json::from_value(x)?);
}
Additionally, I started Wireshark to monitor the traffic to the Electrum server (that's why I chose a clear-text server).
Finally, I set a breakpoint at the end of my loop so that I could step through it and identify the corresponding TCP connection in my Wireshark window. As soon as I saw the error in the console output, I looked into Wireshark and displayed the last TCP connection with "Follow TCP stream".
And here is the result:
I cannot explain why, but we can definitely see that the server returned the responses with their correct corresponding JSON-RPC ids, yet by this line, resp ended up in a different order: https://github.com/bitcoindevkit/rust-electrum-client/blob/0b97659b383554357fa55df1e9770e61b1d84ba4/src/raw_client.rs#L63
Any ideas @afilini ? 🙂 (my rust knowledge is still too limited to spot the issue here)
As a short side note: this bug also impacts BDK and leads to incorrect (or at least inconsistent) data being synced into the wallet database.
In detail: during sync, the .zip(...) call here in this line also assumes that the results are in the same order as the requested block heights. If they are not, the result of wallet.list_transactions(...) will contain transactions whose BlockTime struct confirmation_time holds inconsistent data: the height does not match the timestamp (which belongs to a different block).
pub struct BlockTime {
/// confirmation block height
pub height: u32,
/// confirmation block timestamp
pub timestamp: u64,
}
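The mismatch described above can be illustrated with a small self-contained sketch (the `pair_up` helper is hypothetical; it only models what a positional .zip(...) over heights and out-of-order headers does):

```rust
// Simplified model of BDK's BlockTime confirmation data.
struct BlockTime {
    height: u32,
    timestamp: u64,
}

// Model of a positional zip over requested heights and returned timestamps:
// it trusts that the i-th timestamp belongs to the i-th height.
fn pair_up(heights: &[u32], times: &[u64]) -> Vec<BlockTime> {
    heights
        .iter()
        .zip(times.iter())
        .map(|(&height, &timestamp)| BlockTime { height, timestamp })
        .collect()
}

fn main() {
    // Correct reply order would be [1231469665, 1231470988, 1231472743]
    // (blocks 1, 4, 8), but suppose the server swapped the last two replies:
    let pairs = pair_up(&[1, 4, 8], &[1231469665, 1231472743, 1231470988]);
    // Height 4 now silently carries block 8's timestamp:
    assert_eq!(pairs[1].height, 4);
    assert_eq!(pairs[1].timestamp, 1231472743);
    println!("height {} paired with timestamp {}", pairs[1].height, pairs[1].timestamp);
}
```

Nothing fails at this point; the inconsistent pair only surfaces later when the wallet data is inspected.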
@johnzweng can you try to reproduce this with the latest 1.0.0-beta.2 release? The syncing has seen major changes, and if it's still broken we'll likely fix it for the 1.0 version first.
Via BDK, we're pulling transactions for a Wallet through a signet electrum instance. Recently we've noticed that the order in which we call batch_transaction_get_raw isn't being respected by electrum, and the transactions returned by the client end up out of order as well, since the reader takes the top result if it's unable to grab hold of the lock on the reader.
This doesn't affect production calls in BDK, only tests, as there's a debug assert here. I was wondering if there's anything I'm missing or if this is a known issue.
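The "takes the top result" behavior mentioned above can be modeled with a toy example. This is not the crate's actual code, just a deterministic illustration of how a caller that pops whatever reply sits on top of a shared buffer can receive a reply meant for a different request, even though the ids on the wire were correct:

```rust
use std::collections::VecDeque;

fn main() {
    // Replies arrived (with correct ids) in the order 2, then 1.
    let mut buffer: VecDeque<(u64, &str)> = VecDeque::from([(2, "reply_b"), (1, "reply_a")]);

    // A caller waiting for request id 1 simply takes the top entry
    // instead of searching for its own id:
    let (id, payload) = buffer.pop_front().unwrap();

    // It now holds the payload that answers request id 2.
    assert_eq!(id, 2);
    assert_eq!(payload, "reply_b");
    println!("caller for id 1 received the payload of id {}: {}", id, payload);
}
```

If this is roughly what happens, the fix belongs on the client side: replies should always be matched to waiters by id, never by arrival position.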
Wallet Descriptor for example:
wsh(sortedmulti(2,[e5cd860e/84'/1'/0']tpubDFVZtuMvX7n6DCWbNAgq361KiJdVFBy4D1sZmdafF1zDNW6dv79MwbK6JBRxGV42BS4STayPsG7Prn3QT7LQwzfaBhE8YhgCs4VzhkYVWTg/0/*,[73ee041b/84'/1'/0']tpubDEPCQp4KHW3sS5o18QbknuC6dtQJPFgWtS6e8wm6ewnc2B1vucCnKf9uf9gCddHeJoK5d5hgJv5ji5Q8VnpT4h9ydsi9FE4t8F2GoPrFPYa/0/*,[08a73043/84'/1'/0']tpubDEZ565bfePfqSVkSfrLaR7g5nruLDWxJFKkJvKWUpMEQBsS9p2f5um5uTwgmKWuHbyoiythmEvDRSgdEoYBgLXfosBsjMBTp1jev9wfi2q9/0/*
Network:
Signet