lightninglabs / neutrino

Privacy-Preserving Bitcoin Light Client
MIT License

couldn't retrieve block from network error #248

Closed: LiranCohen closed this issue 2 years ago

LiranCohen commented 2 years ago

I've been using the neutrino library for a few weeks now to retrieve blocks from the bitcoin network and do some indexing.

Yesterday I started getting the error "couldn't retrieve block from network". Nothing really changed with my setup, so I'm curious what the potential causes for this error are.

Was trying to follow the code but am not sure what would cause this: https://github.com/lightninglabs/neutrino/blob/86e8ff67a0a0916cd1e0ece425836e0778d93c96/query.go#L1019

Does this happen when none of the connected peers have the block available?

Just trying to understand a little better what could cause the foundBlock to be nil and return this type of error.

It's happening at a different block each time I try, and usually well into my indexing process.

Any sort of guidance would be appreciated.

guggero commented 2 years ago

Could be that you don't have a valid connection to a peer anymore? Sometimes the full nodes block you if their connection slots are all occupied. Then you need to connect to a different peer, which might not always happen automatically, depending on your initial peers. So I guess the question is: do you see any other activity with peers? Can you retrieve filter headers, just not blocks?

LiranCohen commented 2 years ago

@guggero Thanks for the quick response. I'm not really sure as I'm currently issuing a panic on that error, but I'm working as we speak to refactor those bits and see if I can get any of that info.

When I first saw this issue happen I wasn't issuing a panic, just silently swallowing the error. I'm pretty sure it was able to continue crawling and fetching other blocks, but let me do some refactoring so that I can figure out how many peers I'm connected to and whether I can fetch any other data at that point.

For debugging purposes I'll try adding logic to print out the ConnectedCount() and maybe try to fetch some other data and see if that also errors out when I get this specific error.

LiranCohen commented 2 years ago

Thanks for your help @guggero ... I do think that maybe the peer blocked me or something of that nature. I wonder if this error could be more explicit. It's hard to tell exactly what is going on.

It seems the connection count went down by a few peers between the start and an erroring call. I might add some logging to get a count before and after the call (not sure if that will give me more info). Does Neutrino automatically find more peers if the connection count falls below some threshold?

I think I need to spend some time going through how Neutrino manages peers to better understand some of this.

I put in some retry logic to await a 200ms timeout and retry several times, but it seems that it always gets the block on the first retry.

TBH Neutrino isn't best suited for what I'm doing right now, but it's a really good implementation of the btcd wire protocol that gives me most of the functionality I need.

I'm guessing we can close this issue. Thanks again.

guggero commented 2 years ago

> Does Neutrino automatically find more peers if connection count is below a threshold of sorts?

I'm not sure how exactly it's implemented, as I'm not too familiar with the code base. But I assume it tries to always have at least one connection. There are probably a few parts of the code that could benefit from some improvements, so if you find things to fix along the way, even if it's just logging, please don't hesitate to open a PR.

I'm closing the issue for now, feel free to re-open if you think the error itself should be changed.