Add a call to wait for the node to be fully functional.

thomaseizinger commented 5 years ago

FYI: We have a bitcoin-rpc-client that can already do this. Maybe you can reuse it :)

https://github.com/coblox/bitcoinrpc-rust-client/blob/760b3c89200945dc11e475f73ad400eb67d48263/src/bitcoincore.rs#L351-L366

The problem is that you cannot determine with a single call, whether the node is fully functional because different parts of the node are ready at different times. That is why we ended up retrying a each call for a given number of times until the specific error code (-32) is no longer returned.

stevenroose commented 5 years ago

That's useful. I think we could do something similar.

stevenroose commented 5 years ago

There's two approaches here:

like @thomaseizinger's lib: handle the "not ready" error code in every call
have a wait_until_ready(&self) method for this purpose and an Error::NotReady case to return in case it's not yet ready

I think personally the latter gives more freedom to the users. But it's a bit more cumbersome as it might require a few lines of extra boilerplate.

dpc commented 5 years ago

There's two approaches here:
* like @thomaseizinger's lib: handle the "not ready" error code in every call

* have a `wait_until_ready(&self)` method for this purpose and an `Error::NotReady` case to return in case it's not yet ready
I think personally the latter gives more freedom to the users. But it's a bit more cumbersome as it might require a few lines of extra boilerplate.

I think the wait_until_ready is much better.

In my experience node quite often will respond with just empty string, example when it's under too heavy load. Or the node can be temporarily turned off for maintenance/upgrade. Which in practice means that whatever system is build on top of the node JsonRpc API, should handle all sort of failures in a higher level layer.

So that wait_until_ready is just advisory, juts to help eg. testing suites and similar easily block on starting conditions, and there's no point trying to handle failures in each call, as the practical software should just handle rpc failures on higher level.

thomaseizinger commented 5 years ago

I think the wait_until_ready is much better.

I have to disagree.

If you take a closer look on how we implemented it, you can see that it doesn't actually introduce a performance penalty. Immediately successful calls will return the result just as if there was no error handling.

We initially tried to go down the wait_until_ready route but figured out that it doesn't work very well. The reason is that just because one specific RPC call works, doesn't mean all will work.

getnewaddress might return you a new address but getbestblockhash might return you error -32, which means rpc not ready yet. I am not familiar with how the RPC calls are implemented in bitcoind but I suspect that it depends on different subsystems whether or not you can call a specific RPC method.

That is why we ended up handling only the -32 errors directly for each call and retrying for a (configurable) number of times. All other errors are passed through to the caller and even the -32 one is passed on if the number of retries was not enough.

The result is, if you take the "callers should handle errors" (which I totally agree with) to the "next level", you end up at: what will be the error handling strategy employed by the caller? If it is not yet ready, chances are you will want to try again. Luckily, the RPC interface of bitcoind exposes this error case explicitly, which allows to solve this problem on a generic/framework/library level. (Similar to redirects in an HTTP client.)

Just letting it bounce up to the caller only has downsides IMO. If you don't like it for production, just turn it off through the configuration and during testing, it just works :)

dpc commented 5 years ago

it doesn't actually introduce a performance penalty.

It has nothing to do with performance.

The result is, if you take the "callers should handle errors" (which I totally agree with) to the "next level", you end up at: what will be the error handling strategy employed by the caller? If it is not yet ready, chances are you will want to try again.

The error handling strategy is whatever is necessary. Chances are that it is not going to be retrying, and then what? My thread might have something better to do, or I eg. I might be doing software implemented round-robin over a pool of redundant bitcoin core nodes.

The rule of thumb for software composability is: it's easier to add the missing bits, than remove the unnecessary ones.

With retyring, now you have to document which calls are going to be retried, and which not. Under which conditions exactly. How much retries, how long is the delay, is there exponential backoff implemented and so on. Users might overlook it, or get something wrong, or it might just not be what they need.

If anyone wants retrying, they can always write a simple adapter that adds retrying on top of this library with no downsides.

thomaseizinger commented 5 years ago

I'd argue that it is a very common problem and that a library which sort of claims to be the "official" bitcoin-rpc client for the Rust community should provide a solution for it.

The rule of thumb for software composability is: it's easier to add the missing bits, than remove the unnecessary ones. If anyone wants retrying, they can always write a simple adapter that adds retrying on top of this library with no downsides.

I agree with both statements. Maybe the conclusion is that this library should just provide such an adapter?

At the moment, the implementation doesn't expose a trait for the possible RPC calls. If we would introduce one, providing an adapter should be straight forward and users could compose things together as they need it.

dpc commented 5 years ago

I am all :+1: for an adapter in the library itself.

stevenroose commented 5 years ago

Hmm, one way I see this adaptor be possible is to define a trait with the RPC methods and with call. All methods could have default implementations using call and call has to be implemented by the implementer.

Then Client can implement this trait by fulfilling the call method. Another AutoRetryClient or something can either (1) internally keep a Client, use it's call method and re-call it when it receives the "waiting for startup" return code, or (2) just reimplement call and do error handling there.

This would require the user to import both the trait and the Client (or alternative) type in order to use the lib.

Thought?

dpc commented 5 years ago

There's nothing that is 100% needed. Anyone can just add a struct wrapping the Client internally and adding retries, and submit a PR (or not, and keep it application-specific, along with the details of retry implementation).

There could be a trait for interoperability, but I am not sure if anyone really needs it. Just because RetryingClient and standard Client do have same calls, doesn't mean it actually make sense to have them generalized over and swappable. After all they do behave differently, so if your system expects internal retries, you're not going to swap to Rpc impl that does not retry or the other way around.

In my code I usually define my own trait for stuff like Rpc anyway, as I am going to use only a subset of all Rpc calls. And at this layer, I could optionally, generalize over RetryingClient and Client if I wanted. Though again, I don't think it is useful as retrying vs not retrying is a big part of API. I usually use this trait for testing, so I can swap the IO with a fake one to drive the logic for testing. But it could also be used for supporting other cryptocurrency nodes etc.

thomaseizinger commented 5 years ago

There could be a trait for interoperability, but I am not sure if anyone really needs it. Just because RetryingClient and standard Client do have same calls, doesn't mean it actually make sense to have them generalized over and swappable. After all they do behave differently, so if your system expects internal retries, you're not going to swap to Rpc impl that does not retry or the other way around.

I think the value would be more in what you mentioned in the 2nd paragraph: Using the trait to create test doubles. If you have got your own trait, it is unnecessary but it is cumbersome if you have to define one even though you wouldn't need it otherwise.

All methods could have default implementations using call and call has to be implemented by the implementer.

I like this approach.

There's nothing that is 100% needed.

We need it :D We haven't started the migration from our client to this one yet, but once that is the case, we need this functionality. Having a trait where we only need to implement the call method would make that very easy :)

This would require the user to import both the trait and the Client (or alternative) type in order to use the lib.

I think most crates in the ecosystem solve problems like these with a custom prelude module.

dpc commented 5 years ago

If you have got your own trait, it is unnecessary but it is cumbersome if you have to define one even though you wouldn't need it otherwise.

You should define the smallest possible interface you need and not throw everything and kitchen-sink into one common trait. rust-bitcoincore is not in any sense an interoperability library. It is just a simple wrapper over all possible BitcoinCore JsonRPC. It is neither general enough, nor minimal to have any useful trait.

There is like 0% chance that whatever you're building is going to use all these call. So why would you want to use such trait as API for testing? Just to confuse yourself and everyone in which calls really have to be mocked?

If your app/component/whatever only calls get_block and get_transaction, than you want to create a trait with only these two methods. These serves as both code-as documentation, and architectural constraint, reminding you every time when you modify the dependency on the node interface. And it saves you time if you have to worry only about small subset of all possible Rpc calls.

thomaseizinger commented 5 years ago

You should define the smallest possible interface you need and not throw everything and kitchen-sink into one common trait.

That is the ideal scenario, I agree but I would say that going for the ideal scenario is not always worth it.

So why would you want to use such trait as API for testing?

To make my life simpler and having to care about less stuff like needing to define my own trait before I can do any kind of mocking.

In my opinion a library should not constrain its users in the way it can be used as long as the result is still correct. It should be open to as many usecases as possible and I'd say it is quite common to represent an interface to any kind of thing in an adapter library with whatever abstract construct the language provides, like traits or interfaces.

What would be the downside of including such a trait in the this library?

dpc commented 5 years ago

That is the ideal scenario, I agree but I would say that going for the ideal scenario is not always worth it.

What's not worth it is maintaining ill-conceived traits in common libraries. It's just another API surface to confuse users and additional chore for the maintainers. All in the name of some hypothetical convenience for hypothetical (and lazy) user that can't do something right for their own good.

First, the idea of retrying inside this library has bunch of problems, and just enables people to write buggy software that does not have a reliable handling of errors. Because ill-conceived convenience.

Then, a trait is consider, which has problems, and just enables people (hypotetically) to do their abstractions wrong. Again, because ill-conceived convenience.

Then maintainers will get tired of fixing typos and documentation in 3 places, so someone will add procedural macros "to fix it". Because ill-conceived convenience.

From a small library with tiny and well defined purpose, we land with over-complicated monster, that is hard to maintain, review and confusing for users.

I really have no time for this. You're free to implement it and whatever other ideas you'd like in a library wrapping this library, or other owners of this crate can go along with it if so they like. As long as I don't have to do anything, it's fine with me. :)

Now, to stick to the subject of the issue: what would be useful is to have a call that checks if the node is ready in a reliable way, so users don't have to google around to figure out how to do it.

stevenroose commented 5 years ago

Take a look here guys: https://github.com/rust-bitcoin/rust-bitcoincore-rpc/pull/15

stevenroose commented 5 years ago

Something like this should be sufficient to have an auto-retry client: https://github.com/stevenroose/rust-bitcoincore-rpc/blob/example-retry/client/examples/retry_client.rs

D4nte commented 5 years ago

Now, to stick to the subject of the issue: what would be useful is to have a call that checks if the node is ready in a reliable way, so users don't have to google around to figure out how to do it.

Sounds great!

stevenroose commented 5 years ago

@d4nte did you take a look at my proposal?

rust-bitcoin / rust-bitcoincore-rpc

Add a call to wait for the node to be fully functional. #8