livepeer / go-livepeer

Official Go implementation of the Livepeer protocol
http://livepeer.org
MIT License

Enable O to configure failover -ethUrl endpoint #1959

Open chrishobcroft opened 3 years ago

chrishobcroft commented 3 years ago

Problem Statement

When an Orchestrator loses connection to its -ethUrl, it drops all its streams.

Such a disconnection will occur when the configured -ethUrl RPC service becomes unavailable.

An -ethUrl RPC service will become unavailable when it is stopped, or restarted, perhaps due to a software update.

Proposed Solution

As an Orchestrator, I want to be able to configure a failover option for the -ethUrl, such that if the primary -ethUrl becomes unavailable, the Orchestrator service can fail over to a secondary (or tertiary, etc.) Ethereum RPC endpoint.

Alternative Solution

Alternatively, we can maintain the status quo, and accept that streams will be interrupted whenever an Orchestrator's Ethereum RPC endpoint becomes unavailable. This may become more frequent as we approach the merge of eth1 + eth2 = Ethereum, and may not be desirable.

Additional Context

See below for a chart showing Orchestrator sessions during a time window in which the service providing the Orchestrator's -ethUrl was restarted in order to upgrade to a new version of geth for the London hard fork:

[Screenshot: chart of Orchestrator sessions during the -ethUrl service restart]

chrishobcroft commented 3 years ago

It may also be worth noting that Prysmatic Labs' beacon-chain client implements such failover functionality in Go.

From the -help:

--fallback-web3provider value       A mainchain web3 provider string http endpoint. This is our fallback web3 provider, this flag may be used multiple times.

Codebase is here: https://github.com/prysmaticlabs/prysm/
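For illustration, a flag that "may be used multiple times" can be built with Go's standard `flag` package by implementing `flag.Value`. This is only a sketch of the general technique, not Prysm's or go-livepeer's code; the flag name `fallbackEthUrl` and the helper `parseFallbacks` are hypothetical.

```go
package main

import (
	"flag"
	"strings"
)

// urlList collects every occurrence of a repeated flag.
type urlList []string

// String renders the collected values; it must tolerate a nil receiver.
func (l *urlList) String() string {
	if l == nil {
		return ""
	}
	return strings.Join(*l, ",")
}

// Set is called by the flag package once per occurrence of the flag.
func (l *urlList) Set(v string) error {
	*l = append(*l, v)
	return nil
}

// parseFallbacks parses args and returns all values passed via the
// hypothetical -fallbackEthUrl flag, in the order they were given.
func parseFallbacks(args []string) ([]string, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	var fallbacks urlList
	fs.Var(&fallbacks, "fallbackEthUrl", "fallback Ethereum RPC URL; may be repeated")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return fallbacks, nil
}
```

Calling `parseFallbacks` with `-fallbackEthUrl https://a -fallbackEthUrl=https://b` would yield both URLs in order.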

yondonfu commented 3 years ago

Thanks for the suggestion!

An alternative that is possible to set up today without any changes to livepeer is to set -ethUrl to the URL of a load balancer service that can route requests to one or many Ethereum nodes (or even hosted providers like Infura/Alchemy). I'm only familiar with setting this up in a k8s environment, but one could also set this up in a bare metal or cloud environment. A quick Google yields the following results that might be helpful - a failover proxy for Chainlink and a Medium article on setting up a HA Ethereum node cluster on AWS.

That being said, I can see why the support of a fallback in livepeer can be useful to minimize additional devops work that a node operator would need to do to ensure a high availability connection to Ethereum nodes. Given the availability of an alternative (as mentioned above), I don't think support for a fallback will be prioritized soon, but I do think it can be considered.

chrishobcroft commented 3 years ago

Thank you for the suggestions, @yondonfu. I will look into that.

And I get that everyone has their own priorities. But perhaps in general, we could be careful not to dampen any potential enthusiasm which may ever come from the many permissionless outsiders, by labeling things as not going to be prioritised soon, by the few permissioned insiders. What do you think?

yondonfu commented 3 years ago

But perhaps in general, we could be careful not to dampen any potential enthusiasm which may ever come from the many permissionless outsiders, by labeling things as not going to be prioritised soon, by the few permissioned insiders. What do you think?

Sure. To clarify on my previous comment - I don't think support for a fallback in livepeer will be prioritized by the Livepeer Inc. team soon, but I do think it can be considered and anyone else is welcome to jump in here to continue the conversation about this feature request.

leszko commented 2 years ago

I had a quick look at this issue and here are a few thoughts:

  1. From the UX perspective, I think we should allow having multiple ETH URLs in the -ethUrl parameter, e.g. -ethUrl https://url-1,https://url-2
  2. I looked at the code and I'd estimate the work at ~5 days
  3. I did a small PoC here, which introduces the failover for the BlockWatcher part. Other parts that need to get addressed are: gas price monitor (trivial), livepeer eth client (the most difficult part, because the client is baked into the contracts)
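As a rough sketch of the first point, a comma-separated -ethUrl value could be split and validated along these lines. This is not the PoC's code: `parseEthURLs` is a hypothetical helper, and it assumes every endpoint is a full URL with a scheme and host.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseEthURLs splits a comma-separated -ethUrl value into endpoint URLs,
// preserving order so the first entry can act as the primary.
// Assumption: each entry is a full URL with a scheme and a host.
func parseEthURLs(raw string) ([]string, error) {
	var urls []string
	for _, part := range strings.Split(raw, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // tolerate stray commas and whitespace
		}
		u, err := url.Parse(part)
		if err != nil {
			return nil, fmt.Errorf("invalid URL %q: %w", part, err)
		}
		if u.Scheme == "" || u.Host == "" {
			return nil, fmt.Errorf("invalid URL %q: missing scheme or host", part)
		}
		urls = append(urls, part)
	}
	if len(urls) == 0 {
		return nil, fmt.Errorf("no Ethereum RPC URL provided")
	}
	return urls, nil
}
```

Keeping a single flag means existing single-URL configurations continue to parse unchanged, since a value without commas yields a one-element list.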

AuthorityNull commented 2 years ago

There’s still a lot of need for this so it’s good to know there’s still hope!

0xcadams commented 2 years ago

https://docs.ethers.io/v5/api/providers/#providers-getDefaultProvider This could be a good inspiration for the application-level fallbacks which exist in ethers - the "quorum" may be overkill, but it's easily configurable with Infura IDs and a lot of sane defaults.

https://github.com/ethers-io/ethers.js/blob/master/packages/providers/src.ts/infura-provider.ts
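To illustrate the "quorum" idea in this project's language, here is a deliberately simplified Go sketch. It is not how ethers implements it: `quorumResult` is a hypothetical helper that assumes every backend has equal weight and that responses have already been collected as strings.

```go
package main

import "fmt"

// quorumResult returns the first response that at least `quorum` backends
// agree on. Simplification: all backends are weighted equally and the
// responses are compared as plain strings.
func quorumResult(results []string, quorum int) (string, error) {
	if quorum < 1 {
		return "", fmt.Errorf("quorum must be at least 1, got %d", quorum)
	}
	counts := make(map[string]int)
	for _, r := range results {
		counts[r]++
		if counts[r] >= quorum {
			return r, nil
		}
	}
	return "", fmt.Errorf("no result reached quorum %d", quorum)
}
```

With a quorum of 1 this degenerates to "first response wins", which is closer to the plain failover requested in this issue.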