Closed rauljordan closed 6 years ago
Great read! It definitely opened my eyes. We should turn your write-up into a Medium article to benefit broader audiences.
Take Notary as an example. Do we think the following is the right path?
makeNotaryNode() gets the notary config by calling makeConfigNotaryNode() and registers notary services by calling registerNotaryService(). Then makeNotaryNode() starts the notary node via startNotaryNode(). Within startNotaryNode(), we iterate over each notary service and call start(). Notary should implement the Service interface under service.go, while creation of the notaryEthereum service instance should kick off goroutines such as the downloader, fetcher, etc.
I'm not a fan of the complicated config setups that are spread across multiple files, but I'm a fan of having a bunch of services that implement a certain interface be attached to our notary client, with each of them having a .Start() func. I don't think we should copy exactly what they did, but instead trim it down as much as possible and keep the good parts that leverage concurrency.
Hey all,
So thinking more about this as I transition into creating a PR for #109, here is a proposal I would like to make for our sharding clients moving forward. As our code base grows, it's important to think about how we can best leverage concurrency, event management, and simple configuration options that don't cause any headaches to those reading our work. There are elements of the light node design that I'd like to incorporate into our system for spinning up notary/proposer clients. Here are the ideas:
The key idea is that our sharding entry point will spin up a ShardingClient struct, which is analogous to geth spinning up an instance of a Node.
- A ClientType that specifies whether the client will be a Notary or a Proposer.
- A ShardingClient instance with an attached ShardingService: either Notary or Proposer.
- Notary and Proposer instances implement a ShardingService interface that defines methods common to both, including, but not limited to:
    type Service interface {
        // Protocols retrieves the P2P protocols the service wishes to start.
        Protocols() []p2p.Protocol
        // APIs retrieves the list of RPC descriptors the service provides
        APIs() []rpc.API
        // Start is called after all services have been constructed and the networking
        // layer was also initialized to spawn any goroutines required by the service.
        Start(server *p2p.Server) error
        // Stop terminates all goroutines belonging to the service, blocking until they
        // are all terminated.
        Stop() error
    }
Attaching services to the sharding client this way makes service life-cycle management the responsibility of the sharding client itself. Moreover, every single goroutine pertaining to a service can be spun up and contained within its .Start() method.
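As a rough illustration of that life-cycle idea, here is a minimal sketch of a client that starts its attached services in order and rolls back if one of them fails. Everything in it is an assumption made for the example: the ShardingService interface is reduced to Start/Stop with no p2p or RPC arguments, and ShardingClient, Register, and stubService are hypothetical names rather than existing code.

```go
package main

import (
	"errors"
	"fmt"
)

// ShardingService is a reduced, hypothetical form of the interface above,
// keeping only the lifecycle methods so the sketch has no geth dependencies.
type ShardingService interface {
	Start() error
	Stop() error
}

// ShardingClient owns the lifecycle of every service registered with it.
type ShardingClient struct {
	services []ShardingService
	running  []ShardingService // services whose Start has succeeded
}

// Register attaches a service; it is not started until Start is called.
func (c *ShardingClient) Register(s ShardingService) {
	c.services = append(c.services, s)
}

// Start starts services in registration order. If one fails, the services
// already running are stopped again and the error is returned.
func (c *ShardingClient) Start() error {
	for _, s := range c.services {
		if err := s.Start(); err != nil {
			c.Stop()
			return fmt.Errorf("starting service: %w", err)
		}
		c.running = append(c.running, s)
	}
	return nil
}

// Stop stops running services in reverse order and returns the first error.
func (c *ShardingClient) Stop() error {
	var first error
	for i := len(c.running) - 1; i >= 0; i-- {
		if err := c.running[i].Stop(); err != nil && first == nil {
			first = err
		}
	}
	c.running = nil
	return first
}

// stubService is a stand-in for a Notary or Proposer.
type stubService struct {
	name    string
	fail    bool // when true, Start returns an error
	running bool
}

func (s *stubService) Start() error {
	if s.fail {
		return errors.New(s.name + " failed to start")
	}
	s.running = true
	return nil
}

func (s *stubService) Stop() error {
	s.running = false
	return nil
}

func main() {
	client := &ShardingClient{}
	client.Register(&stubService{name: "notary"})
	if err := client.Start(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("services running:", len(client.running))
	client.Stop()
}
```

Stopping in reverse order is a design choice of the sketch, so that services started later (which may depend on earlier ones) are torn down first.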
- The .Start() function will open local shardchaindb file storage and call the .Start() methods of the notaries' and proposers' respective p2p ServerPools and ProtocolManagers.
- ServerPool kickstarts an event loop that handles peer discovery, new connections, and disconnections from peers.
- ProtocolManager is a struct that handles notaries' and proposers' respective event loops (e.g. interacting with the SMC, the voting process, etc.), their corresponding ServerPools, their chaindb, txpools, and message requests/responses from other peers.
- A ProtocolManager interface allows for a well-defined set of responsibilities and goroutines executed by notaries and proposers.
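To make the ServerPool item a little more concrete, the following is a minimal sketch of an event loop that tracks peer connections and disconnections from a single goroutine. It is not the les implementation: peerEvent, Notify, and PeerCount are hypothetical names invented for the example, and peer discovery is left out entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// peerEvent is a hypothetical notification from the networking layer.
type peerEvent struct {
	id        string
	connected bool // true on connect, false on disconnect
}

// ServerPool tracks connected peers from a single event-loop goroutine.
type ServerPool struct {
	events chan peerEvent
	quit   chan struct{}
	wg     sync.WaitGroup

	mu    sync.Mutex
	peers map[string]bool
}

func NewServerPool() *ServerPool {
	return &ServerPool{
		events: make(chan peerEvent),
		quit:   make(chan struct{}),
		peers:  make(map[string]bool),
	}
}

// Start launches the event loop and returns immediately.
func (p *ServerPool) Start() {
	p.wg.Add(1)
	go p.loop()
}

// loop applies peer events one at a time until Stop is called.
func (p *ServerPool) loop() {
	defer p.wg.Done()
	for {
		select {
		case ev := <-p.events:
			p.mu.Lock()
			if ev.connected {
				p.peers[ev.id] = true
			} else {
				delete(p.peers, ev.id)
			}
			p.mu.Unlock()
		case <-p.quit:
			return
		}
	}
}

// Notify hands an event to the loop and blocks until the loop accepts it.
// It must not be called after Stop.
func (p *ServerPool) Notify(ev peerEvent) {
	p.events <- ev
}

// Stop ends the loop and waits for it to exit. Call it at most once.
func (p *ServerPool) Stop() {
	close(p.quit)
	p.wg.Wait()
}

// PeerCount returns the number of peers currently recorded as connected.
func (p *ServerPool) PeerCount() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.peers)
}

func main() {
	pool := NewServerPool()
	pool.Start()
	pool.Notify(peerEvent{id: "peer-a", connected: true})
	pool.Notify(peerEvent{id: "peer-b", connected: true})
	pool.Notify(peerEvent{id: "peer-a", connected: false})
	pool.Stop() // every accepted event has been applied once Stop returns
	fmt.Println("connected peers:", pool.PeerCount())
}
```

Keeping all mutation inside one loop goroutine, with Stop blocking on a WaitGroup, matches the "Stop terminates all goroutines, blocking until they are all terminated" contract in the interface comments.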
The lifecycle of notaries and proposers in the p2p network can be handled via a callback, as in the les package, that deals with the handshake between peers, and an eternal loop of responding to incoming messages via the ProtocolManager's handleMsg functionality.
Clients React to Each Other Via the ProtocolManager's handleMsg Function
There is a fixed set of messages sharding clients can respond to and send. We can follow the same approach as done in the les package's ProtocolManager.handleMsg function to do this.
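A minimal sketch of that dispatch pattern might look like the following. The message codes and the Msg type are placeholders invented for the example (the real set would come from the sharding protocol, and geth's p2p message type is richer); only the shape, a loop that hands each incoming message to a switch over a fixed set of codes, is the point.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical message codes, for illustration only.
const (
	StatusMsg uint64 = iota
	RequestMsg
	ResponseMsg
)

// Msg is a simplified stand-in for a p2p message: a code plus a raw payload.
type Msg struct {
	Code    uint64
	Payload []byte
}

var errUnknownMsg = errors.New("unknown message code")

// ProtocolManager only records what it handled, to keep the sketch small.
type ProtocolManager struct {
	handled []string
}

// handleMsg dispatches a single message based on its code and rejects
// anything outside the fixed set.
func (pm *ProtocolManager) handleMsg(msg Msg) error {
	switch msg.Code {
	case StatusMsg:
		pm.handled = append(pm.handled, "status")
	case RequestMsg:
		pm.handled = append(pm.handled, "request")
	case ResponseMsg:
		pm.handled = append(pm.handled, "response")
	default:
		return fmt.Errorf("%w: %d", errUnknownMsg, msg.Code)
	}
	return nil
}

// handle processes incoming messages until the channel is closed or a
// message cannot be handled, in which case the peer would be dropped.
func (pm *ProtocolManager) handle(incoming <-chan Msg) error {
	for msg := range incoming {
		if err := pm.handleMsg(msg); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	incoming := make(chan Msg, 3)
	incoming <- Msg{Code: StatusMsg}
	incoming <- Msg{Code: RequestMsg}
	incoming <- Msg{Code: 99} // not part of the fixed set
	close(incoming)

	pm := &ProtocolManager{}
	err := pm.handle(incoming)
	fmt.Println("handled:", pm.handled, "error:", err)
}
```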
Overall, I suggest we keep configurations in a single place, without many dependencies across files, and we document everything extensively. Let me know your thoughts.
@prestonvanloon @terenc3t @nisdas @enriquefynn @Magicking
Looks good. In regard to how clients interact with each other, check out this LES flow control writeup: https://github.com/zsfelfoldi/go-ethereum/wiki/Client-Side-Flow-Control-model-for-the-LES-protocol
Raul, do you know of any other similar docs that could help a new developer peek inside the geth architecture & design? Great job on this one, though! It really helped me get a broad picture of the geth design (not limited to the light client).
Hi all,
As more of our sharding client code is being created in our fork, it is critical to understand the design considerations of the current Ethereum nodes baked into go-ethereum. In particular, our notary/proposer clients need to be designed with good event loop management, pluggable services, and solid entry points for p2p functionality built in. As a case study, we will look at light sync nodes as they are currently implemented in geth, understand their full responsibilities, and figure out the bigger picture behind the design considerations of their architecture.
The key question we will be asking ourselves is: what exactly happens when we start a light client? What are the design considerations that came into play when designing the code that gets the light client to work?
We will cap off this document by determining what aspects of the protocols in geth we can use as part of our sharding clients. We have an opportunity to write clean, straightforward code that does not have a massive number of file dependencies and complicated configs as geth currently does.
Let’s dive in.
Case Study: Light Client Nodes
Ethereum’s light client sync mode allows users to spin up a geth node that only downloads block headers and relies on merkle proofs to verify specific parts of the state tree as needed. Light peers are extremely commonplace and critical components in the Ethereum network today. Their architecture serves as a great starting point for anyone extending or redesigning geth in a secure, concurrent, and performant way.
Unfortunately, the current geth code is very hard to read, has a ton of dependencies across packages, and contains obscure configuration options. This doc will attempt to explain light client sync from start to finish, light node peer-to-peer networking, and other responsibilities of the protocol.
How is a Light Node Triggered?
Launching a geth light node is as easy as running geth with the --syncmode="light" flag.
Upon the command being executed, the main function within go-ethereum/cmd/geth/main.go runs. This triggers the urfave/cli external package's Run function, which will trigger the geth function a few lines below main().
Based on the cli context, this function initializes a node instance, which is a critical entry point. Let's take a look at how makeFullNode does this.
In go-ethereum/cmd/geth/config.go, two important functions are at play:
- makeConfigNode uses the cli context to fetch relevant command line flags and returns a node instance plus a configuration object instance.
- utils.RegisterEthService is a function that, based on the command line flags from the context, will use configuration options to add a Service object to the node instance we just declared above. In this case, the cli context contains the --syncmode="light" flag that we will be using to set up a light client protocol instead of a full Ethereum node.
Looking at makeConfigNode in go-ethereum/cmd/geth/config.go, this function just sets up some basic, default configurations to start a node, using familiar options we have in the Ethereum network.
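The flag-to-config step described above can be sketched in a few lines. This is not geth's code: geth uses urfave/cli and a much larger config object, whereas the sketch uses the standard library flag package, a hypothetical one-field Config struct, and an assumed default of "full".

```go
package main

import (
	"flag"
	"fmt"
)

// SyncMode mirrors the idea of geth's sync modes; only two are modelled here.
type SyncMode int

const (
	FullSync SyncMode = iota
	LightSync
)

// Config is a hypothetical, single configuration struct for the sketch.
type Config struct {
	SyncMode SyncMode
}

// configFromArgs builds a Config from command-line arguments in one place,
// so nothing downstream needs to look at flags again.
func configFromArgs(args []string) (Config, error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	mode := fs.String("syncmode", "full", `sync mode ("full" or "light")`)
	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	switch *mode {
	case "full":
		return Config{SyncMode: FullSync}, nil
	case "light":
		return Config{SyncMode: LightSync}, nil
	default:
		return Config{}, fmt.Errorf("unknown sync mode %q", *mode)
	}
}

func main() {
	cfg, err := configFromArgs([]string{"--syncmode=light"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("light sync selected:", cfg.SyncMode == LightSync)
}
```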
The utils.SetEthConfig(ctx, stack, &cfg.Eth) line is what will modify the cfg option based on command line flags. In this case, if SyncMode is set to light, then the config is updated to reflect that flag. Then, we go into the actual code that initializes a Light Protocol instance and registers it as the node's ETH service.
In go-ethereum/cmd/utils/flags.go, if the config option for the downloader is set to LightSync, which was set in the makeConfigNode function we saw before, we register a Service object into the node (referred to as stack in the code). Nodes contain an array of Service instances that all implement useful functions we will come back to later. In this case, the service is a LightEthereum instance that gives us all the functionality we need to run a light client.
How Do These Attached Services Start Running?
Here's where everything actually ties together. If you go back to the main function in go-ethereum/cmd/geth/main.go, the startNode func actually kicks things off.
When we look at utils.StartNode in go-ethereum/cmd/utils/cmd.go, we see the actual code that starts off a node! Let's explore. In go-ethereum/node/node.go, a lot of things happen, but the important part is that the node's start function iterates over each attached service and runs the .Start() function for each! The LightEthereum instance that was attached as a service to the node implements the Service interface that contains a .Start() function. This is how it all fits together!
The Light Ethereum Package
We will be focusing our attention on the go-ethereum/les package in this section, as this is the service that is attached to the running node upon launching a geth instance with the --syncmode="light" flag.
The light client needs to implement the Service interface defined in go-ethereum/node/service.go, with its Protocols, APIs, Start, and Stop methods.
The core of the entire light client is written in go-ethereum/les/backend.go. This is where we find the functions required to satisfy this Service interface, alongside the code that initializes an actual LightEthereum instance in a function called New. The light client's .Start() function in this file is also what sets up the p2p stack.
Light Protocol Event Loop
The creation of the LightEthereum instance kicks off a bunch of goroutines, but the actual sync and retrieval of state happens through the ProtocolManager created in the New function.
In go-ethereum/les/handler.go, at the bottom of the NewProtocolManager function, we see code that runs some event loops. In this case, the instance starts a new downloader instance and a newLightFetcher, which work in tandem with the p2p layer to sync the state and respond to RPC requests that trigger events on peers or respond to incoming messages from peers.
The implementation diverges into a variety of files at this point, but an important aspect of the les package is the usage of on-demand requests, or ODRs. Through the p2p light server, nodes receive requests that are processed via goroutines, such as those in go-ethereum/les/odr_requests.go.
The node in question has the capacity to immediately respond to a message received from other peers, which is a critical piece of functionality we will need the more we elaborate on our notary/proposer clients.
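To give a feel for the on-demand request pattern without reproducing the les code, here is a minimal sketch in which one goroutine plays the serving peer and the caller waits for a reply or a timeout. All names (request, servePeer, retrieve, fetchOnce) and the one-second timeout are assumptions made for the example, and the "network" is just a Go channel.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errNotFound = errors.New("data not found")

// request is a hypothetical on-demand request for one item, identified by key.
type request struct {
	key   string
	reply chan string // buffered so the serving side never blocks on reply
}

// servePeer stands in for a remote peer: it answers each request from its
// local data until the request channel is closed.
func servePeer(data map[string]string, reqs <-chan request) {
	for req := range reqs {
		req.reply <- data[req.key]
	}
}

// retrieve sends one request and waits for the answer or for ctx to expire.
func retrieve(ctx context.Context, reqs chan<- request, key string) (string, error) {
	req := request{key: key, reply: make(chan string, 1)}
	select {
	case reqs <- req:
	case <-ctx.Done():
		return "", ctx.Err()
	}
	select {
	case v := <-req.reply:
		if v == "" {
			return "", errNotFound
		}
		return v, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// fetchOnce wires the two halves together for a single lookup.
func fetchOnce(data map[string]string, key string) (string, error) {
	reqs := make(chan request)
	go servePeer(data, reqs)
	defer close(reqs) // lets the servePeer goroutine exit

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return retrieve(ctx, reqs, key)
}

func main() {
	data := map[string]string{"header-1": "0xabc"}
	v, err := fetchOnce(data, "header-1")
	fmt.Println(v, err)
}
```

The timeout matters in this kind of design because a light client depends on peers that may never answer; without it, a single unresponsive peer would block the caller indefinitely.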
Key Takeaways
Overall, taking full advantage of Go's concurrency primitives along with mutexes for managing services is a great benefit of working with the geth client. We should maintain the pluggability of Services via a Service-like interface and allow for easy management and testing of relevant code.
What we should avoid, however, is the extremely dependent spaghetti code around configuration options. There is a lot of heterogeneity around configuring structs in the geth client, with packages often following their own approaches compared to others throughout the project. We should aim to constrain all configuration to a single, initial entrypoint and avoid redundancy of .Start() methods. After reading this code, it often feels like the geth team really drove themselves into a corner here. We have the opportunity to keep things simple, DRY, and performant.
We have to leverage the powerful constructs shown above in our notary/proposer implementations to make the most out of Go. Please let me know your thoughts below as to how we can improve upon what the go-ethereum team has done.
Let's go for it.