prysmaticlabs / prysm

Go implementation of Ethereum proof of stake
https://www.offchainlabs.com/
GNU General Public License v3.0

Reusing Geth's Nodes and Interfaces in Our Implementation #143

Closed rauljordan closed 6 years ago

rauljordan commented 6 years ago

Hi all,

As our goal is to eventually merge with upstream, we need to leverage the tools and interfaces Geth uses to spin up nodes instead of rewriting everything from scratch ourselves. As an example, instead of writing our own sharding node, we could try to leverage what currently exists in geth and attach a sharding server as a Service to a node, in a similar way that the les package attaches a Light Ethereum Server.
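To make the "attach a sharding server as a Service" idea concrete, here is a minimal sketch. It does not import geth: Service, Node, and ShardServer below are hypothetical, trimmed-down stand-ins, and geth's real service interface carries more than a start/stop lifecycle (for example protocol and RPC API registration). The only point is the shape: the node owns the lifecycle, and the sharding logic plugs in behind an interface.

```go
package main

import (
	"errors"
	"fmt"
)

// Service is a hypothetical, reduced lifecycle interface. It is an
// assumption made for this sketch, not geth's actual interface.
type Service interface {
	Start() error
	Stop() error
}

// Node is a hypothetical container that owns the registered services.
type Node struct {
	services []Service
	running  bool
}

// Register attaches a service. It must be called before the node starts.
func (n *Node) Register(s Service) error {
	if n.running {
		return errors.New("node already running")
	}
	n.services = append(n.services, s)
	return nil
}

// Start starts every service in registration order. If one fails, the
// services that were already started are stopped again.
func (n *Node) Start() error {
	for i, s := range n.services {
		if err := s.Start(); err != nil {
			for _, started := range n.services[:i] {
				started.Stop()
			}
			return err
		}
	}
	n.running = true
	return nil
}

// Stop stops the services in reverse registration order.
func (n *Node) Stop() {
	for i := len(n.services) - 1; i >= 0; i-- {
		n.services[i].Stop()
	}
	n.running = false
}

// ShardServer is a hypothetical sharding service with no real logic.
type ShardServer struct {
	started bool
}

func (s *ShardServer) Start() error { s.started = true; return nil }
func (s *ShardServer) Stop() error  { s.started = false; return nil }

func main() {
	n := &Node{}
	shard := &ShardServer{}
	if err := n.Register(shard); err != nil {
		fmt.Println("register failed:", err)
		return
	}
	if err := n.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("shard server running:", shard.started)
	n.Stop()
	fmt.Println("shard server running:", shard.started)
}
```

With this layout, the shardChainDB or a p2p discovery service would simply be additional services (or fields of one), which is the extension point the paragraph above is asking for.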

Currently, there is a lot to work on but it is not too clear where exactly we can add these items in our code base. Being able to extend geth's functionality in a compliant way would make it easier for us to know exactly where to add the shardChainDB, or where to add a p2p discovery service, etc.

Figuring this out would unblock a lot of potential issues and PRs we could assign to the team or leave open for other contributors. This will be the top priority for now and we can use this issue as a thread to share our findings as to how to go about this efficiently.

Requirements

@terenc3t: can you look into what a protocol manager is in the les package and how we can use this?

@prestonvanloon: can you take a look at serverpool.go in the les package and see how it attaches useful p2p services to the protocol manager?

@nisdas: can you look at how the chainDb, state, and trie are used in the les package and share your findings here?

I will be looking at the main entry points of Geth to see how we could initialize sharding in a similar fashion to how an Ethereum or a Light Ethereum instance are started. In this case we would launch a ShardEthereum instance.

Thank you, R

prestonvanloon commented 6 years ago

Some of the key questions to answer:

terencechain commented 6 years ago

protocolManager lives in LesServer. LesServer defines the node's server configuration and holds fields such as the config, flow control, the private key, and the topics it listens to. When we initialize a new LesServer in NewLesServer(), we call NewProtocolManager() in handler.go.

In server.go:

type LesServer struct {
    config          *eth.Config
    protocolManager *ProtocolManager
    fcManager       *flowcontrol.ClientManager 
    fcCostStats     *requestCostStats
    defParams       *flowcontrol.ServerParams
    lesTopics       []discv5.Topic
    privateKey      *ecdsa.PrivateKey
    quitSync        chan struct{}
}
func NewLesServer(eth *eth.Ethereum, config *eth.Config) (*LesServer, error) {
    // quitSync is handed to the protocol manager and kept on the server.
    quitSync := make(chan struct{})
    pm, err := NewProtocolManager(eth.BlockChain().Config(), false, ServerProtocolVersions, config.NetworkId, eth.EventMux(), eth.Engine(), newPeerSet(), eth.BlockChain(), eth.TxPool(), eth.ChainDb(), nil, nil, quitSync, new(sync.WaitGroup))
    if err != nil {
        return nil, err
    }
    srv := &LesServer{protocolManager: pm, quitSync: quitSync}
    pm.server = srv
    return srv, nil
}

protocolManager gives you access to the txpool, chainConfig, chainDb, and odr configs. NewProtocolManager returns a new sub-protocol manager. The sub-protocol manages peers that are compatible with the Ethereum network.

In handler.go:

type ProtocolManager struct {
    lightSync   bool
    txpool      txPool
    txrelay     *LesTxRelay
    networkId   uint64
    chainConfig *params.ChainConfig
    blockchain  BlockChain
    chainDb     ethdb.Database
    odr         *LesOdr
    server      *LesServer
    serverPool  *serverPool
}
func NewProtocolManager(chainConfig *params.ChainConfig, lightSync bool, protocolVersions []uint, networkId uint64, mux *event.TypeMux, engine consensus.Engine, peers *peerSet, blockchain BlockChain, txpool txPool, chainDb ethdb.Database, odr *LesOdr, txrelay *LesTxRelay, quitSync chan struct{}, wg *sync.WaitGroup) (*ProtocolManager, error) {
    manager := &ProtocolManager{
        lightSync:   lightSync,
        blockchain:  blockchain,
        chainConfig: chainConfig,
        chainDb:     chainDb,
        odr:         odr,
        networkId:   networkId,
        txpool:      txpool,
        txrelay:     txrelay,
        peers:       peers,
        newPeerCh:   make(chan *peer),
    }

    // Initiate a sub-protocol for every implemented version we can handle

    if lightSync {
        manager.downloader = downloader.New(downloader.LightSync, chainDb, manager.eventMux, nil, blockchain, removePeer)
        manager.peers.notify((*downloaderPeerNotify)(manager))
        manager.fetcher = newLightFetcher(manager)
    }

    return manager, nil
}

protocolManager is also used in fetcher.go. The lightFetcher implements retrieval of newly announced headers. It also provides a peerHasBlock function for the ODR system, which ensures that we only request data related to a certain block from peers who have already processed and announced that block.

In fetcher.go:

type lightFetcher struct {
    pm    *ProtocolManager
    odr   *LesOdr
    chain *light.LightChain
}
func newLightFetcher(pm *ProtocolManager) *lightFetcher {
    f := &lightFetcher{
        pm:             pm,
        chain:          pm.blockchain.(*light.LightChain),
        odr:            pm.odr,
        peers:          make(map[*peer]*fetcherPeerInfo),
        deliverChn:     make(chan fetchResponse, 100),
        requested:      make(map[uint64]fetchRequest),
        timeoutChn:     make(chan uint64),
        requestChn:     make(chan bool, 100),
        syncDone:       make(chan *peer),
        maxConfirmedTd: big.NewInt(0),
    }
    pm.peers.notify(f)

    f.pm.wg.Add(1)
    go f.syncLoop()
    return f
}
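The peerHasBlock idea described before this excerpt can be sketched independently of geth. The following is an illustration only: the announcements type, the string peer IDs, and the string hashes are assumptions made for the example, and the real lightFetcher tracks considerably more state per peer.

```go
package main

import "fmt"

// announcements records, per peer ID, the block hashes that peer has
// announced. Peer IDs and hashes are plain strings purely for illustration.
type announcements map[string]map[string]bool

// announce records that a peer has announced a block.
func (a announcements) announce(peer, hash string) {
	if a[peer] == nil {
		a[peer] = make(map[string]bool)
	}
	a[peer][hash] = true
}

// peerHasBlock reports whether the peer has announced the given block.
// Requests for data tied to that block should only go to such peers.
func (a announcements) peerHasBlock(peer, hash string) bool {
	return a[peer][hash] // indexing a missing peer yields a nil map, so this is false
}

func main() {
	a := announcements{}
	a.announce("peerA", "0xabc")

	for _, peer := range []string{"peerA", "peerB"} {
		if a.peerHasBlock(peer, "0xabc") {
			fmt.Println(peer, "can serve requests for 0xabc")
		} else {
			fmt.Println(peer, "has not announced 0xabc, skipping")
		}
	}
}
```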

I think it makes the most sense to have a ShardServer (similar to LesServer). ShardServer would own a shardProtocolManager, which implements all the sharding protocol specs (similar to ProtocolManager). Then we can create sharding-specific services, like lightFetcher, that use the shardProtocolManager.
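A rough sketch of that layering, mirroring how LesServer and ProtocolManager reference each other. Every name here apart from the ones used in the proposal (ShardServer, shardProtocolManager) is hypothetical, including the shardID field and the collationFetcher service; none of it exists in geth.

```go
package main

import "fmt"

// shardProtocolManager is a hypothetical counterpart to the les ProtocolManager.
type shardProtocolManager struct {
	networkID uint64
	shardID   uint64       // hypothetical: which shard this manager serves
	server    *ShardServer // back-reference, like pm.server in les
}

// ShardServer is a hypothetical counterpart to LesServer.
type ShardServer struct {
	protocolManager *shardProtocolManager
}

// NewShardServer wires the server and its protocol manager together,
// following the same two-step pattern as NewLesServer.
func NewShardServer(networkID, shardID uint64) *ShardServer {
	pm := &shardProtocolManager{networkID: networkID, shardID: shardID}
	srv := &ShardServer{protocolManager: pm}
	pm.server = srv
	return srv
}

// collationFetcher is a hypothetical sharding-specific service that is
// built on top of the protocol manager, the way lightFetcher is in les.
type collationFetcher struct {
	pm *shardProtocolManager
}

func newCollationFetcher(pm *shardProtocolManager) *collationFetcher {
	return &collationFetcher{pm: pm}
}

// shardID exposes the shard this fetcher works on via its manager.
func (f *collationFetcher) shardID() uint64 { return f.pm.shardID }

func main() {
	srv := NewShardServer(1, 7)
	fetcher := newCollationFetcher(srv.protocolManager)
	fmt.Println("fetcher attached to shard", fetcher.shardID())
}
```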

prestonvanloon commented 6 years ago

Protocol Manager API

The API for the ProtocolManager of the full geth node is:

type ProtocolManager
func NewProtocolManager(config *params.ChainConfig, mode downloader.SyncMode, networkId uint64, mux *event.TypeMux, txpool txPool, engine consensus.Engine, blockchain *core.BlockChain, chaindb ethdb.Database) (*ProtocolManager, error)
func (pm *ProtocolManager) BroadcastBlock(block *types.Block, propagate bool)
func (pm *ProtocolManager) BroadcastTxs(txs types.Transactions)
func (pm *ProtocolManager) NodeInfo() *NodeInfo
func (pm *ProtocolManager) Start(maxPeers int)
func (pm *ProtocolManager) Stop()

The LES version is slightly different:

type ProtocolManager
func NewProtocolManager(chainConfig *params.ChainConfig, lightSync bool, protocolVersions []uint, networkId uint64, mux *event.TypeMux, engine consensus.Engine, peers *peerSet, blockchain BlockChain, txpool txPool, chainDb ethdb.Database, odr *LesOdr, txrelay *LesTxRelay, quitSync chan struct{}, wg *sync.WaitGroup) (*ProtocolManager, error)
func (self *ProtocolManager) NodeInfo() *NodeInfo
func (pm *ProtocolManager) Start(maxPeers int)
func (pm *ProtocolManager) Stop()

What are the common params (not in order)?

| Param | Full node? | LES node? |
| --- | --- | --- |
| config *params.ChainConfig | Yes | Yes |
| mode downloader.SyncMode | Yes | No |
| networkId uint64 | Yes | Yes |
| mux *event.TypeMux | Yes | Yes |
| txpool txPool | Yes | Yes |
| engine consensus.Engine | Yes | Yes |
| blockchain *core.BlockChain | Yes | No -- Why is this different? |
| chaindb ethdb.Database | Yes | Yes |
| lightSync bool | No | Yes |
| protocolVersions []uint | No | Yes |
| peers *peerSet | No | Yes |
| blockchain BlockChain | No | Yes -- Why is this different? |
| odr *LesOdr | No | Yes |
| txrelay *LesTxRelay | No | Yes |
| quitSync chan struct{} | No | Yes |
| wg *sync.WaitGroup | No | Yes |

What are the likely params required by a sharding node?

These are common between LES and full Geth.

For sharding we would want to provide new implementations of consensus.Engine and ethdb.Database.

Some more questions to answer:

prestonvanloon commented 6 years ago

What are the params for sharding?

| Param | Type | What is it? |
| --- | --- | --- |
| networkId | uint64 | The network ID, used for P2P handshakes and peer acceptance. |
| chainConfig (new!) | *params.ChainConfig | Sharding will have its own struct for a chain config. This will contain various data including the chain ID, fork block numbers, etc. See the existing implementation here for an idea. |
| mux | *event.TypeMux | A TypeMux dispatches events to registered receivers. Receivers can be registered to handle events of a certain type. Deprecated! Use event.Feed. Feed provides a similar one-to-many subscription channel. |
| txPool | core.TxPool | Note: LES has its own txPool impl. TxPool contains all currently known transactions. Transactions enter the pool when they are received from the network or submitted locally. They exit the pool when they are included in the blockchain. The pool separates processable transactions (which can be applied to the current state) and future transactions. Transactions move between those two states over time as they are received and processed. |
| engine | consensus.Engine | This will almost certainly need to be new for sharding. Consensus works quite differently for sharding. Key question: Can we implement the same interface for sharding? |
| chaindb | ethdb.Database | Database wraps all database operations. This seems very similar to what we wrote as well! See: ShardBackend. |
| protocolVersions | []uint | The list of protocol versions that are supported by this particular client. If a peer also supports one of these protocol versions, then it is compatible with this client. |
| blockchain | core.Blockchain | BlockChain represents the canonical chain given a database with a genesis block. The Blockchain manages chain imports, reverts, and chain reorganisations. Do we need a new implementation for sharding? |
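
The mux entry above points at a one-to-many subscription channel as the replacement for TypeMux. As a rough illustration of that pattern only, here is a hand-rolled Feed using nothing but the standard library. It is an assumption-laden sketch, not geth's event.Feed: the real type accepts arbitrary channel types and hands back a subscription that can be cancelled, neither of which is modelled here.

```go
package main

import (
	"fmt"
	"sync"
)

// Feed is a hypothetical, minimal one-to-many event channel for strings.
type Feed struct {
	mu   sync.Mutex
	subs []chan<- string
}

// Subscribe registers a channel that will receive every event sent later.
func (f *Feed) Subscribe(ch chan<- string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.subs = append(f.subs, ch)
}

// Send delivers the event to every subscriber and returns how many
// subscribers received it. It blocks until each channel accepts the
// value, so slow subscribers should use buffered channels.
func (f *Feed) Send(ev string) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, ch := range f.subs {
		ch <- ev
	}
	return len(f.subs)
}

func main() {
	var feed Feed
	a := make(chan string, 1)
	b := make(chan string, 1)
	feed.Subscribe(a)
	feed.Subscribe(b)

	n := feed.Send("new collation header")
	fmt.Println("delivered to", n, "subscribers")
	fmt.Println("a got:", <-a)
	fmt.Println("b got:", <-b)
}
```
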
prestonvanloon commented 6 years ago

Summarizing the unanswered questions at this point:

terencechain commented 6 years ago

What is the difference between core.Blockchain and les.Blockchain?

core.Blockchain has access to the complete data set of the canonical chain, starting from the genesis block. It manages chain imports, reverts, and reorgs. LightChain represents a canonical chain that by default handles only block headers, and it performs only header validation during chain insertion.

| Field | Blockchain | LightChain | Note |
| --- | --- | --- | --- |
| chainConfig *params.ChainConfig | Yes | No | |
| cacheConfig *CacheConfig | Yes | No | |
| db ethdb.Database | Yes | Yes | |
| hc *HeaderChain | Yes | Yes | |
| rmLogsFeed event.Feed | Yes | No | |
| chainFeed event.Feed | Yes | Yes | |
| chainSideFeed event.Feed | Yes | Yes | |
| chainHeadFeed event.Feed | Yes | Yes | |
| logsFeed event.Feed | Yes | No | |
| scope event.SubscriptionScope | Yes | Yes | |
| genesisBlock *types.Block | Yes | Yes | |
| mu sync.RWMutex | Yes | Yes | |
| chainmu sync.RWMutex | Yes | Yes | |
| procmu sync.RWMutex | Yes | No | |
| checkpoint int | Yes | No | light client doesn't use checkpoints |
| currentBlock atomic.Value | Yes | No | light client doesn't care about blocks |
| currentFastBlock atomic.Value | Yes | No | light client doesn't care about blocks |
| stateCache state.Database | Yes | No | light client doesn't care about blocks |
| bodyCache *lru.Cache | Yes | Yes | |
| bodyRLPCache *lru.Cache | Yes | Yes | |
| blockCache *lru.Cache | Yes | Yes | |
| futureBlocks *lru.Cache | Yes | No | LightChain doesn't care about future blocks |
| quit chan struct{} | Yes | Yes | |
| running int32 | Yes | Yes | |
| procInterrupt int32 | Yes | Yes | |
| wg sync.WaitGroup | Yes | Yes | |
| engine consensus.Engine | Yes | Yes | |
| processor Processor | Yes | No | light client doesn't process |
| validator Validator | Yes | No | light client doesn't validate |
| vmConfig vm.Config | Yes | No | |
| badBlocks *lru.Cache | Yes | No | light client doesn't care about blocks |
| odr OdrBackend | No | Yes | odr exists only within the les protocol |

What are some sharding specific fields for Blockchain?

nisdas commented 6 years ago

How the trie is built is specified in the light package

type odrTrie struct {
    db   *odrDatabase
    id   *TrieID
    trie *trie.Trie
}

The trie is implemented the same way for les and full geth. The same can be said for the state, which is created from a given trie through func NewState(ctx context.Context, head *types.Header, odr OdrBackend) *state.StateDB

If you look at the backend for les

type OdrBackend interface {
    Database() ethdb.Database
    ChtIndexer() *core.ChainIndexer
    BloomTrieIndexer() *core.ChainIndexer
    BloomIndexer() *core.ChainIndexer
    Retrieve(ctx context.Context, req OdrRequest) error
}

the same db is implemented and accessed in the full client and light client. The only major difference I see would be in the syncing protocol for it.

For us in sharding, the major changes I can see that we would have to make are to the trie structure, and we would need a new implementation of a sharded db. State syncing would be another thing we would have to implement.

rauljordan commented 6 years ago

State syncing in sharding would be different for notaries and proposers. Proposers would indeed need to sync with the shard they are processing transactions for, and I can imagine we can follow geth fairly closely here. For notaries, however, we have to be very efficient about notary burst overhead.

Basically, when a notary is selected in a shard, the notary has to sync as fast as possible with the shard chain, download collation headers, and use a fork choice rule. In this case, we have to work with a sync protocol that aligns well with rapid reshuffling across networks and has a solid cache mechanism.

@nisdas

For us in sharding, the major changes I can see that we would have to make are to the trie structure, and we would need a new implementation of a shard db. State syncing would be another thing we would have to implement.

So this means we can use the same interface for an OdrBackend as we will just have to modify the actual trie. Swapping the db will also be easy as there is already an interface there we can satisfy if we use something like badgerdb or redis.
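To illustrate why swapping the db is cheap when callers depend on an interface, here is a small sketch. The Database interface below is a hypothetical, reduced key-value interface written for this example (the real ethdb.Database has more methods), and memDB and storeHeader are made-up names. A badgerdb- or redis-backed type would slot in by implementing the same four methods.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Database is a hypothetical, reduced key-value interface for this sketch.
type Database interface {
	Put(key, value []byte) error
	Get(key []byte) ([]byte, error)
	Has(key []byte) (bool, error)
	Delete(key []byte) error
}

var errNotFound = errors.New("not found")

// memDB is an in-memory implementation, standing in for any backend.
type memDB struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newMemDB() *memDB { return &memDB{data: make(map[string][]byte)} }

func (db *memDB) Put(key, value []byte) error {
	db.mu.Lock()
	defer db.mu.Unlock()
	// Copy the value so later changes by the caller do not leak in.
	db.data[string(key)] = append([]byte(nil), value...)
	return nil
}

func (db *memDB) Get(key []byte) ([]byte, error) {
	db.mu.RLock()
	defer db.mu.RUnlock()
	v, ok := db.data[string(key)]
	if !ok {
		return nil, errNotFound
	}
	return append([]byte(nil), v...), nil
}

func (db *memDB) Has(key []byte) (bool, error) {
	db.mu.RLock()
	defer db.mu.RUnlock()
	_, ok := db.data[string(key)]
	return ok, nil
}

func (db *memDB) Delete(key []byte) error {
	db.mu.Lock()
	defer db.mu.Unlock()
	delete(db.data, string(key))
	return nil
}

// storeHeader depends only on the interface, so the backend can change
// without touching this function. The "header-" key prefix is made up.
func storeHeader(db Database, hash string, header []byte) error {
	return db.Put([]byte("header-"+hash), header)
}

func main() {
	var db Database = newMemDB()
	if err := storeHeader(db, "0x01", []byte("collation header bytes")); err != nil {
		fmt.Println("store failed:", err)
		return
	}
	v, err := db.Get([]byte("header-0x01"))
	fmt.Println(string(v), err)
}
```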

@terenc3t

What are some sharding specific fields for Blockchain?

We will definitely need to modify the blockchain struct a lot, as the fundamental structure of its primitives is different for sharding. We can implement something that at least satisfies most of the current interface, but the methods will most likely be very different.

@prestonvanloon

Can we use the consensus.Engine interface for sharding?

The consensus engine will live at the notary level, so perhaps we can keep this constrained to the notary package? Having something like a protocol manager does make sense in this case. Based on this analysis, we can follow a very similar structure to les, and we can probably get away with importing a lot of the interfaces and structs. However, there will definitely be a lot we will have to completely overhaul, as our entire p2p layer is different and so much depends on it. Maybe this consensus engine can be renamed to something specific to the fork choice rule for shards exerted by notaries?

rauljordan commented 6 years ago

Thanks a lot for this guys - we're getting close. I think we can do very well if we can reuse a lot of the caching mechanisms Geth has as fields within a lot of these structs. Tests should also be easier to write as we will have clear references to existing tests if we import a lot of what Geth currently does.

nisdas commented 6 years ago

@rauljordan Yep, that's right. We can reuse the same interface for our sharded db. A lot of the interfaces in the light and les packages can be reused for our purposes. This way seems much easier: instead of building from the ground up, we can reuse a lot of the code in the les package and modify or add code as required.

rauljordan commented 6 years ago

Hey all, the core.Blockchain struct should be very different than the one we use in the ShardProtocolManager. We will need some of the cache, sync, and checkpoint fields, but the implementations will be fundamentally different due to having different primitives.

A lot of the Blockchain api methods rely on the current implementation of the state, consensus mechanisms, and gas/execution concepts. These are fundamentally different in a sharded mechanism, so I don't think we will be importing core.Blockchain or its associated methods in our implementation. Instead, the ShardProtocolManager will contain a shard *Shard field.

prestonvanloon commented 6 years ago

Further discussion in the design doc: https://docs.google.com/document/d/1J4AEHTSKDGJpNzWS7ZXPqD0yEjqecb-xjdVkjGrxRMk/edit