Support for different chains / hardfork switches

holgerd77 commented 6 years ago

API remains backwards-compatible.

This PR adds optional opts chain parameters for both the Block and the BlockHeader class, which default to mainnet if not set. This allows to set the genesis block according to the chain set. Chain-specific parameters are provided by the new ethereumjs-common library which also defines the supported chains.

Note: changes in the API docs diff are so large cause the docs for the BlockHeader class and functions were not included/forgotten before.

coveralls commented 6 years ago

Coverage increased (+2.3%) to 72.481% when pulling d21ce6ce2281f9d62b94672de927264a84fdc8bc on add-support-for-different-chains into 5450ad94ec05017cff0de01339ce35c7b9f005c5 on master.

vpulim commented 6 years ago

I'm not sure if embedding a Common instance within each block and header is the best approach. Besides the cost of allocating a new instance of Common with each block and header, the Common class should really be defining params at the blockchain level, and not the individual block and header level. Thus, I believe the Blockchain class is the one that should be creating a single instance of Common. It can then provide access to the common params to the Block and BlockHeader classes as needed.

The only dependency on the common params in the Block class is from the setGenesisParams() method. This can simply take an optional params parameter that defaults to the mainnet byzantium fork if missing. The Blockchain class would pass in the appropriate params object from a call to common.genesis() when setting the genesis block.

Block.prototype.setGenesisParams = function (params) {
  params = params || new Common('mainnet', 'byzantium').genesis()
  this.header.setGenesisParams(params)
}

If we add a new setGenesisParams() method to BlockHeader and pass params through, then we can completely remove the dependency on ethereumjs-common from index.js. Furthermore, if we make params a required argument to setGenesisParams(), then we can completely remove the dependency on ethereumjs-common from ethereumjs-block altogether.

The only dependency on the common params in the BlockHeader class is from the call to the validate() method. This method already has a reference to the blockchain from which it can grab the common params and then pass them on to validateDifficulty and validateGasLimit. This would require a new method on the Blockchain class (maybe something like getChainConfig()) that returns an instance of Common.

BlockHeader.prototype.validate = function (blockchain, height, cb) {
  ...
  const chainConfig = blockchain.getChainConfig()
  ...
    if (!self.validateDifficulty(parentBlock, chainConfig)) {
      return cb('invalid Difficulty')
    }

    if (!self.validateGasLimit(parentBlock, chainConfig)) {
      return cb('invalid gas limit')
    }
  ...
}

This approach would also be consistent with the Common.activeHardfork() method that only relies on the number property of a Block to determine which hardfork its in rather than directly inspecting the block. Embedding a Common instance within each block would add unnecessary redundancy and complexity.

holgerd77 commented 6 years ago

Not sure if I am going along with this. One general thing to consider: the ethereumjs-block module is not only meant to work in combination with the blockchain library. This is also used in a standalone context for just representing blocks (just a random example here). I don't know if much is won if in all these use cases the burden of creating a new Commons instance is loaded upon the caller.

That the commons class is defining constants on a blockchain level is not completely true, settings are only defined to be constant for a range of blocks through the hardfork setting. If the commons class is passed in from the outside (e.g. by the blockchain) it would have to be copied anyhow, otherwise parameters would change if a e.g. validataGasLimit() or other functions with params included are called after a hardfork update (with setHardfork()) on the commons class provided.

I would also like to keep the ability to explicitly define library compatibility with the newly introduced supportedHardforks initialization parameter. Atm this is just implicitly "baked" into the different ethereumjs libraries and if this is defined in the associated commons instance of a library object it would become very clear under what circumstances a library would work.

I actually thought about making this a bit the standard procedure, to set the network and hardfork in the opts in the constructor and then create a commons instance with e.g.

this.commons = new Commons(network, hardfork, ['byzantium', 'constantinople'])

An initialization with spuriousDragon e.g. would immediately throw and give the caller an immediate feedback and thus preventing undefined behaviour.

vpulim commented 6 years ago

the ethereumjs-block module is not only meant to work in combination with the blockchain library

This would be ideal. But unfortunately that is not currently the case since BlockHeader.validate() has a dependency on the blockchain. Go-ethereum avoids this by having a separate BlockValidator class. This makes sense since block validation is the only thing that varies by hardfork (within the Block class). The rest of the block and header code doesn't depend on which hardfork the block is on.

This is also used in a standalone context for just representing blocks (just a random example here)

Even in this context, its not truly standalone since the tx.validate() call that is made in devp2p depends on the current hardfork in order to run the correct validation logic (currently hardcoded to homestead). The devp2p code would have to maintain an instance of Common in order to determine the correct fork for the given block.

That the commons class is defining constants on a blockchain level is not completely true, settings are only defined to be constant for a range of blocks through the hardfork setting

Yes, this is not in dispute. Different settings should apply based on which hardfork the block is in. Sorry if I stated otherwise...

If the commons class is passed in from the outside (e.g. by the blockchain) it would have to be copied anyhow, otherwise parameters would change if a e.g. validataGasLimit() or other functions with params included are called after a hardfork update (with setHardfork()) on the commons class provided.

Instead of copying the object, a call to paramByBlock() could be used with BlockHeader.number passed in.

I actually thought about making this a bit the standard procedure, to set the network and hardfork in the opts in the constructor and then create a commons instance ... would immediately throw and give the caller an immediate feedback and thus preventing undefined behaviour.

Could this be done as a class or module method instead? Support for a particular set of hardforks wouldn't change based on instance.

One last concern (unless I'm misunderstanding something) is that code changes will be required in many places since the network and hardfork params will have to be passed in everywhere that a new Block or BlockHeader is constructed. This would be unfortunate given that only the validation methods are the ones that need to know which hardfork the block is on.

Thanks for listening and I'm obviously fine with whatever folks decide. I just wanted to throw in my two cents...

holgerd77 commented 6 years ago

@vpulim Thanks for the in-depth comments, have to think about these a bit before answering. I think we can really take some time for discussion here. Since the new commons library shall be applied on various libraries and "hold" on various use-cases I think we can ease future usage a lot and avoid potential caveats if we get the architecture adequately right.

holgerd77 commented 6 years ago

One thing I'm not sure about is if the hardfork choice should be really be bound tightly to the block number, I think for simulation contexts it is probably useful to just instantiate a Byzantium-compatible block.

I also think it can't be taken for granted regarding future hardforks that it remains the case that only the validation logic is HF-dependent, a future HF might e.g. (just taken randomly) change this._common.param('vm', 'maxExtraDataSize') to something else. There is also the blockhash refactoring on the horizon with Constantinople, not sure what this will bring, haven't digged into this yet. Just want to make sure that the structure allows for this.

vpulim commented 6 years ago

@holgerd77 Thanks, I appreciate your willingness to take the time to discuss and avoid future problems. I understand better now the benefits of your proposal in the simulation context. I also agree that block structure may change in future hardforks (I would lean towards having Block subclasses in that case, but your approach works just as well).

I'm heavily biased by the go-ethereum architecture (which decouples chain configs from blocks), but it's not necessarily the best approach.

holgerd77 commented 6 years ago

Hi @vpulim, thought about this a lot and did another take. It is now possible to instantiate a Block either with providing network and optionally now as well the hardfork or by directly passing a Common instance.

I dropped the default byzantium hardfork setting, the new behavior is the same regardless of the instantiation method described above: if the hardfork is set the block will behave like a hardfork-specific block regardless of the block number. If only the network is set parameters and capability is taken from the block number. For the validation code there will be an error for either non-Byzantium blocks or blocks with a non-byzantium block number. So no undefined behavior here. It also remains possible to instantiate a Block without any of the two instantiation additions, so just new Block(), this will work for everything and also just throw on validation.

Instantiation has gotten a bit more complex, I think this should be acceptable and worth it though since this allows for greater flexibilty. The instantiation with network/hardfork actually is more or less just semantic sugar, but I would like to leave this for comfort and for situations where it is nice to have explicitly new Block(null, { 'network': 'mainnet', 'hardfork': 'byzantium' } in the code to directly have references on how a block will behave.

I think this design will allow for:

Flexible block number dependent usage
Simulation contexts
Future switches on block changes
Avoidance of undefined behavior

Sorry if I write too much and maybe a bit confusing, didn't have that much time. This is still a bit rough maybe, I also need to add more test cases, just wanted to have an opinion on this first.

vpulim commented 6 years ago

@holgerd77 I like your new changes. I think this is a good solution and does provide much more flexibility.

By the way, this is minor feedback, but do you think "common" is the best name to use for the opts.common field? I wonder if something like opts.config would be more readable... I know the name "Common" derived from the original ethereum-common package, but it might be a little confusing when developers don't know the history and are only looking at the Block API.

holgerd77 commented 6 years ago

@vpulim Thanks for the feedback!

Regarding the naming: hmm, don't think it will get any better when this is renamed and differ from the ethereumjs-common library naming. Another option would be to rename the base library - I asked this myself as well a bit on a sidetrack but didn't picked it up further - but I would feel this would be a bit too much of a hazzle.

I'll make this make this clearer in the API comments though, you're right, this is not really understandable if one has no context.

vpulim commented 6 years ago

I'll make this make this clearer in the API comments though

👍

holgerd77 commented 6 years ago

Hi @vpulim, ok, I did the changes we discussed and also added the additional tests I promised, which got a serious bug caught - comparing the block number as a Buffer and not an Integer, which I also fixed along the way.

Can you do a (hopefully :-)) final review on this?

vpulim commented 6 years ago

Did another review pass and everything looks good to me 👍

ethereumjs / ethereumjs-block

Support for different chains / hardfork switches #44