hyperledger / fabric

Hyperledger Fabric is an enterprise-grade permissioned distributed ledger framework for developing solutions and applications. Its modular and versatile design satisfies a broad range of industry use cases. It offers a unique approach to consensus that enables performance at scale while preserving privacy.
https://wiki.hyperledger.org/display/fabric
Apache License 2.0
15.76k stars 8.86k forks source link

Remove LifecycleV20 application capability #4954

Closed denyeart closed 2 months ago

denyeart commented 3 months ago

Remove LifecycleV20 application capability.

This is the first step in removing all v1.x application capabilities.

v3.0 should not support v1.x application capabilities. Recall that the intent was to clean up old capability code when they are no longer needed. In v2.x both v1.x and v2.x capabilities were supported to assist with capability transition. In v3.x the v1.x application capabilities will not be supported and will be removed to clean up code.

Once the v1.x application capabilities are removed, the peer will be updated to not recognize application capabilities lower than V2_0_0. If a peer tries to start on a channel with a v1.x application capability it will not start.

We will document that user needs to update application capabilities on the channel to at least V2_0_0 before updating peer to v3.0.

This PR:

If there is general agreement on the approach for v1.x application capabilities, we can do the same for v1.x orderer capabilities and v1.x channel capabilities.

denyeart commented 3 months ago

@C0rWin I realize you don't intend to remove all the old lifecycle code since some of it is still used for fallback. However, I do think we should remove the v1.x capabilities since they are no longer supported. What do you think?

denyeart commented 2 months ago

One thing we need to consider is the case of a new v3.x peer joining an old channel that was originally at a v1.x capability but is now at a v2.x capability. If the new peer tries to join from genesis block it will not be able to process the v1.x capability blocks. The solution would be for the new peer to join from a more recent snapshot instead of the genesis block. But still, we need to determine if it is acceptable to force users into this approach. Any thoughts?

Also, since ordering nodes don't have the option to join from snapshot, can somebody remind me how ordering service channel join works? I believe you can join from a latest config block and prior blocks are not processed. Does this completely avoid the issue for ordering nodes?

yacovm commented 2 months ago

Also, since ordering nodes don't have the option to join from snapshot, can somebody remind me how ordering service channel join works? I believe you can join from a latest config block and prior blocks are not processed. Does this completely avoid the issue for ordering nodes?

Unless my memory betrays me (and I am writing from a laptop that has no IDE / Fabric repo cloned so I did not double check the code), prior blocks are replicated, of course. You only use the latest config block as a source of orderer endpoints + TLS CA certificates, but you should still verify the signatures of prior blocks. You verify the signature of prior blocks by maintaining a channel bundle and updating it every time you digest a config block. The genesis block is replicated without verification, and you "trust" that genesis block to be valid (which at first glance may seem naive, but remember that if an adversary gives you an incorrect genesis block, you wouldn't be able to bootstrap that orderer once it replicates to the config block it joined with, as an incorrect genesis block would mean the hash chain up to the block you joined with, is false). I think that if we remove the lifecycle capability, a V3.0 orderer may not succeed in initializing a channel bundle, as it would encounter a capability it's not aware of. I suggest that you add to this PR an integration test that has two Raft orderers with a filesystem with config blocks with V2 capability and add another Raft orderer which would replicate the blocks until a join block at channel level 3.

denyeart commented 2 months ago

Also, since ordering nodes don't have the option to join from snapshot, can somebody remind me how ordering service channel join works? I believe you can join from a latest config block and prior blocks are not processed. Does this completely avoid the issue for ordering nodes?

Unless my memory betrays me (and I am writing from a laptop that has no IDE / Fabric repo cloned so I did not double check the code), prior blocks are replicated, of course. You only use the latest config block as a source of orderer endpoints + TLS CA certificates, but you should still verify the signatures of prior blocks. You verify the signature of prior blocks by maintaining a channel bundle and updating it every time you digest a config block. The genesis block is replicated without verification, and you "trust" that genesis block to be valid (which at first glance may seem naive, but remember that if an adversary gives you an incorrect genesis block, you wouldn't be able to bootstrap that orderer once it replicates to the config block it joined with, as an incorrect genesis block would mean the hash chain up to the block you joined with, is false). I think that if we remove the lifecycle capability, a V3.0 orderer may not succeed in initializing a channel bundle, as it would encounter a capability it's not aware of. I suggest that you add to this PR an integration test that has two Raft orderers with a filesystem with config blocks with V2 capability and add another Raft orderer which would replicate the blocks until a join block at channel level 3.

Since LifecycleV20 is an application capability I'm pretty sure orderer node will ignore it. Orderer nodes only care about orderer and channel capabilities. But your comment still has merit, as I was considering removing all the v1.x capabilities including orderer capabilities such as V1_4_2. It sounds like this will be problematic even when joining from latest config block with a later capability. I think I'll abandon this PR and the related effort due to known and unknown impacts it may cause to existing users. While the original plan was to eventually retire old capabilities, their existence hasn't really caused much maintenance burden.

If somebody really thinks we should retire the v1 capabilities please speak up, otherwise I will close the PR in the coming days.

C0rWin commented 2 months ago

@C0rWin, I realize you don't intend to remove all the old lifecycle code since some of it is still used for fallback. However, I do think we should remove the v1.x capabilities since they are no longer supported. What do you think?

I was thinking about doing that, but then I realized there might be cases where we might need to upgrade the network and, for maintenance or restore purposes, replay the ledger, and if you remove the capabilities, it will just crash. What is your primary concern making you to remove it?

C0rWin commented 2 months ago

If somebody really thinks we should retire the v1 capabilities please speak up, otherwise I will close the PR in the coming days.

I will not change current capabilities unless they significantly burden users operationally.

denyeart commented 2 months ago

I agree @C0rWin . The motivation was to make the code cleaner, more readable, and more maintainable by removing code paths that are no longer needed. This was always the future vision when we started adding capabilities. But I agree, the old capabilities haven't been much burden and the ability to playback historical transactions can be important. Closing.