hyperledger / fabric-protos

https://hyperledger.github.io/fabric-protos/
Apache License 2.0
41 stars 74 forks source link

Document approach to versioning and breaking changes #129

Closed jt-nti closed 2 years ago

jt-nti commented 2 years ago

It would be useful to agree how the APIs in this repo should be managed in terms of versioning and breaking changes.

Until recently there wasn't really any version involved and it was left as an exercise for the reader how to manage which proto definitions to use, resulting a range of approaches. Even for Go, where there was a pre-generated go bindings repository, a v0 pseudo-version number was used, with no protection against breaking changes.

With the recent introduction of pre-generated Java and Node bindings, a version number is required. So far this has been a pre-release v0 version to get the new CI pipelines working, which also include breaking change detection.

My assumption is that we should be avoiding breaking changes, which would make a simple v1 Go module version numbering system possible for all generated bindings.

There may be situations that genuinely require breaking changes, however these should be rare and involve serious thought about whether it is absolutely necessary. Should a breaking change be necessary, we would need to consider how the versioning system would handle the change. For example, Google's approach is that package names for versioned APIs must end with the version, and adding a version suffix to the generated Go module is one possible way to handle the Go module version numbering system.

bestbeforetoday commented 2 years ago

I agree we should not be making breaking changes. Absolutely positively not without a major version number change and associated package naming changes, as described by the Go module versioning guidance.

An example of why breaking changes are a bad idea is that both Fabric v2.5 and v3.0 are currently in active development. If the v3.0 development introduces a breaking change to the protobufs, that prevents the v2.5 (and v2.x in general) stream from making protobuf changes or picking up any new versions of the protobufs, since they are now broken for Fabric v2.

bestbeforetoday commented 2 years ago

The Go documentation has pretty strong recommendations on how to publish breaking changes in Go modules. We would need to publish both the "current" (Fabric v2.x) fabric-protos and a parallel v2 package structure for the Fabric v3.x version of the fabric-protos containing the breaking changes. We would need to publish both versions in Java and Node too, but that can just be done with semantic versioning of the published packages. There is a chunk of work involved for sure.

The easiest approach, which generally seems to be the one taken in the Go standard libraries, would be to not introduce breaking changes. This has very much been the approach in the protos up until now.

jt-nti commented 2 years ago

After an initial discussion with @denyeart @bestbeforetoday @mbwhite and @andrew-coleman the proposal is to follow the Go module versioning scheme and create a v1 release for Fabric v2.5, and a v2 release with the breaking changes for Fabric v3.

The easiest approach, which generally seems to be the one taken in the Go standard libraries, would be to not introduce breaking changes. This has very much been the approach in the protos up until now.

As Mark says, this is a significant change to the previous approach and will require some effort to implement.

cc @ale-linux, @C0rWin, @cendhu, @guoger, @manish-sethi, @yacovm for discussion.

Once we've confirmed the new approach to versioning, I'll document it and open the required issues to implement it.

Until the new versioning approach has been fully implemented, I would strongly recommend reverting 83a29816c327c778913d5755ab271bdfd0938731 since there is currently a risk of things getting more complex/broken with a Fabric v2.5 release.

denyeart commented 2 years ago

Thinking more about this... updating a module to v2 is a big deal in Go as it requires adding /v2 to the module path, which would cause large changes in every consumer due to all the file imports, including in fabric itself. At the same time, the breaking change being discussed here isn't a breaking change to Fabric external consumers such as SDKs and applications, it is a change that only impacts an internal consumer, the ordering service node. I think a good compromise would be to release fabric-protos and fabric-protos-go as v1.0 prior to the change (for Fabric v2.x to use), and as v1.1 after the change (for Fabric v3.x to use). Going from v0 to v1 doesn't impact the module path, and is probably the right thing to do regardless. Fabric and other consumers can then import an actual v1.x.x release, instead of using a pseudo version such as v0.0.0-20220827195505-ce4c067a561d.

bestbeforetoday commented 2 years ago

A minor version change implies that the changes guarantee backward compatibility and stability. I don't think this is the case here. The easy path suggestion of using a minor version number change still means we can't make breaking changes until after Fabric v2.5 is released, and after that point Fabric v2.5 cannot pick up any protobuf changes that might be required to address issues.

denyeart commented 2 years ago

The version number is primarily a signal to external consumers, and since there is no impact to external consumers such as SDKs and applications, I'm not seeing a need to increment the major version and cause disruption to every consuming repository. Ideally we'd split the API into internal and external APIs to make the versioning more clear for each, but that ship has sailed, I'd prefer the versioning reflect impact to external consumers.

The proposal would entail our typical use of release branches and could be implemented right now:

bestbeforetoday commented 2 years ago

I worry that this is subverting the Go module versioning contract by introducing breaking changes in a minor version increment just to save us effort. There is still significant amount of build / publish rework required. Perhaps just as much as going to v2. The simplest solution would be to not make a breaking change. Mark things we don't need in Fabric v3 as deprecated but leave them there, just like Go does for its standard libraries for the lifetime of the v1.x releases.

denyeart commented 2 years ago

Going to v2 is a significantly larger disruption since it would impact many files in each of the consumer repositories due to the import changes.

I do agree the simplest solution would be to not make a breaking change at all and just keep kafka in there but deprecated and not used. Let's see if the orderer folks would be agreeable to revert the change - @yacovm @tock-ibm

yacovm commented 2 years ago

I don't understand why we need to care about the module version.

If you want to use Fabric v3.0 you use the protos of branch main, or release 3.0 If you want to use Fabric v2.5 you use the protos of banch release-2.5

You can specify a branch or a commit when you vendor your dependency, can you not?

Why do we care about module versions here? This is not a library, it's a repository with a bunch of autogenerated code.

yacovm commented 2 years ago

Ideally we'd split the API into internal and external APIs to make the versioning more clear for each, but that ship has sailed

Exactly, this is the real problem here.

We have wire protocol between the clients and the nodes and we have wire protocol that is used among the nodes.

Historically everything was in fabric and then the SDKs were importing (or copying? I don't remember) Fabric's protos.

What did we do? Get all of the protobufs outside*, internal and external alike.

* modulo some protobufs of ledger which somehow survived the purge

yacovm commented 2 years ago

Until the new versioning approach has been fully implemented, I would strongly recommend reverting https://github.com/hyperledger/fabric-protos/commit/83a29816c327c778913d5755ab271bdfd0938731 since there is currently a risk of things getting more complex/broken with a Fabric v2.5 release.

You want to revert a change that doesn't break anything because there is a risk of things being complex or broken in the future?

denyeart commented 2 years ago

You can specify a branch or a commit when you vendor your dependency, can you not?

Technically you can, but this is typically done for pre-release/unstable builds. The protos have been stable for years. The Go convention for signaling a stable contract is to give it a semver version that consumers can import with confidence.

I think there is general agreement that there is value in being able to maintain the protos separately for Fabric v2.x and Fabric v3.x, so minimally we need a release branch for each. I also assert that we should version each with external consumers in mind. In my opinion the best way to signal to external consumers that there are two stable versions, and no changes that would break them, would be to label them v1.0.x (Fabric v2.x) and v1.1.x (Fabric v3.x).

jt-nti commented 2 years ago

One additional possibility is to leave Fabric < v2.x at a 0.x fabric-protos version (i.e. no guarantee of compatibility) and Fabric v3.x+ as a "properly" versioned v1.x, v2.x, etc. fabric-protos version. That kind of side steps the breaking change issue this time round but we would commit to not doing the same thing again without a major version change. Unfortunately it doesn't seem like we're willing to do that.

If we are only going to commit to versioning a subset of fabric-protos properly, we should define what parts will be compatible (and update the builds to verify that).

denyeart commented 2 years ago

@jt-nti I think your proposal is a decent compromise. Fabric v1.x and v2.x have a long history of not using a version number for the protos, so versioning the protos as v0.x (in a release branch) is a step forward in terms of clarity, both for 'internal' fabric consumers and 'external' application consumers that have previously relied on a Go pseudo version.

Similarly, for Fabric v3.x versioning the protos as v1.x (in main branch) would be a step forward and indicate that the protos are stable.

Going forward, I think the version should reflect the contract that 'external' SDK and application consumers depend on.

@yacovm @bestbeforetoday can you live with this compromise?

yacovm commented 2 years ago

I could live with anything you would do to Fabric, but I still don't understand what breaking changes you're talking about.

What impact on client side wire protocol is there for Fabric 3.x ? What are we deprecating that clients care about?

I suggest we not make changes that add more work to those working on Fabric.

bestbeforetoday commented 2 years ago

Starting to version existing (Fabric v2) published Go protobuf bindings as v0.x and the new (Fabric v3) protobufs at v1.x sounds like a pretty good solution to me. It acknowledges that changes to the protobufs have occurred while avoiding the pain of v2.x Go versioning.

The idea of setting the version based on only the external parts of the protobuf bindings sounds a little bit more tricky. For a Go module, everything publicly visible typically constitutes the externals. How do you suggest indicating which protobuf messages and/or packages are externals (so affect the versioning) and which are private / internal?

jt-nti commented 2 years ago

We could generate the new bindings for a subset (externals) of the proto files and enforce compatibility checks on those, and leave the old apiv1 Go bindings alone as an internal only un-versioned, no-guarantees module? (I'm attempting to try that out but hitting build issues on an arm mac.)

bestbeforetoday commented 2 years ago

I guess we would need to do similar splitting of external and internal for fabric-protos-go-apiv2? It's going to be pretty challenging to figure out what is potentially an external. Remember, anything that can be contained anywhere in the serialized-protobuf-within-protobuf block and private data structures are externals, as is anything passed in or out of a system chaincode transaction function or gRPC service provided by Fabric.

bestbeforetoday commented 2 years ago

I could live with anything you would do to Fabric, but I still don't understand what breaking changes you're talking about.

There is an issue of breaking changes in the protobufs in general that we've been ignoring for some time. This PR is a concrete example of a breaking change to published protobuf bindings that went out with Fabric v2.4:

https://github.com/hyperledger/fabric-sdk-go/pull/208

An update to newer Fabric protobuf bindings caused a public API change in fabric-sdk-go. This update was required because those same protobuf changes meant that an application making use of both fabric-sdk-go and fabric-gateway could see runtime failures due to differences between the versions of protobuf bindings on which they depend. This issue would exist for any other module with either direct dependency on fabric-protos-go, or indirect dependency on them by (for example) depending on core fabric, such as fabric-smart-client.

What impact on client side wire protocol is there for Fabric 3.x ? What are we deprecating that clients care about?

The wire format is pretty robust, provided the build enforces breaking changes are not made to the wire format and nobody disables that checking to get PRs in. The issue is more with the published protobuf bindings built from the protobuf definitions, as described above. This has implications beyond the wire format of protobuf messages. Things like Go module versioning semantics become an issue.

A solution could be to not publish any protobuf bindings and require all projects that make use of the Fabric protobuf definitions (including core Fabric) to compile and package their own version of the protobuf language bindings. This is what the legacy Node and Java SDKs did. However, this is a bit of a maintenance nightmare and is very unhelpful to application developers making use of Fabric protobuf messages, since they are now expected to compile their own protobuf bindings for use by their applications. I think a better approach is to publish the protobuf language bindings for use by Fabric projects and application developer alike (as we are doing today), and for us not to introduce breaking changes to the protobufs. Or at least apply appropriate versioning to the published bindings if/when breaking changes really are necessary.

jt-nti commented 2 years ago

Hmmm, good point, I'd forgotten about all the stealth wrapping of messages within binary blobs. Maybe it's a non-starter but I just went for everything in gateway, common, ledger, msp, and peer, plus the orderer/ab.proto file to try it out... #134

bestbeforetoday commented 2 years ago

Fabric v3 really should be updated at some point before release to depend on fabric-protos-go-apiv2 instead of fabric-protos-go, since the protobuf modules used by the latter are long deprecated. That might reduce the work involved in dealing with breaking changes between Fabric v2 and v3 protobuf definitions. Just remember that fabric-protos-go-apiv2 is already in use by fabric-gateway v1.1.0, which lines up with Fabric v2.4, so we can't ignore version inconsistencies completely there either. But we could deal with it only there and not worry about fabric-protos-go.

jt-nti commented 2 years ago

Fabric v3 really should be updated at some point before release to depend on fabric-protos-go-apiv2

That would only be possible if we don't make fabric-protos-go-apiv2 an external only module. Fabric v3 should probably generate its own bindings to get the internal stuff if we did go that way, which is sort of an inversion of what we had before.

mbwhite commented 2 years ago

I've just encountered a versioning issue that shows the challenge.

In an application that uses the fabric-token-sdk, I need to in my application add compensation for transitive dependencies. From an Engineering perspective, this is wrong; some these are now referring back to releases nearly 6 years old. This makes me a very nervous engineer - how fragile is this system? What if there's a security patch needed and some library needs to jump forward by 6 years.

What's worse is that these aren't my dependencies - but being forced by libraries in this case 2 or 3 levels of indirection down.

replace (
        github.com/go-kit/kit => github.com/go-kit/kit v0.7.0
        github.com/hyperledger/fabric => github.com/hyperledger/fabric v1.4.0-rc1.0.20210722174351-9815a7a8f0f7
        github.com/hyperledger/fabric-protos-go => github.com/hyperledger/fabric-protos-go v0.0.0-20201028172056-a3136dde2354
)

I've not followed the details of this discussion - but I would 110% support any scheme that gives proper SemanticVerstion control over external interfaces of any type or nature.

denyeart commented 2 years ago

The conversation here has meandered a bit. Let's get back to the main point which is defining a versioning approach for fabric-protos, fabric-protos-go, and related Java and Node published artifacts. I'll summarize the status quo and proposal that seemed to get the most consensus. We can handle additional items such as compatibility checks, apiv2, build approach in follow-on issues.

Status quo Use of release branches matching Fabric release branches, dependent projects reference a pseudo-version to point to a commit within the relevant release branch, e.g.:

Proposal

Each Fabric and SDK client repo can then reference a specific version (per release branch) rather than a pseudo-version for additional clarity.

The release branch approach is typical for Fabric repositories and enables 3rd digit fix maintenance without requiring consumers to move up to the next major.minor release.

Going forward, we follow semantic versioning rules. Given the mix of 'internal' and 'external' APIs defined in the repo, semantic versioning would be based on 'external' APIs, where 'external' APIs are defined as any API that an official hyperledger client SDK depends on (legacy or gateway SDK).

yacovm commented 2 years ago

For 0. and 1. the import path remains the same, right? It only changes when/if we want a version 2, correct?

bestbeforetoday commented 2 years ago

Proposal

  • release-2.4 gets versioned as v0.1.x.
  • release-2.5 gets versioned as v0.2.x.
  • main gets versioned as v1.0.x.

Each Fabric and SDK client repo can then reference a specific version (per release branch) rather than a pseudo-version for additional clarity.

Is the plan to use the same branches in the publish repositories (fabric-protos-go, fabric-protos-go-apiv2)? For example, Fabric v2.5 protos will be published from a release-2.5 branch in this repository to a release-2.5 branch in fabric-protos-go(-apiv2)?

How do consumers import the correct version? Will Fabric v2.5 depend on the release branch (github.com/hyperledger/fabric-protos@release-2.5) or a specific version tag?

Going forward, we follow semantic versioning rules. Given the mix of 'internal' and 'external' APIs defined in the repo, semantic versioning would be based on 'external' APIs, where 'external' APIs are defined as any API that an official hyperledger client SDK depends on (legacy or gateway SDK).

How do we identify and define what are internals and what are externals? Bear in mind that everything that can appear anywhere in a Block protobuf is an external visible to clients, as are all the gRPC services and complete message structures passed to or from those services. I suspect that leaves very little that is not part of the externals.

mbwhite commented 2 years ago

As a use case, I've had to recently rebuild the protos to experiment with a change around Fabric 2.5. The difficulty is what level of fabric-protos do I clone, to generate a library that Fabric2.5 itself can pull in. This isn't exactly obvious.

I think it might be helpful to list some detailed scenarios that show which branch code is built from, how the source is versions, what pulls that in with what version etc.

denyeart commented 2 years ago

@yacovm Correct v0.x and v1.x have no impact on import paths.

@bestbeforetoday @mbwhite We'd use consistent release branches across the repositories. e.g.: fabric-protos release-2.5 builds fabric-protos-go release-2.5 and is used in fabric release-2.5. This is already true as status quo. During development you could target a release branch commit with pseudo-version like today. For the ultimate Fabric release you'd target a fabric-protos-go versioned release. I expect we'd do the same with fabric-protos-go-apiv2. If there are no proto changes in a certain minor release of Fabric, we don't necessarily need to create the release branch in fabric-protos-*.

The plan is not to make breaking changes going forward, so there won't be many occurrences where we need to evaluate if a change impacts 'internal' versus 'external' APIs. The removal of kafka was one of these rare occurrences and in that case it was easy to determine that there is no impact to client SDKs.

mbwhite commented 2 years ago

@denyeart "Make it so" 👍

yacovm commented 2 years ago

@yacovm Correct v0.x and v1.x have no impact on import paths.

Then let's proceed as you suggest

jt-nti commented 2 years ago

@denyeart for this approach to work, we need to define what constitutes the 'external' API vs the 'internal' API, document each in some way (ideally making it very clear which parts of the API not to use), and update the builds with suitable breaking change detection. Who is able to define the internal/external split?

denyeart commented 2 years ago

@jt-nti In the proposal I said 'external' APIs are defined as any API that an official hyperledger client SDK depends on (legacy or gateway SDK). Agree that hardening that into the build is a nice-to-have, but need not block the incremental progress of shifting from status quo to versioning the protos.

jt-nti commented 2 years ago

@denyeart if we don't explicitly define what constitutes the external API in this repo, then we cannot claim to be following Go's semantic versioning rules, since there is essentially no way for anyone using the generated Go, Java and Node bindings to tell what might break in the future. To incrementally shift from no version to a v1 release, I think we need to version main as v0.3.x. for now, i.e. we make no guarantees about backward compatibility yet. We can move to v1 when/if we clarify what's in the external API.

denyeart commented 2 years ago

@jt-nti OK, I'm good with: