container-storage-interface / spec

Container Storage Interface (CSI) Specification.
Apache License 2.0
1.32k stars 371 forks source link

1.10 generated files breaking changes #570

Open jsafrane opened 3 weeks ago

jsafrane commented 3 weeks ago

Update to https://github.com/container-storage-interface/spec/pull/552/ has broken quite some Kubernetes CSI sidecars.

So far I identified two issues:

  1. gomock cannot easily compare two messages now. Protobuf now stores some private stuff in state field, e.g. NodeGetInfoResponse: https://github.com/container-storage-interface/spec/blob/2f340622302a8d26950aba0f73f240ea1132a67b/lib/go/csi/csi.pb.go#L4827-L4830 And this private field has quite random values.

    It is possible to implement a custom gomock matcher that compares just the public fields, or use proto.Equal, as suggested in https://protobuf.dev/reference/go/faq/#deepequal.

  2. It's not possible to copy message content. For example, this code copies value of NodeGetInfoResponse:

    nodeInfo, err := ctrl.GetNodeInfo(grpcClient, *operationTimeout)
    if err != nil {
        klog.Fatalf("Failed to get node info from CSI driver: %v", err)
    }
    nodeDeployment.NodeInfo = *nodeInfo

    Such a copy will copy also the private fields and one of the fields is sync.Mutex, which should not be copied (govet complains hard about it). In this particular case, the Kubernetes external-provisioner can use a pointer, it does not need to copy the struct content.

Both issues are solvable using proper protobuf functions (WIP PR here), but they were quite surprising for a minor go package bump.

I am not asking for a revert, just for a better communication of the breaking changes and semantic versioning of go packages. This should not be a minor package bump. There should be probably a separate repo with the generated go code, so we can bump the versions independently on CSI spec. Kubernetes solved this by calling all its go packages 0.x, so no go API stability should be expected, and they break their API from time to time.

gnufied commented 3 weeks ago

I think generated mock defaults to Eq matcher and moving to a better matcher for protobuf matches is indeed correct solution. So we were probably using incorrect code all along. :(

The protobuf messages embedding RWMutex is a unfortunate decision, but its been taken 5 years ago - https://github.com/protocolbuffers/protobuf-go/commit/c37adefdac3634ee9b6f9494ff75001b93d01c28#diff-781ba79d06e543a67fe8ac84c0732cbcb8f27ed43123e3a3bb81bcb4e84c8a8b so... :/

jsafrane commented 2 weeks ago

I created https://github.com/container-storage-interface/spec/pull/572 to regenerate the structures in the old way. A proof of concept that it works is at https://github.com/kubernetes-csi/external-provisioner/pull/1266.

saad-ali commented 2 days ago

Thanks a lot @jsafrane for catching this and for helping put together a rollback PR.

We discussed this with at the k8s Storage SIG meeting and the CSI Community Meeting and got input from a CSI driver author (AWS EBS team) who went through a 1.10 version bump.

Based on those discussions, and given that:

  1. It is important to keep to an "up to date" version of our dependencies.
  2. The "on the wire" CSI RPC is backwards compatible and has not had breaking changes.
  3. These breaking changes are in code generation and are immediately obvious to SPs when they bump the API (build fails, there is no silent failure).

We have decided that we will keep the changes as is, and document them as breaking changes in release notes (and send out an email to notify folks).

jdef commented 1 day ago

Good call, Saad

On Thu, Sep 12, 2024, 2:42 AM Saad Ali @.***> wrote:

We discussed this with at the k8s Storage SIG meeting and the CSI Community Meeting and got input from a CSI driver author (AWS EBS team) who went through a 1.10 version bump.

Based on those discussions, and given that:

  1. It is important to keep to an "up to date" version of our dependencies.
  2. The "on the wire" CSI RPC is backwards compatible and has not had breaking changes.
  3. These breaking changes are in code generation and are immediately obvious to SPs when they bump the API (build fails, there is no silent failure).

We have decided that we will keep the changes as is, and document them as breaking changes in release notes (and send out an email to notify folks).

— Reply to this email directly, view it on GitHub https://github.com/container-storage-interface/spec/issues/570#issuecomment-2345397274, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAR5KLFJOC2ANZB7XTVWW3DZWEZV5AVCNFSM6AAAAABM4BQXR2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNBVGM4TOMRXGQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>