confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0
44 stars 71 forks source link

'genproto' workaround blocks Google Cloud imports #1212

Closed cfircohen closed 3 weeks ago

cfircohen commented 11 months ago

Hello, I'm adding Peer-pod support on GCP, see https://github.com/cfircohen/cloud-api-adaptor/tree/peerpod_gcp/gcp.

I'm currently blocked by issue 62 ("panic: protobuf tag not enough fields in Status.state") which has a workaround that replaces genproto with an older 2020 version. Workaround was removed in issue 175 ("go.mod contains unnecesary replace directive") and reintroduced in issue 180 ("Removal of the TTRPC workaround causes panic").

Google cloud GO SDK (cloud.google.com/go/compute) requires the latest genproto package: the older 2020 version doesn't have the necessary bindings, and downgrading GCP client SDK also doesn't work due to missing packages in the old 2020 genproto.

Please advise how to proceed.

bpradipt commented 11 months ago

@yoheiueda ptal

yoheiueda commented 11 months ago

I don't have a quick answer to this problem...

We could work around the issue, if golang could allow coexistance of multiple module versions in a single binary, but I guess it is not allowed in the go module system. https://go.dev/ref/mod#minimal-version-selection

The best solution is to eliminate the dependency in the Kata Containers.

I removed the replace line from Kata's go.mod file, and I was able to build the kata shim.

https://github.com/kata-containers/kata-containers/blob/7153b51578e8b24cf34a53134fcd0cb43060f1de/src/runtime/go.mod#L116

git clone -b CCv0 https://github.com/kata-containers/kata-containers.git
cd kata-containers/src/runtime
go mod edit -dropreplace google.golang.org/genproto
go mod tidy
go mod vendor
make $PWD/containerd-shim-kata-v2

I haven't tested its functionality yet.

If we can confirm that peer pods work with the modified kata shim, we can raise a PR to Kata Containers.

yoheiueda commented 11 months ago

The protobuf API for agent protocol defined in Kata is referenced in Cloud API Adaptor, so I think removing dependency to Kata is difficult for now. https://github.com/confidential-containers/cloud-api-adaptor/blob/1cd66fe3daadfbecc625ca3d465020b8284155be/pkg/adaptor/server.go#L16

cfircohen commented 11 months ago

I don't have a quick answer to this problem...

We could work around the issue, if golang could allow coexistance of multiple module versions in a single binary, but I guess it is not allowed in the go module system. https://go.dev/ref/mod#minimal-version-selection

The best solution is to eliminate the dependency in the Kata Containers.

I removed the replace line from Kata's go.mod file, and I was able to build the kata shim.

https://github.com/kata-containers/kata-containers/blob/7153b51578e8b24cf34a53134fcd0cb43060f1de/src/runtime/go.mod#L116

git clone -b CCv0 https://github.com/kata-containers/kata-containers.git
cd kata-containers/src/runtime
go mod edit -dropreplace google.golang.org/genproto
go mod tidy
go mod vendor
make $PWD/containerd-shim-kata-v2

I haven't tested its functionality yet.

If we can confirm that peer pods work with the modified kata shim, we can raise a PR to Kata Containers.

How do you normally test these changes? I assume Kata has an automatic e2e testing environment? Did all the unit-tests pass? Were you able to build a new podvm image and run "confidential-containers" e2e tests against it?

yoheiueda commented 11 months ago

How do you normally test these changes? I assume Kata has an automatic e2e testing environment? Did all the unit-tests pass? Were you able to build a new podvm image and run "confidential-containers" e2e tests against it?

I didn't tested the functionality when I posted my previous comment.

I just tested the updated go.mod using cloud-api-adaptor, and got the the following error.

2023/07/20 07:49:21 [adaptor/proxy] StartContainer: containerID:70d7e9067494ad82e66f684baf7f9917430cf2194b8823024f76e43b250a8422
panic: protobuf tag not enough fields in Status.state:

goroutine 71 [running]:
github.com/gogo/protobuf/proto.(*unmarshalInfo).computeUnmarshalInfo(0xc000616fa0)
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/table_unmarshal.go:341 +0x139f
github.com/gogo/protobuf/proto.(*unmarshalInfo).unmarshal(0xc000616fa0, {0xc0002e0ba0?}, {0xc000940582, 0x11, 0x558})
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/table_unmarshal.go:138 +0x67
github.com/gogo/protobuf/proto.makeUnmarshalMessagePtr.func1({0xc000940581, 0x12, 0x559}, {0x0?}, 0x0?)
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/table_unmarshal.go:1826 +0x151
github.com/gogo/protobuf/proto.(*unmarshalInfo).unmarshal(0xc000616f00, {0x1ce9700?}, {0xc000940580?, 0xc00062ada0?, 0x40d907?})
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/table_unmarshal.go:175 +0x35f
github.com/gogo/protobuf/proto.(*InternalMessageInfo).Unmarshal(0xc0002cc820?, {0x21ec1c0, 0xc0002cc1c0}, {0xc000940580?, 0x13?, 0x55a?})
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/table_unmarshal.go:63 +0xd0
github.com/gogo/protobuf/proto.(*Buffer).Unmarshal(0xc000565e18, {0x21ec1c0, 0xc0002cc1c0})
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/decode.go:424 +0x153
github.com/gogo/protobuf/proto.Unmarshal({0xc000940580, 0x13, 0x55a}, {0x21ec1c0, 0xc0002cc1c0})
    /root/go/pkg/mod/github.com/gogo/protobuf@v1.3.2/proto/decode.go:342 +0xe6
github.com/containerd/ttrpc.(*Client).recv(0xc0003d6220?, 0x0?, 0x0?)
    /root/go/pkg/mod/github.com/containerd/ttrpc@v1.1.0/client.go:378 +0xf5
github.com/containerd/ttrpc.(*Client).run.func2()
    /root/go/pkg/mod/github.com/containerd/ttrpc@v1.1.0/client.go:318 +0x24b
created by github.com/containerd/ttrpc.(*Client).run
    /root/go/pkg/mod/github.com/containerd/ttrpc@v1.1.0/client.go:286 +0x17b
exit status 2

I think Kata needs to migrate to the google/protobuf from gogo/protobuf, since goto/protobuf has been deprecated.

gogo/protobuf is widely used in Kata, so replacing all of them will require some work... https://github.com/search?q=repo%3Akata-containers%2Fkata-containers+gogo%2Fprotobuf&type=code

cfircohen commented 11 months ago

Thanks for testing, @yoheiueda. I see that @littlejawa had a comment here about Containerd moving away from gogo/protobuf in their main branch, and how Kata can now explore removing that dependency as well. Should we ping Julien and ask for status?

yoheiueda commented 11 months ago

@cfircohen That would be great.

I am trying to update the protobuf code in Kata Containers, but haven't succeeded yet. I am currently facing this issue https://github.com/containerd/containerd/issues/8850.

yoheiueda commented 11 months ago

To fix this issue, I believe we need to remove the dependency to gogo in Kata. I raised an issue about it https://github.com/kata-containers/kata-containers/issues/7420.

I think we need some volunteers to fix the issue in Kata, since it is complicated, and needs knowledge of both Kata and protobuf.

If fixing the issue in Kata needs some time, we may need to develop some work around in the cloud-api-adaptor side.

@bpradipt @stevenhorsman @littlejawa any comments?

beraldoleal commented 9 months ago

Hi @yoheiueda and @cfircohen , sorry for the late reply here.

Just to give an update: we are working to remove gogoprotobuff from Kata and make this move as soon as possible. It is a high-priority for us right now.

In meanwhile I will check with others if workarounds are possible here.

beraldoleal commented 3 weeks ago

genproto workaround is not in place since #1754. Closing this.