kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

Upgrade bundled GlusterFS tools #43069

Closed · bootc closed this issue 6 years ago

bootc commented 7 years ago

I recently tried mounting a GlusterFS volume (run outside Kubernetes) within my K8s cluster, but mounting failed because of "Server is operating at an op-version which is not supported" errors. After much investigation, it appears that the cause is the old GlusterFS tools bundled in the hyperkube image.

Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.4+coreos.0", GitCommit:"97c11b097b1a2b194f1eddca8ce5468fcc83331c", GitTreeState:"clean", BuildDate:"2017-03-08T23:54:21Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

Anything else we need to know:

The underlying issue seems to be that hyperkube is built from a Debian Jessie (stable/8.7) image, and Debian stable only has GlusterFS 3.5.2. GlusterFS 3.8.8 is available in Stretch (testing) as well as in jessie-backports, so it should be reasonably straightforward to pull more recent versions of those packages into hyperkube.
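If the image is rebuilt on Jessie, the newer client could be pulled in roughly like this (a sketch of the apt steps only, assuming the standard jessie-backports suite on a regular Debian mirror):

# Sketch: pull the newer GlusterFS client from jessie-backports during the image build.
echo "deb http://deb.debian.org/debian jessie-backports main" > /etc/apt/sources.list.d/backports.list
apt-get update
apt-get install -y -t jessie-backports glusterfs-client
glusterfs --version   # should now report 3.8.x rather than 3.5.2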

The errors produced go into /var/lib/kubelet/plugins/kubernetes.io/glusterfs/glusterfsvol/glusterfs-glusterfs.log and look like:

[2017-03-10 09:43:33.094004] E [glusterfsd-mgmt.c:1297:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2017-03-10 09:43:33.094087] E [glusterfsd-mgmt.c:1388:mgmt_getspec_cbk] 0-mgmt: Server is operating at an op-version which is not supported
[2017-03-10 09:43:33.126728] E [glusterfsd-mgmt.c:1297:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2017-03-10 09:43:33.126783] E [glusterfsd-mgmt.c:1388:mgmt_getspec_cbk] 0-mgmt: Server is operating at an op-version which is not supported
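For reference, the same failure can be reproduced without going through kubelet by mounting directly with the bundled client (a sketch; the server address gluster1.example.com and volume name myvol are placeholders):

# Run inside the hyperkube/kubelet container; server and volume names are placeholders.
glusterfs --version        # bundled FUSE client, 3.5.x on the Jessie-based image
mount -t glusterfs -o log-level=DEBUG,log-file=/tmp/gluster-test.log gluster1.example.com:/myvol /mnt
tail /tmp/gluster-test.log # shows the same "op-version ... not supported" errors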

Please consider upgrading these tools for interoperability with newer Gluster volumes.

I originally reported this as coreos/coreos-kubernetes#849.

rootfs commented 7 years ago

see #32686

humblec commented 7 years ago

@bootc @rootfs the latest Debian packages are available at https://download.gluster.org/pub/gluster/glusterfs/3.10/LATEST/Debian/ . It looks like the BASEIMAGE/IMAGE referenced in the hyperkube Dockerfile carries older versions. I am not sure who maintains the hyperkube Dockerfile; we need to loop in the maintainer to get it updated.
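For anyone who wants to try those packages directly on a Jessie-based image or host, the repository can be added along these lines (a sketch only; the signing-key URL and apt suite path are assumptions, so check the directory listing at the URL above first):

# Sketch: add the upstream GlusterFS 3.10 repo; verify the key and suite paths against
# https://download.gluster.org/pub/gluster/glusterfs/3.10/LATEST/Debian/ before use.
wget -O - https://download.gluster.org/pub/gluster/glusterfs/3.10/rsa.pub | apt-key add -
echo "deb https://download.gluster.org/pub/gluster/glusterfs/3.10/LATEST/Debian/jessie/apt jessie main" > /etc/apt/sources.list.d/gluster.list
apt-get update && apt-get install -y glusterfs-client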

spiffxp commented 7 years ago

/sig storage

spiffxp commented 7 years ago

/sig release maybe because hyperkube

guydav commented 7 years ago

Hello,

Any updates regarding this issue? I ran into the exact same problem - my containers can mount GlusterFS volumes running on glusterfs 3.7.6, but cannot mount a newer server running glusterfs 3.10.2. I receive the same errors (from kubectl describe pod ...):

[2017-08-28 15:23:54.687805] E [glusterfsd-mgmt.c:1297:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2017-08-28 15:23:54.687836] E [glusterfsd-mgmt.c:1388:mgmt_getspec_cbk] 0-mgmt: Server is operating at an op-version which is not supported

Happy to provide the full details of my environment if it's useful, but I'm running on Azure.
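In case it helps others confirm the mismatch, the client and server versions can be compared like this (a sketch; paths assume a default Gluster install):

# On the Kubernetes node / inside the kubelet container:
glusterfs --version                                      # FUSE client bundled with kubelet
# On one of the Gluster servers:
glusterd --version
grep operating-version /var/lib/glusterd/glusterd.info   # cluster op-version in effect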

Thanks!

humblec commented 7 years ago

Indeed, the version is really old. We even have v3.11 upstream, yet these images ship 3.5. @rootfs whom should we ping on this? Many users are reporting issues because of it.

rootfs commented 7 years ago

@bootc @humblec I am not familiar with the CoreOS/Debian packaging process. Is there any way to build a custom package and install it on your setup?
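Something along these lines might work as a stopgap, layering a newer client on top of the image you already run (untested sketch; the FROM image and tag are placeholders for whatever hyperkube image your kubelet actually uses):

# Untested sketch: derive a local image with a newer glusterfs-client from backports.
cat > Dockerfile <<'EOF'
FROM quay.io/coreos/hyperkube:v1.5.4_coreos.0
RUN echo "deb http://deb.debian.org/debian jessie-backports main" > /etc/apt/sources.list.d/backports.list \
    && apt-get update \
    && apt-get install -y -t jessie-backports glusterfs-client \
    && rm -rf /var/lib/apt/lists/*
EOF
docker build -t hyperkube:gluster-backports .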

fejta-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale

fejta-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten /remove-lifecycle stale

fejta-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close

mweichert commented 6 years ago

We're having the same problem. @rootfs @humblec are you able to re-open this issue? GlusterFS is now about five minor versions (and one major version) behind.