ceph/ceph-csi: CSI driver for Ceph

cephfsplugin container in daemonset restarted: app pod gets "Transport endpoint is not connected" #792

Closed jianglingxia closed 4 years ago

jianglingxia commented 4 years ago

Describe the bug

The daemonset was not upgraded, but the csi-cephfsplugin container was restarted, as follows:

ae33cf5ab2d2 ed6f186ec08a "/usr/local/bin/cephc" 21 hours ago Up 21 hours k8s_csi-cephfsplugin_ceph1-csi-cephfsplugin-czl9r_default_3fd90197-32d4-11ea-9d2e-744aa4028226_0

[root@paas-controller-172-20-0-3:/home/ubuntu]$ docker restart ae33cf5ab2d2 ae33cf5ab2d2

The mount path on the node is:

ceph-fuse 1.0G 0 1.0G 0% /paasdata/docker/plugins/kubernetes.io/csi/pv/pvc-8bdd24b9-3383-11ea-8500-744aa4028242/globalmount

The app pod nginx11-1-dp6lh uses the cephfs volume pvc-8bdd24b9-3383-11ea-8500-744aa4028242:

nginx11-1-dp6lh 1/1 Running 0 9m2s 100.100.0.9 172.20.0.3

But inside the app pod nginx11-1-dp6lh, df -h fails:

df: `/test': Transport endpoint is not connected
Filesystem               Size  Used Avail Use% Mounted on
rootfs                   745G   50G  695G   7% /
overlay                  745G   50G  695G   7% /
tmpfs                     63G     0   63G   0% /dev
tmpfs                     63G     0   63G   0% /sys/fs/cgroup
/dev/mapper/ncl-paasdata 745G   50G  695G   7% /dev/termination-log
/dev/mapper/ncl-paasdata 745G   50G  695G   7% /etc/resolv.conf
/dev/mapper/ncl-paasdata 745G   50G  695G   7% /etc/hostname

and so does ls -ih /:

ls: cannot access /test: Transport endpoint is not connected

Why can the app pod's /test mount path no longer be read or written, so that the app container has to be restarted? Thanks for your reply!

jianglingxia commented 4 years ago

/cephfs

jianglingxia commented 4 years ago

@huaizong @Madhu-1 thanks

jianglingxia commented 4 years ago

Can anyone give some advice on this issue? Thanks. Has the cephfs volume always had this problem?

Madhu-1 commented 4 years ago

@jianglingxia this is a known issue when you are using the cephfs-fuse client and the daemonset container is restarted/upgraded.

@jianglingxia can you please switch to the kernel client until we fix this issue in ceph-csi, both for ceph-fuse and rbd-nbd?
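
For anyone hitting this: ceph-csi selects the cephfs client via the StorageClass mounter parameter. A minimal sketch of a kernel-client StorageClass, with placeholder values for the clusterID, fsName and secret names (none of these come from this issue):

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-kernel
# must match the --drivername the plugin is deployed with
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <cluster-id>
  fsName: <cephfs-name>
  # "kernel" switches from the default ceph-fuse client to the in-kernel client
  mounter: kernel
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
EOF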

jianglingxia commented 4 years ago

@Madhu-1 thanks for your reply,

But the cephfs kernel client needs kernel >= 4.17 to support quota management; waiting for kernel bug fixes can take a long time, so the kernel client may not be applicable in most scenarios. When will the ceph-fuse problem be solved? Do you think it only requires changes to the Ceph client code? Thanks.

devopsjonas commented 4 years ago

I've started to play around with a simple fix: running the ceph csi plugin as a regular systemd unit. The solution is similar to https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/docs/oss-upgrade.md

Example systemd unit:

[Unit]
Description=cephfsplugin
After=network.target

[Service]
Environment=CSI_ENDPOINT=unix:///var/lib/kubelet/plugins/csi-cephfsplugin/csi.sock
Environment=NODE_ID=centos-node-0
ExecStart=/usr/local/bin/cephcsi --nodeid=centos-node-0 --type=cephfs --nodeserver=true --endpoint=unix:///var/lib/kubelet/plugins/csi-cephfsplugin/csi.sock --v=5 --drivername=csi-cephfsplugin --metadatastorage=node --metricsport=8090 --metricspath=/metrics --enablegrpcmetrics=false
Restart=always

[Install]
WantedBy=multi-user.target
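
Assuming the unit above is installed as /etc/systemd/system/csi-cephfsplugin.service (the path is my assumption), enabling it is the usual systemd sequence:

# pick up the new unit file
systemctl daemon-reload
# start the plugin now and on every boot
systemctl enable --now csi-cephfsplugin.service
# confirm it is running and follow its logs
systemctl status csi-cephfsplugin.service
journalctl -u csi-cephfsplugin.service -f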

Now:

  1. I've also added a patch to run ceph-fuse with systemd-run --scope: https://github.com/ceph/ceph-csi/compare/master...devopyio:systemd-run
  2. copy the binary (steps 2-5 are sketched in shell below)
  3. mkdir -p /tmp/csi/keys/
  4. create /etc/ceph-csi-config/config.json with mon data
  5. remove the cephfs plugin from the daemonset
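
A shell sketch of steps 2-5, with placeholder values of my own for the binary location, monitor address, cluster ID and daemonset name:

# 2. copy the cephcsi binary onto the host
cp ./cephcsi /usr/local/bin/cephcsi
# 3. directory the plugin uses for cephfs keyring files
mkdir -p /tmp/csi/keys/
# 4. minimal ceph-csi config listing the cluster's monitors
mkdir -p /etc/ceph-csi-config
cat > /etc/ceph-csi-config/config.json <<EOF
[
  {
    "clusterID": "<cluster-id>",
    "monitors": ["<mon-ip>:6789"]
  }
]
EOF
# 5. remove the node-plugin daemonset so only the systemd-managed copy runs
kubectl delete daemonset csi-cephfsplugin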

Now you should be able to run systemd mounts, restart the systemd service, etc. Nothing seems to kill the fuse mounts, since they run under a different process:

           └─system.slice
             ├─csi-cephfsplugin.service
             │ └─9845 /usr/local/bin/cephcsi --nodeid=centos-node-0 --type=cephfs --nodeserver=true --endpoint=unix:///var/lib/kubelet/p
             ├─run-8666.scope
             │ └─8675 /usr/bin/ceph-fuse /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-8ccdb2c7-6c41-4001-8b52-f0a91834386c/globalmo

To make a patch upstream, it would be nice to ship release binaries with each release, and the systemd-run --scope patch would need to be merged.

Would the ceph-csi maintainers be happy with those patches?

Madhu-1 commented 4 years ago

Are you planning to run cephfs as a systemd service on the host machine rather than inside containers?

devopsjonas commented 4 years ago

Yes, and deploy it with Ansible on every host.

Would you be OK with the ability to run this as a systemd unit? For that I would need a release binary, an option to run the fuse mount with a systemd-run scope, and some extra changes to allow changing directory paths.

Madhu-1 commented 4 years ago

AFAIK this won't be possible, as it goes against the CSI deployment strategy: we need to run ceph-csi as a container, and there are many reasons not to run it as a systemd process on the nodes.

cc @humblec @ShyamsundarR

devopsjonas commented 4 years ago

Then what's the plan to fix this? IMO this is a huge issue for fuse daemons: killing the pod that contains the driver kills the fuse daemons too, which kills all mounts and can corrupt application data.

The only option I see is to run it with systemd. This is how other storage drivers have fixed it (https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver/blob/master/docs/oss-upgrade.md).

More info: https://github.com/kubernetes/kubernetes/issues/70013

humblec commented 4 years ago

@devopsjonas is it possible for you to try the kcephfs (kernel) mounter instead of fuse, as previously suggested? The fuse/userspace daemon restart is a known issue, as you noted, and we don't have a complete solution at the moment. The delay in getting kernel feature support is something we have to keep in mind, as you said, but we try our best to keep that delay short. There are also other issues with the fuse mounter for cephfs that are not visible with the kernel client. Quota support was added in a certain kernel version, and luckily it is available in most distros.

devopsjonas commented 4 years ago

We are running CentOS 7 and I have no clue how to find out which features its kernel supports, so we decided to just use the fuse client. Maybe you have a link where we could look this information up? If it's reasonably up to date, we can definitely switch.
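
A quick sanity check before switching, sketched under my own assumptions (placeholder monitor address, mount point and admin key): mainline kernels need >= 4.17 for cephfs quota, though as noted above distro kernels often carry the support.

# report the running kernel version
uname -r
# try a manual kernel-client mount against one monitor to confirm the client works
mkdir -p /mnt/cephfs
mount -t ceph <mon-ip>:6789:/ /mnt/cephfs -o name=admin,secret=<admin-key>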

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

strigazi commented 3 years ago

Just to clarify: this issue will not be fixed for the fuse client, and we should switch to the kernel one?

cc @humblec