olljanat / csi-plugins-for-docker-swarm


Shared volumes don't mount with `deploy: {mode: global}`? #19

Open jonaslb opened 2 months ago

jonaslb commented 2 months ago

Hi, again thanks for pioneering this functionality! I'm currently testing the nfs (and smb) drivers. What I've realised is that while e.g. deploy: {replicas: 3} in the service definition works fine (for any number of replicas), deploy: {mode: global} causes all tasks except one to stay "Pending"; they never start. I don't know if it matters, but on this cluster the driver is only installed on a selection of nodes (including all of the manager nodes). I've limited the service with placement constraints to run only on those nodes, so I don't think that should be the issue, but maybe you know better.

So anyway, it's not a huge deal, because where we use mode: global we probably shouldn't. Still, I'd like to hear if you know why this happens and whether anything can be done about it when using the same shared volumes (besides not using that deploy mode!).
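For reference, what I'm testing is roughly the stack below (a sketch: the image, mount path and node label are placeholders for our actual setup, the cluster volume is pre-created against the CSI plugin with docker volume create, and the exact mount syntax may vary with Docker version):

version: "3.8"
services:
  test:
    image: alpine                        # placeholder image
    command: sleep infinity
    volumes:
      - docker-volume-name:/mnt/shared   # the shared CSI volume
    deploy:
      # replicas: 3 starts fine for any N; the line below leaves all tasks but one "Pending"
      mode: global
      placement:
        constraints:
          - node.labels.csi == true      # hypothetical label marking nodes with the plugin
volumes:
  docker-volume-name:
    external: true                       # pre-created CSI ("cluster") volume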

chrisbecke commented 2 months ago

replicas: 3 will happily start all 3 replicas on a single node if that is the only node available given the service's constraints.

mode: global, on the other hand, acts as if an implicit max_replicas_per_node: 1 is in play, but it should only consider nodes that meet the explicit and implicit deployment constraints for scheduling.
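Spelled out, the replicated equivalent of that implicit behaviour would be a deploy stanza along these lines (a sketch; the label constraint mirrors the hypothetical one above):

deploy:
  mode: replicated
  replicas: 3
  placement:
    max_replicas_per_node: 1      # explicit version of what mode: global implies
    constraints:
      - node.labels.csi == true   # same explicit constraint either way

Both forms target the same set of constraint-matching nodes, so in principle they should schedule identically.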

jonaslb commented 2 months ago

I see what you're implying but no, the three replicas did not just all happen to start on a single functional node.

mode: global will happily schedule tasks on nodes without the driver, by the way. But as mentioned, I've used label constraints to exclude those. docker service ps shows that the tasks get scheduled on the correct nodes, where things work fine (i.e. containers are created and started) with replicas: N but not with mode: global.

jonaslb commented 2 months ago

Curiously, doing both at the same time (starting the same test service with mode: global and with replicas: N) allows the mode: global tasks to start just fine. This feels like a docker/swarmkit bug.
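If it helps, the workaround as I read it amounts to deploying two services against the same volume, roughly like this sketch (names are placeholders; the replicated service exists only to make the global one start):

version: "3.8"
services:
  test-global:
    image: alpine                        # placeholder image
    command: sleep infinity
    volumes:
      - docker-volume-name:/mnt/shared
    deploy:
      mode: global                       # these tasks now start...
      placement:
        constraints:
          - node.labels.csi == true
  test-replicated:
    image: alpine
    command: sleep infinity
    volumes:
      - docker-volume-name:/mnt/shared   # ...once this sibling uses the same volume
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.csi == true
volumes:
  docker-volume-name:
    external: true                       # pre-created CSI ("cluster") volume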

olljanat commented 2 months ago

Does docker service ps --no-trunc <service name> tell anything about why it is pending?

However, it might be that this is simply an untested feature. One would need to build Docker from source and add some extra logging to see where it fails.

jonaslb commented 2 months ago

Unfortunately there is no extra info with that flag. The full "Current State" description is "Preparing x minutes ago". No containers for the service appear in docker ps -a on the nodes.

When I start the service with replicas: N and look at the plugin logs (cat /var/run/docker/plugins/xxxxx/*std*) on a node that runs a task, I see this:

I0423 12:05:21.457352       8 utils.go:76] GRPC call: /csi.v1.Node/NodeStageVolume
I0423 12:05:21.457363       8 utils.go:77] GRPC request: {"secrets":"***stripped***","staging_target_path":"/data/staged/ivqh8gwq1911ol4m0tylqf3fw","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"ondelete":"retain","source":"//SMBHOST/SMBSHARE/","subdir":"subdir"},"volume_id":"SMBHOST/SMBSHARE#subdir#docker-volume-name#retain"}
I0423 12:05:21.564442       8 nodeserver.go:209] NodeStageVolume: targetPath(/data/staged/ivqh8gwq1911ol4m0tylqf3fw) volumeID(SMBHOST/SMBSHARE#subdir#docker-volume-name#retain) context(map[ondelete:retain source://SMBHOST/SMBSHARE/ subdir:subdir]) mountflags([]) mountOptions([])
I0423 12:05:21.856709       8 nodeserver.go:415] already mounted to target /data/staged/ivqh8gwq1911ol4m0tylqf3fw
I0423 12:05:21.856743       8 nodeserver.go:217] NodeStageVolume: already mounted volume SMBHOST/SMBSHARE#subdir#docker-volume-name#retain on target /data/staged/ivqh8gwq1911ol4m0tylqf3fw
I0423 12:05:21.856756       8 utils.go:83] GRPC response: {}
I0423 12:05:21.857515       8 utils.go:76] GRPC call: /csi.v1.Node/NodePublishVolume
I0423 12:05:21.857544       8 utils.go:77] GRPC request: {"secrets":"***stripped***","staging_target_path":"/data/staged/ivqh8gwq1911ol4m0tylqf3fw","target_path":"/data/published/ivqh8gwq1911ol4m0tylqf3fw","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"ondelete":"retain","source":"//SMBHOST/SMBSHARE/","subdir":"subdir"},"volume_id":"SMBHOST/SMBSHARE#subdir#docker-volume-name#retain"}
I0423 12:05:21.858098       8 nodeserver.go:81] NodePublishVolume: mounting /data/staged/ivqh8gwq1911ol4m0tylqf3fw at /data/published/ivqh8gwq1911ol4m0tylqf3fw with mountOptions: [bind] volumeID(SMBHOST/SMBSHARE#subdir#docker-volume-name#retain)
I0423 12:05:21.858129       8 mount_linux.go:218] Mounting cmd (mount) with arguments ( -o bind /data/staged/ivqh8gwq1911ol4m0tylqf3fw /data/published/ivqh8gwq1911ol4m0tylqf3fw)
I0423 12:05:21.859738       8 mount_linux.go:218] Mounting cmd (mount) with arguments ( -o bind,remount /data/staged/ivqh8gwq1911ol4m0tylqf3fw /data/published/ivqh8gwq1911ol4m0tylqf3fw)
I0423 12:05:21.861749       8 nodeserver.go:88] NodePublishVolume: mount /data/staged/ivqh8gwq1911ol4m0tylqf3fw at /data/published/ivqh8gwq1911ol4m0tylqf3fw volumeID(SMBHOST/SMBSHARE#subdir#docker-volume-name#retain) successfully
I0423 12:05:21.861785       8 utils.go:83] GRPC response: {}

When I start the service with mode: global, I only get this:

I0423 12:13:59.129434       8 utils.go:76] GRPC call: /csi.v1.Node/NodeUnpublishVolume
I0423 12:13:59.129451       8 utils.go:77] GRPC request: {"target_path":"/data/published/ivqh8gwq1911ol4m0tylqf3fw","volume_id":"SMBHOST/SMBSHARE#subdir#docker-volume-name#retain"}
I0423 12:13:59.129497       8 nodeserver.go:103] NodeUnpublishVolume: unmounting volume SMBHOST/SMBSHARE#subdir#docker-volume-name#retain on /data/published/ivqh8gwq1911ol4m0tylqf3fw
I0423 12:13:59.129538       8 utils.go:83] GRPC response: {}

which I find hard to make sense of. Essentially, it seems that docker treats these tasks and their volume mounts very differently depending on whether they are mode: global or not.

I'm not going to dig further into this, because the way out of this for me right now is to not use mode: global. But maybe this issue will be useful information for someone else in the future.