docker-archive / for-azure


azure:cloudstor plugin doesn't load storage correctly #39

Closed: manixx closed this issue 7 years ago

manixx commented 7 years ago

We experienced issues with the azure:cloudstor plugin, where the plugin doesn't load the Azure storage correctly on some nodes.

We have several services that use the same volume; we created them with the docker stack deploy command.
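
For reference, this is roughly how such a shared volume can be created ahead of time with the cloudstor driver (a minimal sketch; the share name "audio" is a hypothetical example, and docker stack deploy can also create the volume automatically from the stack file):

# Create a swarm-wide volume backed by an Azure File Storage share
# (the share name "audio" is a hypothetical example)
docker volume create -d "cloudstor:azure" --opt share=audio production_audio

# The volume should then be listed with the cloudstor driver
docker volume ls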

I created a dummy service to check the mounted storage on two different nodes:

docker service create --constraint "node.id == 5ry73uzy3m4jf8p933civtbar" \
  --mount type=volume,source=production_audio,destination=/audio \
  --name logger --log-driver json-file \
  alpine sh -c 'while true; do sleep 5; ls -l /audio; done'

docker service logs -f logger   # no output on this node

# remove the test service before re-creating it on the second node
docker service rm logger

docker service create --constraint "node.id == qb4oajnqi8tc0wvegkr87ssmi" \
  --mount type=volume,source=production_audio,destination=/audio \
  --name logger --log-driver json-file \
  alpine sh -c 'while true; do sleep 5; ls -l /audio; done'

docker service logs -f logger
logger.1.78dlp35p10kg@swarm-manager00000K    | drwxrwxrwx    2 root     root             0 Jul  4 09:49 projects
logger.1.78dlp35p10kg@swarm-manager00000K    | drwxrwxrwx    2 root     root             0 Sep  4 14:20 recordings
logger.1.78dlp35p10kg@swarm-manager00000K    | drwxrwxrwx    2 root     root             0 Jul  4 09:08 uploads
logger.1.78dlp35p10kg@swarm-manager00000K    | drwxrwxrwx    2 root     root             0 Sep  4 14:21 waveforms
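
As a quicker spot check that avoids creating a service per node (assuming you can SSH into the node in question), a one-off container can list the volume's contents directly:

# Run on the node itself; mounts the existing volume once, lists it, and exits
docker run --rm -v production_audio:/audio alpine ls -l /audio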

We cannot reproduce this issue reliably; it seems to happen randomly, most often when we add a new node to the cluster.

docker node ls

ID                            HOSTNAME              STATUS              AVAILABILITY        MANAGER STATUS
4gi5kwzwlron5y7ekdrnnynm5     swarm-manager00000E   Ready               Active              Leader
5ry73uzy3m4jf8p933civtbar     swarm-manager00000J   Ready               Active              Reachable
hwb5qgfwqtfhko9w4y3lfsc62 *   swarm-manager00000H   Ready               Active              Reachable
qb4oajnqi8tc0wvegkr87ssmi     swarm-manager00000K   Ready               Active              Reachable
vj5ct7afr9u2syptiy3qe8nik     swarm-worker000006    Ready               Active
z9lqn97sub3p2og7kx8ganni4     swarm-worker000005    Ready               Active

docker-diagnose

OK hostname=swarm-manager00000E session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
OK hostname=swarm-manager00000H session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
OK hostname=swarm-manager00000J session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
OK hostname=swarm-manager00000K session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
OK hostname=swarm-worker000005 session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
OK hostname=swarm-worker000006 session=1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX
Done requesting diagnostics.
Your diagnostics session ID is 1506678273-FLLiB0hHe2gg6PtTOE3ygphZafxPqLZX

Are there any known issues with this behaviour? Is there a way to check for this or to re-initialize the plugin? At the moment the only fix we have is to create a new node and delete the old one.
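
The re-initialization we have in mind would be something along these lines (a sketch only, assuming the plugin is installed under the alias cloudstor:azure; check docker plugin ls for the exact name on your nodes):

# Check that the plugin is installed and enabled on the affected node
docker plugin ls

# Force-disable and re-enable the plugin in place
# (disruptive for containers that currently use cloudstor volumes)
docker plugin disable -f cloudstor:azure
docker plugin enable cloudstor:azure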

ddebroy commented 7 years ago

Sounds like the engine bug with certain namespaces: https://github.com/docker/for-aws/issues/94

It should be fixed in the latest versions of Docker.
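
To confirm which engine version each node is actually running, something like the following should work (assuming a reasonably recent CLI):

# Engine version of the local node
docker version --format '{{.Server.Version}}'

# Engine version of every node in the swarm (run on a manager)
docker node ls --format '{{.Hostname}}: {{.EngineVersion}}'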

ddebroy commented 7 years ago

Also this: https://forums.docker.com/t/azure-cloudstor-plugin-share-not-mounting/37752/3

manixx commented 7 years ago

Thanks for the update! We have upgraded our cluster, and I hope it will be stable now.