Hi,
Over the last 6 months I have encountered more than a few situations where the most stable solution currently seems to be "rebuild the swarm", e.g.
- Loss of SSH after scale set deallocation/reallocation
- Loss of sudo after a scale set reboot
- Unable to upgrade to the latest STABLE version 17.12.0 (with plenty of network fixes), as the upgrade container is delayed until 17.12.1
- Loss of the swarm when restarting dockerd (due to ghost containers)
- Switching from the EDGE to the STABLE channel
- Changing the VM t-shirt size
I don't mind rebuilding the swarm, as it means I can review/refactor/clean up my configuration.
Almost all of my configuration is scripted/documented, so it's not too much effort:
- Create a new swarm from the template
- Assign some node metadata
- Create networks/volumes/secrets
- Deploy the stacks
- Update DNS to the new swarm public IP
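For illustration, the middle steps of that list could be scripted roughly as below. This is only a sketch: the node name, label, network, volume, secret and stack names are all hypothetical placeholders, and the template deployment and DNS update are left out because they depend on the Azure/DNS setup.

```shell
#!/bin/sh
# Sketch of the rebuild steps; every name below is a hypothetical placeholder.
# DRY_RUN=1 (the default) only prints the commands instead of running them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rebuild_swarm() {
  # Step 1 (create the swarm from the template) happens in Azure, outside this script.
  # Step 2: assign node metadata.
  run docker node update --label-add role=db swarm-worker000001
  # Step 3: create networks/volumes/secrets.
  run docker network create --driver overlay --attachable backend
  run docker volume create -d cloudstor:azure appdata
  run docker secret create db_password ./secrets/db_password.txt
  # Step 4: deploy the stacks.
  run docker stack deploy --compose-file stacks/app.yml app
  # Step 5 (DNS update) depends on the DNS provider, so it is not scripted here.
}
```

Calling `rebuild_swarm` prints the plan; `DRY_RUN=0 rebuild_swarm` on a manager node would actually run it.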
The sticking point with a rebuild would be relinking the data from the first swarm to the second swarm.
I could not find guidance on how best to configure Azure to handle a swarm rebuild (rather than a swarm upgrade).
My original swarm used the defaults provided by the template, where the resources all live in the same resource group.
When I rebuild, I would need to preserve the data contained within the storage account RANDOMSTRING123.
Azure Storage Explorer
My first thought would be to create the new swarm and copy the data using Azure Storage Explorer.
- Transfers should be free within the same region
- Storage requirements would be doubled for a short time
- This may only work while the data size is small
Override cloudstor:azure
My second thought would be to create the new swarm and override the default cloudstor:azure plugin with my own, using https://docs.docker.com/docker-for-azure/persistent-data-volumes/#use-a-different-storage-endpoint
- I have used a separate cloudstor:azure instance/storage for backups, and that seems to work okay for short-lived commands
- I am not sure if overriding the default plugin instance is possible/stable/recommended
- There are a few issues on the forums where users are unable to create or recreate the plugin (the error message is similar to "offer expired")
- I am hesitant about overriding anything "default", e.g. what happens if the default plugin instance needs to be changed/reset/locked down as part of a future upgrade?
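As a rough sketch of what the override might look like on a single node, based on the install command in the linked documentation: the plugin image tag, storage account name and key are placeholders that must be supplied, and whether removing the default instance is safe on a given release is exactly the open question above, so treat this as an assumption rather than a recommended procedure.

```shell
#!/bin/sh
# Sketch only: replaces the default plugin instance on ONE node.
# DRY_RUN=1 (the default) prints the commands instead of running them.
# Note that the dry run also prints the storage key.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Arguments (all placeholders supplied by the caller):
#   $1 plugin image (the docker4x/cloudstor tag matching the release)
#   $2 storage account name
#   $3 storage account key
override_default_cloudstor() {
  image="$1"; account="$2"; key="$3"
  # The existing instance has to go before the alias can be reused.
  run docker plugin disable cloudstor:azure
  run docker plugin rm cloudstor:azure
  run docker plugin install --alias cloudstor:azure --grant-all-permissions "$image" \
    CLOUD_PLATFORM=AZURE \
    AZURE_STORAGE_ACCOUNT="$account" \
    AZURE_STORAGE_ACCOUNT_KEY="$key"
}
```

This would have to be repeated on every node, since plugins are installed per node.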
Separate cloudstor:azure
My third thought would be to store the swarm data in a separate 'named/aliased' cloudstor:azure instance, either in the same resource group or possibly in a completely separate one.
- A separate resource group feels better from an isolation perspective, as that would allow me to completely purge the swarm resource group without data loss, no matter what future deployment restrictions are placed on the docker swarm template/resource group
- This works as long as the custom cloudstor:azure plugin instance can always reach into another storage account
Considering how quickly the platform changes, this third option seems the best.
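To make the third option concrete, here is a minimal sketch of creating a volume and a service against an aliased instance. It assumes a second plugin instance has already been installed on every node under a hypothetical alias such as cloudstor:data (via `docker plugin install --alias`) and pointed at the separate storage account; the volume, service and image names are placeholders too.

```shell
#!/bin/sh
# Sketch only: uses an aliased plugin instance instead of the default one.
# DRY_RUN=1 (the default) prints the commands instead of running them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 plugin alias (hypothetical, e.g. cloudstor:data)
# $2 volume name, also used as the file share name
create_data_volume() {
  run docker volume create -d "$1" --opt share="$2" "$2"
}

# $1 plugin alias, $2 volume name, $3 service name, $4 image
create_service_with_data() {
  run docker service create --name "$3" \
    --mount "type=volume,volume-driver=$1,source=$2,destination=/data" \
    "$4"
}
```

For example, `create_data_volume cloudstor:data appdata` followed by `create_service_with_data cloudstor:data appdata web nginx` would keep the data outside the default instance, so purging the swarm resource group should not touch it as long as the storage account lives elsewhere.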
My naive setup of the swarm was:
Swarm v17.09
I would then configure the swarm like this:
Swarm v17.12
However, I would not be allocating any volumes on the (default) cloudstor:azure plugin instance.
Upgrading all my stack templates to use the new plugin instance should be straightforward.
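A minimal sketch of that template change, assuming the stack files declare volumes with an unquoted `driver: cloudstor:azure` line and that the new instance uses a hypothetical alias such as cloudstor:data:

```shell
#!/bin/sh
# Sketch only: prints a copy of a stack file with the volume driver retargeted.
# $1 path to the stack file
# $2 alias of the new plugin instance (hypothetical, e.g. cloudstor:data)
# Only unquoted "driver: cloudstor:azure" lines are rewritten; indentation is kept.
retarget_driver() {
  sed "s|driver:[[:space:]]*cloudstor:azure[[:space:]]*\$|driver: $2|" "$1"
}
```

Writing to stdout rather than editing in place makes it easy to diff the result against the original before committing, e.g. `retarget_driver stacks/app.yml cloudstor:data > stacks/app.new.yml`.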
Some questions would be: