
[Discussion] Swarm rebuild and best way to retain data #52

Open djeeg opened 6 years ago

djeeg commented 6 years ago

Hi,

Over the last 6 months I have encountered more than a few situations where the most stable solution available seems to be "rebuild the swarm".


I don't mind rebuilding the swarm, as it means I can review, refactor, and clean up my configuration.

Almost all of my configuration is scripted/documented, so it's not too much effort; steps 2-4 map to plain docker CLI commands, as sketched after this list:

  1. Create a new swarm from the template
  2. Assign some node metadata
  3. Create networks/volumes/secrets
  4. Deploy the stacks
  5. Update DNS to the new swarm public IP
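
For reference, a minimal shell sketch of steps 2-4 (the node, network, volume, secret, and stack names here are placeholders, not from my actual setup):

# 2. Assign node metadata (labels); "worker1" is a placeholder node hostname
docker node update --label-add storage=ssd worker1

# 3. Create networks/volumes/secrets; names are placeholders
docker network create --driver overlay appnet
docker volume create --driver "cloudstor:azure" appdata
printf 'changeme' | docker secret create app_secret -

# 4. Deploy the stacks from their compose files
docker stack deploy -c stack.yml mystack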

The sticking point with a rebuild would be relinking the data from the first swarm to the second swarm. I could not see any guidance on how best to configure Azure to handle a swarm rebuild (rather than a swarm upgrade).

My naive setup of the swarm was Swarm v17.09, which used the defaults provided by the template, where the resources all live in the same resource group.

When I rebuild, I would need to preserve the data contained within the storage account RANDOMSTRING123.
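
For anyone else hunting for that account, it can be located by listing the resources the template created; a sketch using the az CLI (the resource group name docker4azure is a placeholder):

# List everything the template created, including the RANDOMSTRING123 storage account
az resource list --resource-group docker4azure --output table

# Or list only the storage accounts in that resource group
az storage account list --resource-group docker4azure --output table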

Azure Storage Explorer

My first thought would be to create the new swarm and copy the data using Azure Storage Explorer (or scripted, as sketched below).

  - Transfers should be free within the same region.
  - Storage requirements would be doubled for a short time.
  - This may only work while the data size is small.
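
A scripted alternative to the GUI: assuming cloudstor keeps its volume data in an Azure file share inside that storage account, the same copy could be done with azcopy. The account names, share name, and SAS tokens below are placeholders:

# Copy the cloudstor file share from the old swarm's storage account to the new one
# (azcopy v10 syntax; both URLs need SAS tokens with list/read and write permissions)
azcopy copy \
  "https://OLDACCOUNT.file.core.windows.net/SHARENAME?SASTOKEN" \
  "https://NEWACCOUNT.file.core.windows.net/SHARENAME?SASTOKEN" \
  --recursive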

Override cloudstor:azure

My second thought would be to create the new swarm and override the default cloudstor:azure plugin with my own, using https://docs.docker.com/docker-for-azure/persistent-data-volumes/#use-a-different-storage-endpoint (I have used a separate cloudstor:azure instance/storage for backups and that seems to work okay for short-lived commands).

  - Not sure if overriding the default plugin instance is possible/stable/recommended.
  - There are a few issues on the forums where users are unable to re/create the plugin (the error message is similar to "offer expired").
  - I am hesitant about overriding anything "default", e.g. what happens if the default plugin instance needs to be changed/reset/locked-down as part of a future upgrade?
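
If that route were taken, my reading of that docs page is that reconfiguring the installed plugin would look roughly like this. The AZURE_STORAGE_ACCOUNT / AZURE_STORAGE_ACCOUNT_KEY setting names are from the cloudstor docs, the key value is a placeholder, and I have not verified this survives a template upgrade:

# Disable the default plugin instance before changing its settings
docker plugin disable cloudstor:azure

# Point it at the old swarm's storage account instead of the generated one
docker plugin set cloudstor:azure AZURE_STORAGE_ACCOUNT=RANDOMSTRING123
docker plugin set cloudstor:azure AZURE_STORAGE_ACCOUNT_KEY=PLACEHOLDERKEY

docker plugin enable cloudstor:azure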

Separate cloudstor:azure

My third thought would be to store the swarm data in a separate 'named/aliased' cloudstor:azure instance (installed as sketched below), either in the same or possibly a completely separate resource group. A separate resource group feels better from an isolation perspective, as it would allow me to completely purge the swarm resource group without data loss, no matter what future deployment restrictions are made on the docker swarm template/resource group, as long as the custom cloudstor:azure plugin instance can always reach into the other storage account. Considering how quickly the platform changes, this third option seems the best.
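
Installing a second plugin instance under an alias would presumably look something like this (the docker4x/cloudstor tag, storage account name, and key are placeholders; I have only tested this pattern for short-lived backup commands):

# Install a second cloudstor instance under the alias referenced in the stack files
docker plugin install --alias "cloudstor:safeazure" --grant-all-permissions \
  docker4x/cloudstor:17.12.0-ce-azure1 \
  CLOUD_PLATFORM=AZURE \
  AZURE_STORAGE_ACCOUNT=SAFESTORAGEACCOUNT \
  AZURE_STORAGE_ACCOUNT_KEY=PLACEHOLDERKEY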

I would then configure the swarm like this: Swarm v17.12, but without allocating any volumes on the default cloudstor:azure plugin instance.

Updating all my stack templates to use the new plugin instance should be straightforward:

volumes:
  volsomename:
    name: 'somename'
    driver: cloudstor:safeazure
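
And the matching imperative form, for volumes created outside a stack file (the volume name is a placeholder):

# Pre-create the named volume on the aliased plugin instance
docker volume create --driver "cloudstor:safeazure" somename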

Some questions would be:

  1. Are there any online resources recommending the best approach?
  2. Are there downsides to the third approach?
  3. Would this make upgrading harder in the future?
  4. With the upcoming changes for virtual machine scale sets and attached storage, would a separate resource group be better or worse?
  5. Would there be an extra performance hit for using a storage account in a separate resource group?