ClusterHQ / flocker

Container data volume manager for your Dockerized application
https://clusterhq.com
Apache License 2.0
3.39k stars 290 forks

AWS backend attaches created volume to random swarm node #2654

Open nikolai-derzhak-distillery opened 8 years ago

nikolai-derzhak-distillery commented 8 years ago

Our deployment script does:

  1. docker volume create # to pass size and profile
  2. docker run -v volume:path
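For context, the two steps look roughly like this (the volume name, size, profile, node name, and image are placeholders; the Flocker plugin accepts `size` and `profile` as driver opts):

```shell
# Step 1: create the volume, passing an explicit size and storage profile
# ("myvol", 50GB, and "gold" are placeholder values).
docker volume create -d flocker \
    --name myvol \
    -o size=50GB \
    -o profile=gold

# Step 2: run the container against that volume, pinned to a node via a
# Swarm scheduling constraint ("es-node-1" is a placeholder node name).
docker run -d \
    -e constraint:node==es-node-1 \
    -v myvol:/usr/share/elasticsearch/data \
    elasticsearch
```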

So after step 1 we get a new EBS volume attached to a random node in the swarm. Step 2 then forces Flocker to detach the new volume and re-attach it to the specific node selected by `constraint:node==node_name`.

Can you advise me how to avoid these extra operations? They are time-consuming, AWS can sometimes fail to detach, and Flocker can create extra volumes due to the confusion.

I believe the best approach is to keep the created volume detached until there is a specific node to attach it to.

wallnerryan commented 8 years ago

Hi @nderzhak,

One way you could do this is by

This is not ideal, but it would work. Right now we always attach to a swarm node; I will bring this back to our development team to discuss. I will note that if the volume is detached, you are always going to need an "attach" call to AWS, whereas if it is already attached, Swarm can choose the node it is attached to. Either way you are calling operations against AWS.

Per your comments ("time consuming and AWS can fail to detach sometimes, also flocker can create some extra volumes due to confusion"):

We are aware of the eventual consistency of AWS. Flocker itself is built to withstand some of the issues that arise in AWS, such as failures to create or detach. Have you had incidents where Flocker created multiple volumes or volumes failed to detach? I would be interested in more detail on these scenarios to see if we can help.

wallnerryan commented 8 years ago

Alternatively, you could use `docker run --volume-driver=flocker -v myvol:/mountpoint`, and the volume will be created where the container is run, instead of separately with `docker volume create` followed by `docker run`.
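A sketch of the create-on-run pattern (volume name, mountpoint, and image are placeholders):

```shell
# The Flocker plugin provisions "myvol" on whichever node Swarm schedules
# the container, so no detach/re-attach round-trip is needed.
docker run -d \
    --volume-driver=flocker \
    -v myvol:/mountpoint \
    redis
```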

Or, using flockerctl, you can specify where the volume is attached, then run `docker run` with a constraint for the same node.
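Roughly like this (the node name, size, and exact flockerctl flags are illustrative; check `flockerctl --help` for the precise options):

```shell
# Create the dataset on a specific node up front...
flockerctl create --node es-node-1 --size 50Gb --metadata "name=myvol"

# ...then schedule the container on that same node, so the volume is
# already attached where the container lands.
docker run -d \
    -e constraint:node==es-node-1 \
    --volume-driver=flocker \
    -v myvol:/mountpoint \
    redis
```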

nikolai-derzhak-distillery commented 8 years ago

Hi Ryan,

I am aware that `docker run --volume-driver=flocker -v myvol:/mountpoint` will create a new volume, but I am not sure how to set the size and profile (EBS type) using this method.

Yes. I hit one situation where a volume got stuck in the detaching process and the docker command (API call) timed out, so I had to repeat the step.

The other case was when, during the (`docker volume create`; `docker run -v`) steps, one of the volumes was duplicated.

So there were two volumes with the same name and id (I checked in curren*.json on the control node).

One was attached as expected and the second was detached (orphaned, you know).

So I removed the extra volume to save money (we use PIOPS/gp2 1T volumes in production for an elasticsearch cluster of 25 nodes).

I understand the tradeoff of keeping a volume detached on creation, but it seems like the best approach when the system does not know where the volume will be used. The chance that we hit the same node the volume was randomly attached to on creation is 1/N (where N is the number of nodes). So we will end up with extra detach/attach operations on any cluster bigger than one node :) with pretty high probability.
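To put a number on it: the probability of needing the extra detach/attach is (N-1)/N, which for a 25-node cluster is:

```shell
# Probability that the randomly chosen attach node differs from the node
# the container is eventually scheduled on, for an N-node cluster.
N=25
awk -v n="$N" 'BEGIN { printf "extra ops probability: %.0f%%\n", 100 * (n - 1) / n }'
# With N=25 this prints "extra ops probability: 96%".
```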

So we can file this ticket as an improvement request.

Regards, Nikolai

wallnerryan commented 8 years ago

Thanks Nikolai!

Agreed, we will file it as that. Just as an update: we mainly attach the volume because we create a filesystem on it when it is created. If instead we created it and left it detached, we would need to create that filesystem when a container first used it. This is possible, but attaching at creation was the initial design decision.
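The constraint here is that making a filesystem requires the block device to be visible on some node, which is why the volume is attached immediately at creation. Illustrated on a loopback file rather than a real EBS device (the ext4 choice and paths are assumptions for the sketch, not Flocker's actual code path):

```shell
# Stand-in for a freshly created EBS volume: a sparse 64M file.
truncate -s 64M /tmp/fake-ebs.img

# The mkfs step is only possible once the "device" is locally reachable,
# i.e. after attach. -F forces mkfs to accept a regular file.
mkfs.ext4 -q -F /tmp/fake-ebs.img
```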

Also, we are aware of the race condition where, rarely, you may end up with duplicate volumes. Currently it is listed as "working on" in our JIRA, so hopefully there will be a fix soon.

From your description, is it fair to say you are using Flocker profiles in production for your elasticsearch cluster? Or are you still exploring Flocker?

Thanks for your feedback. Ryan

nikolai-derzhak-distillery commented 8 years ago

I see. You need to format it as well. Makes sense. I am OK with that too.

Glad you are aware of the race condition and have a ticket for it too :)

We have prepared scripts to spin up a production elasticsearch cluster using Swarm + Flocker as the volume driver. It consists of a custom elasticsearch Docker image plus a couple of parametrised bash scripts to create nodes, generate and install Flocker keys and services, then deploy and configure the elasticsearch containers.

This week we will fill the new cluster with data and put it under production traffic load to see how it goes.

What I really like about Flocker is its support for different cloud providers/platforms and its AWS backoff/retries (so we do not hit the AWS RateLimit like we did with rex-ray).

Great product for sure !

wallnerryan commented 8 years ago

@nderzhak Thanks for the feedback, and glad you are enjoying it. Please continue to reach out to us in any way we can help during your move to production, or with Flocker in general. Feel free to also ping us on freenode #clusterhq or at support@clusterhq.com.

@nderzhak Also, in the meantime, if you want to send a quick note to me (ryan.wallner@clusterhq.com), I would love to ask a few questions about your use case and don't want it to eat up the GitHub issue comments :)

Leaving this ticket open for now to see if anyone else in the community would like to see the same "volumes detached" feature.

R