Closed: sandys closed this issue 6 years ago.
hi @sandys, I am currently implementing exactly your wish 😃 and have been for a month, to be honest.
Take a look at the swarm-approach2 branch of this repo.
ATM I expect that a swarm is already initialized with `docker swarm init` and that the daemon is already secured with TLS certificates. I do this manually with the `remote-exec` provisioner.
It is already possible to use `docker compose` and `stack`.
IMHO we then have to implement the compose features (spinning up a network, linking containers) here in terraform. Or am I completely wrong, or do you see a better way?
I will take a look at the `kubernetes` provider and see how they solved things like setting up a cluster there. Let's see what I can adapt.
BTW my first approach was to implement features like `swarm init` and `join` in the provider, but then I had to add multiple instances to the `docker` provider, and it was a pain to handle all of that in the provider.
@mavogel this is awesome! Can I make one request - the whole aspect of docker swarm init and join is the single most important thing that only something like terraform can do... because it is already aware of the multiple nodes/machines that are being orchestrated. In fact, I would argue that update for a swarm is not entirely needed in the beginning. You can destroy and recreate.
Docker Compose is a single-machine stack - is there a particular reason you want to support it? Because Swarm does the same thing and more. As long as you have the `docker stack` commands working, everyone will be happy. NOTE: I don't understand what you mean by supporting "compose" - are you talking about `docker compose` or the `docker-compose.yml` used by `docker stack`?
Again I would like to reiterate - swarm init and join are the most important pieces in this. Writing multiple instances is not an issue!
Handling `swarm init` and `join` is possible in terraform, but it is not really nice. The devil is in the details.
Let me explain why I chose to expect the swarm to be already initialized:
For the docker provider, one docker host is expected, and all docker commands will be executed on this host:

```hcl
# default aka the bootstrap node which initializes the swarm
provider "docker" {
  host = "tcp://<docker-daemon-ip>:2376/"
}
```
Now imagine I want multiple docker hosts, which is the scenario in a swarm; then I need multiple docker providers. Read here how terraform can handle this.
```hcl
provider "docker" {
  alias = "worker_1"
  host  = "tcp://<docker-daemon-ip-worker-1>:2376/"
}

provider "docker" {
  alias = "manager_1"
  host  = "tcp://<docker-daemon-ip-manager-1>:2376/"
}

# and so on...

# now ref to each alias
resource "docker_swarm_node" "bootstrap_node" {
  is_bootstrap = true # which defaults to false
}

resource "docker_swarm_node" "worker_1" {
  provider = "docker.worker_1"
  token    = "${docker_swarm_node.bootstrap_node.tokenworker}"
}

resource "docker_swarm_node" "manager_1" {
  provider = "docker.manager_1"
  token    = "${docker_swarm_node.bootstrap_node.tokenmanager}"
}

# if I later want to ramp up a service with 3 replicas, I can do so.
# It just uses the daemon of the bootstrap node, which distributes the replicas to the swarm
resource "docker_service" "service" {
  name     = "my-service"
  image    = "nginx"
  replicas = "3"
}
```
Then several questions came up:

- Every time a node joins, you have to add a new `provider` with its IP. This is a drawback if you have versioned modules. Create a new tag each time an instance is added to the swarm?
- Does it work with autoscaling groups? => No

So I tried it with multiple docker hosts from the start, which can change variably:
```hcl
provider "docker" {
  # checks if all possible docker hosts are pingable
  hosts = ["${formatlist("tcp://%s:2376/", var.external_ips)}"]
}
```
But it became way too complex internally... see here.
So I decided to handle the creation of the swarm outside of the provider with terraform's `remote-exec` provisioner:
```sh
$ sudo scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ~/${data.terraform_remote_state.iam.key_pair_id} ${var.user}@${aws_instance.swarm-bootstrap-manager.private_ip}:~/tokenmanager .
$ sudo docker swarm join --token $(cat ~/tokenmanager) ${aws_instance.swarm-bootstrap-manager.private_ip}:2377
```
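Wired into terraform, commands like these would sit in a `remote-exec` provisioner on the joining instance; a rough sketch (resource names, variables, and the token path are illustrative, not the exact setup):

```hcl
resource "aws_instance" "swarm-worker" {
  # ... ami, instance_type, key_name, etc. omitted ...

  provisioner "remote-exec" {
    inline = [
      # fetch the join token from the bootstrap manager (path is an example)
      "sudo scp -o StrictHostKeyChecking=no -i ~/${var.key_pair_id} ${var.user}@${aws_instance.swarm-bootstrap-manager.private_ip}:~/tokenworker .",
      # join the swarm via the bootstrap manager's daemon
      "sudo docker swarm join --token $(cat ~/tokenworker) ${aws_instance.swarm-bootstrap-manager.private_ip}:2377",
    ]
  }
}
```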
In my case, with the swarm on AWS and using `terraform.tfvars` files, I can easily add and remove nodes from the swarm by changing the number of instances in one place:

```hcl
managers = "3"
workers  = "5"
```

which changes the `count` variable in each `aws_instance`.
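Wired up, those tfvars values could drive the instance counts roughly like this (variable and resource names are illustrative):

```hcl
variable "managers" {
  default = "3"
}

variable "workers" {
  default = "5"
}

resource "aws_instance" "manager" {
  count = "${var.managers}"
  # ... ami, instance_type, etc. ...
}

resource "aws_instance" "worker" {
  count = "${var.workers}"
  # ... ami, instance_type, etc. ...
}
```

Scaling the swarm then becomes a one-line change in `terraform.tfvars` followed by `terraform apply`.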
BTW @catsby, could you give me your opinion on this as well? I'd be really curious what you think about Approach 1. Does it make sense to additionally implement this functionality?
Regarding `docker-compose`: I don't fully understand how it works, but IMHO compose combines basic docker commands. Imagine your `docker-compose.yml` file looks as follows:
```yaml
backend:
  image: redis:3
  restart: always
frontend:
  build: commander
  links:
    - backend:redis
  ports:
    - 8081:8081
  environment:
    - VAR1=value
  restart: always
```
In terraform you'd implement all the steps manually by packing both containers into a common network, where the backend would be available as `redis` in the `frontend` container. So you'd basically be reimplementing the features of compose. This is probably why the underlying go-dockerclient does not implement `compose`/`stack` features.
Update: moby-32781 will implement `compose`/`stack` functionality on the daemon side.
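Such a manual translation could look roughly like the sketch below; attribute names follow the provider's documented `docker_network` and `docker_container` resources and may differ between provider versions, and all names here are illustrative:

```hcl
resource "docker_network" "app" {
  name = "app-network"
}

resource "docker_container" "backend" {
  name    = "backend"
  image   = "redis:3"
  restart = "always"

  networks_advanced {
    name    = "${docker_network.app.name}"
    aliases = ["redis"] # makes the backend reachable as 'redis'
  }
}

resource "docker_container" "frontend" {
  name  = "frontend"
  image = "commander" # compose's 'build:' has no direct terraform equivalent
  env   = ["VAR1=value"]

  ports {
    internal = 8081
    external = 8081
  }

  networks_advanced {
    name = "${docker_network.app.name}"
  }
}
```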
Let's wait for Clint's opinion and I'll add tests in the meantime.
@mavogel thank you for the detailed reply.
So Approach 3 is orthogonal to 1 or 2, since it would work regardless. I would say go ahead with the functionality of 3 - it does not depend on whether you choose 1 or 2. I hope you agree with that?
Approach 2 will probably break existing users.
Approach 1 seems the safest. However, I think instead of tags you might need to create new provider types. This will allow you to have a list of masters (in case of a multi-master swarm) and a list of workers (P.S. please do account for a worker and master being on the SAME node. We use this for development).
Merged PR #40. Once the provider is released, please test whether it fits your needs. I'll keep you updated once the release has happened. There is some infra CI stuff left to do...
This is super cool. Thanks !!!
Thanks for getting this merged!!! I have built terraform-provider-docker locally and tried to use the `docker_service` resource:

```hcl
resource "docker_service" "service" {
  task_spec = [ "nginx" ]
  name      = "my-service"
  image     = "nginx"
  replicas  = "3"
}
```
Using the above resource, I was getting the below errors in `plan`:

```
Error: docker_service.service: "task_spec": required field is not set
Error: docker_service.service: : invalid or unknown key: image
Error: docker_service.service: : invalid or unknown key: replicas
```

Then I changed it to a very simple one:

```hcl
resource "docker_service" "service" {
  task_spec = [ "nginx" ]
  name      = "my-service"
}
```

Now I'm getting the below error:

```
Error: docker_service.service: task_spec.0: expected object, got string
```
Can you please help me find one consolidated user guide for this new `docker_service` resource?
Thanks a lot,
@FortuneLenovo no worries. The documentation on the terraform website will be updated once the provider is released. Until then, take a look at the tests for how to configure a docker service. Especially the `full` configuration can be helpful, where I added all the possible configuration values: https://github.com/terraform-providers/terraform-provider-docker/blob/master/docker/resource_docker_service_test.go#L127
HTH :)
In the below code under the `docker_service` resource:

```hcl
restart_policy {
  condition    = "on-failure"
  delay        = "3s"
  max_attempts = 4
  window       = "10s"
}
```

if `condition` is changed to other values like "always", "unless-stopped", or "no", it throws an error. How do I update it?
thanks.
@FortuneLenovo according to the Docker API 1.32, which is currently implemented, the only valid values for `condition` are `none`, `on-failure`, or `any`.
Can you provide me a more detailed error? How was the value changed to `always`?
thank you, that answers my question actually.
Hi,
Now I am able to successfully create a `docker_service`, also with replicas and a restart condition. Next I am looking at `docker_secret` and `docker_config`, to create these and use them in my `docker_service`.
After going through `docker_service_test.go` (and the same for config), I learned how to create and use them. When I try a very simple string in `data` it works, but what I actually want is to pass a file to the config or secret (which is fairly easy to do manually without terraform).
Below is my code with a simple string in the `data` section:

```hcl
resource "docker_secret" "user-pass" {
  name = "user"
  data = "pass"
}

resource "docker_config" "site-conf" {
  name = "site-conf"
  data = "site"
}
```
Once I tried either a long data string or tried to pass a file, I got the below error:

```
Error: docker_config.site-conf: "data" is not base64 decodeable
```

thanks in advance !!!
Hi @FortuneLenovo ,
Docker needs the data of configs and secrets to be in base64 format. You could use terraform's interpolation function:

```hcl
resource "docker_config" "site-conf" {
  name = "site-conf"
  data = "${base64encode("site")}"
}
```

HTH
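To load the data from a file instead of an inline string, `base64encode` can be combined with the `file` interpolation function; a sketch, assuming a `site.conf` exists next to the configuration:

```hcl
resource "docker_config" "site-conf" {
  name = "site-conf"
  # read the file and base64-encode its contents (file path is an example)
  data = "${base64encode(file("${path.module}/site.conf"))}"
}
```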
Thank you for the help on the secret key; it worked. When we configure a restart policy in docker swarm mode, it is not reflected in `docker inspect`, meaning it is not attached to the docker container, whereas when we run a container individually with a restart policy it does show up in inspect. Basically, we want a working restart policy even when the docker swarm master is detached, so that containers heal themselves. Is it possible to achieve this?
Thanks in advance!!!
@FortuneLenovo glad to hear that it helped :)
I assume you refer to the containers of docker services in swarm mode, right? Because then the restart policy is attached to the service and not to each container.
Well, I made a little POC and inspected the services. Note that I used the migration branch of https://github.com/terraform-providers/terraform-provider-docker/pull/70 to build the binary:

```sh
$ go build -o terraform-provider-docker_v1.0.0
# move it to the local plugin dir because it is not released yet to https://releases.hashicorp.com/
# adapt the directory '~/.terraform.d/plugins/darwin_amd64' accordingly
$ mv terraform-provider-docker_v1.0.0 ~/.terraform.d/plugins/darwin_amd64
$ docker service create --name redis --restart-condition=on-failure --restart-delay=3s --restart-max-attempts=4 --restart-window=10s redis:3.0.6
```

`main.tf`:
```hcl
provider "docker" {
  version = "~> 1.0.0"
}

resource "docker_service" "foo" {
  name = "redis-terraform"

  task_spec {
    container_spec {
      image = "redis:3.0.6"
    }

    restart_policy {
      condition    = "on-failure"
      delay        = "3s"
      max_attempts = 4
      window       = "10s"
    }
  }
}
```
```sh
$ terraform init
$ terraform apply
$ docker service inspect redis
$ docker service inspect redis-terraform
```

gives me:
```json
[
    {
        "ID": "zw9m7qykyv7pjketotmdvogwo",
        "Version": {
            "Index": 3945
        },
        "CreatedAt": "2018-06-06T07:30:32.4534158Z",
        "UpdatedAt": "2018-06-06T07:30:32.4534158Z",
        "Spec": {
            "Name": "redis-terraform",
            "Labels": {},
            "TaskTemplate": {
                "ContainerSpec": {
                    "Image": "redis:3.0.6",
                    "StopGracePeriod": 0,
                    "Healthcheck": {},
                    "DNSConfig": {},
                    "Isolation": "default"
                },
                "Resources": {},
                "RestartPolicy": {
                    "Condition": "on-failure",
                    "Delay": 3000000000,
                    "MaxAttempts": 4,
                    "Window": 10000000000
                },
                "Placement": {},
                "ForceUpdate": 0,
                "Runtime": "container"
            },
            "Mode": {
                "Replicated": {
                    "Replicas": 1
                }
            },
            "UpdateConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "RollbackConfig": {
                "Parallelism": 1,
                "FailureAction": "pause",
                "Monitor": 5000000000,
                "MaxFailureRatio": 0,
                "Order": "stop-first"
            },
            "EndpointSpec": {
                "Mode": "vip"
            }
        },
        "Endpoint": {
            "Spec": {}
        }
    }
]
```
So I could not reproduce your error. Or did I get your question wrong, and my assumption was off? Manu
Thanks for the quick response. I understood the concept that if we are running in swarm mode, the restart_policy applies to the service, not to the container. I just want to know what will happen to failed containers if, by chance, the swarm master node (the only master node) disconnects from the worker nodes.
In that master-disconnected scenario, will a worker node try to restart a failed container according to the restart_policy?
@FortuneLenovo this now goes deep into docker itself. Regarding your question: workers should apply the restart policy even if the swarm has no leader and/or has lost the quorum. Here are more docs about this topic. HTH
@mavogel do you have any idea when this is going to be fully released, approximately?
@Crapworks I hope by the end of next week... Still waiting for a review of #70, but everyone seems to be busy due to HashiDays... I'll see them there and ask f2f :)
@mavogel Cool! Thanks for the heads up! This is awesome work, and I'm looking forward to using it in my next project!
@mavogel there is a little error in `mapTypeMapValsToStringSlice` which causes empty strings to be passed to the docker service env. Fix below:

```diff
diff --git a/docker/resource_docker_container_funcs.go b/docker/resource_docker_container_funcs.go
index 0ee3690..a9069ef 100644
--- a/docker/resource_docker_container_funcs.go
+++ b/docker/resource_docker_container_funcs.go
@@ -387,7 +387,7 @@ func mapTypeMapValsToString(typeMap map[string]interface{}) map[string]string {
 // mapTypeMapValsToStringSlice maps a map to a slice with '=': e.g. foo = "bar" -> 'foo=bar'
 func mapTypeMapValsToStringSlice(typeMap map[string]interface{}) []string {
-	mapped := make([]string, len(typeMap))
+	mapped := make([]string, 0)
 	for k, v := range typeMap {
 		mapped = append(mapped, k+"="+v.(string))
 	}
```
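The bug is the classic make-with-length-plus-append pattern in Go: `make([]string, n)` pre-fills the slice with `n` empty strings, and `append` then grows it past them. A minimal standalone demonstration (function names here are illustrative, not the provider's):

```go
package main

import "fmt"

// buggy pre-sizes the slice with len(typeMap) zero-valued entries,
// then append grows it further, leaving empty strings at the front.
func buggy(typeMap map[string]interface{}) []string {
	mapped := make([]string, len(typeMap))
	for k, v := range typeMap {
		mapped = append(mapped, k+"="+v.(string))
	}
	return mapped
}

// fixed starts at length 0 (with an optional capacity hint), so append
// only adds the real key=value pairs.
func fixed(typeMap map[string]interface{}) []string {
	mapped := make([]string, 0, len(typeMap))
	for k, v := range typeMap {
		mapped = append(mapped, k+"="+v.(string))
	}
	return mapped
}

func main() {
	m := map[string]interface{}{"FOO": "bar"}
	fmt.Printf("%q\n", buggy(m)) // ["" "FOO=bar"]
	fmt.Printf("%q\n", fixed(m)) // ["FOO=bar"]
}
```

The stray leading `""` entries are exactly the empty env vars the service ended up with.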
@kristerr thank you for pointing out this bug. It will be addressed in #51 with tests for the next minor release
Version 1.0.0 got released: https://github.com/terraform-providers/terraform-provider-docker/issues/29#issuecomment-400296076 :) Please try it out and give me feedback/issues/bugs. Happy to fix your stuff, and also happy if it just works :)
I have already checked with the latest version of the code; it behaves the same. Waiting for you to test at your end, please.
Thanks,
I also tested it (https://github.com/terraform-providers/terraform-provider-docker/issues/29#issuecomment-400296076) just now, and it continues to add empty "" entries to env.
hi, if we compare Kubernetes provider support (https://www.terraform.io/docs/providers/kubernetes/index.html) with Docker support (https://www.terraform.io/docs/providers/docker/index.html), we see that the Docker provider supports no Swarm features.
In fact, most people who work with swarm are forced to write it "inline". Is there any chance we can get support for primitives like swarm init, join, and service create/delete/update? That would be super awesome. If we were able to do something like https://github.com/docker/docker.github.io/blob/master/swarm/configure-tls.md using Terraform, that would be mind-blowing.
A lot of us use Docker Swarm in production, and this is one of the reasons we are delaying adopting terraform. If these primitives get supported in Terraform, many of us would use Terraform straight away and give up the docker-compose.yml files.