kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Docker fails to start when using awslogs #4033

Closed jacobwoffenden closed 6 years ago

jacobwoffenden commented 6 years ago

Thanks for submitting an issue! Please fill in as much of the template below as you can.

------------- BUG REPORT TEMPLATE --------------------

  1. What kops version are you running? The command kops version, will display this information. Version 1.8.0
  2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
    Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T19:11:02Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T05:17:43Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  3. What cloud provider are you using? aws
  4. What commands did you run? What is the simplest way to reproduce this issue? Add the following to cluster spec via kops edit cluster ${NAME}:
    additionalPolicies:
      node: |
        [
          {
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": ["*"]
          }
        ]
      master: |
        [
          {
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": ["*"]
          }
        ]
    docker:
      logDriver: awslogs
      logOpt:
      - awslogs-create-group=true
      - awslogs-region=eu-west-2
      - awslogs-group=production
  5. What happened after the commands executed? After applying a rolling update, the first master that is recreated never comes back.
  6. What did you expect to happen? Kops applies awslogs docker opts
  7. Please provide your cluster manifest. Execute kops get --name my.example.com -oyaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
    apiVersion: kops/v1alpha2
    kind: Cluster
    metadata:
      creationTimestamp: 2017-12-10T20:21:01Z
      name: ${CLUSTER_NAME}
    spec:
      additionalPolicies:
        master: |
          [
            {
              "Effect": "Allow",
              "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
              "Resource": ["*"]
            }
          ]
        node: |
          [
            {
              "Effect": "Allow",
              "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
              "Resource": ["*"]
            }
          ]
      api:
        loadBalancer:
          type: Public
      authorization:
        rbac: {}
      channel: stable
      cloudLabels:
        environment: production
      cloudProvider: aws
      configBase: s3://kops-state-store/${CLUSTER_NAME}
      dnsZone: ${CLUSTER_NAME}
      docker:
        logDriver: awslogs
        logOpt:
        - awslogs-create-group=true
        - awslogs-region=eu-west-2
        - awslogs-group=production
      etcdClusters:
      - etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-2a-1
          name: a-1
        - encryptedVolume: true
          instanceGroup: master-eu-west-2b-1
          name: b-1
        - encryptedVolume: true
          instanceGroup: master-eu-west-2a-2
          name: a-2
        name: main
      - etcdMembers:
        - encryptedVolume: true
          instanceGroup: master-eu-west-2a-1
          name: a-1
        - encryptedVolume: true
          instanceGroup: master-eu-west-2b-1
          name: b-1
        - encryptedVolume: true
          instanceGroup: master-eu-west-2a-2
          name: a-2
        name: events
      iam:
        allowContainerRegistry: true
        legacy: false
      kubernetesApiAccess:
      - xx.xx.xx.xx/32
      kubernetesVersion: 1.8.4
      masterInternalName: api.internal.${CLUSTER_NAME}
      masterPublicName: api.production.${CLUSTER_NAME}
      networkCIDR: 10.1.0.0/16
      networking:
        kuberouter: {}
      nonMasqueradeCIDR: 100.64.0.0/10
      sshAccess:
      - xx.xx.xx.xx/32
      subnets:
      - cidr: 10.1.32.0/19
        name: eu-west-2a
        type: Public
        zone: eu-west-2a
      - cidr: 10.1.64.0/19
        name: eu-west-2b
        type: Public
        zone: eu-west-2b
      topology:
        dns:
          type: Public
        masters: public
        nodes: public
  8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here. The flag doesn't produce any additional output.
  9. Anything else do we need to know?
    
    root@ip-10-1-80-216:/home/admin# systemctl status docker
    ● docker.service - Docker Application Container Engine
    Loaded: loaded (/lib/systemd/system/docker.service; enabled)
    Active: activating (auto-restart) (Result: exit-code) since Sun 2017-12-10 20:37:41 UTC; 1s ago
     Docs: https://docs.docker.com
    Process: 2309 ExecStart=/usr/bin/dockerd -H fd:// $DOCKER_OPTS (code=exited, status=1/FAILURE)
    Process: 2305 ExecStartPre=/opt/kubernetes/helpers/docker-prestart (code=exited, status=0/SUCCESS)
    Main PID: 2309 (code=exited, status=1/FAILURE)

    Dec 10 20:37:41 ip-10-1-80-216 systemd[1]: Failed to start Docker Application Container Engine.
    Dec 10 20:37:41 ip-10-1-80-216 systemd[1]: Unit docker.service entered failed state.

    root@ip-10-1-80-216:/home/admin# docker version
    Client:
     Version:      1.13.1
     API version:  1.26
     Go version:   go1.7.5
     Git commit:   092cba3
     Built:        Wed Feb 8 06:36:34 2017
     OS/Arch:      linux/amd64
    error during connect: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.26/version: read unix @->/var/run/docker.sock: read: connection reset by peer

    root@ip-10-1-80-216:/home/admin# cat /etc/sysconfig/docker
    DOCKER_OPTS=--ip-masq=false --iptables=false --log-driver=awslogs --log-level=warn --log-opt=awslogs-create-group=true --log-opt=awslogs-group=production --log-opt=awslogs-region=eu-west-2 --storage-driver=overlay
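For anyone debugging the same failure, the exact daemon error is easier to see by running dockerd in the foreground with the flags from the unit file above, or by reading the journal directly (a debugging sketch; flags and paths are copied from the status output above):

```shell
# Stop the crash-looping unit, then run dockerd in the foreground with the
# same options so the real error prints to the terminal instead of systemd
# swallowing it into "status=1/FAILURE".
systemctl stop docker
/usr/bin/dockerd --log-driver=awslogs \
  --log-opt awslogs-region=eu-west-2 \
  --log-opt awslogs-group=production \
  --log-opt awslogs-create-group=true

# Alternatively, pull the recent daemon output from the journal:
journalctl -u docker --no-pager -n 50
```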

97turbotalon commented 6 years ago

I had the same issue and resolved it by removing `awslogs-create-group=true` and manually creating the CloudWatch log group.
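That workaround can be done up front with the AWS CLI; a minimal sketch, assuming the group name and region from the cluster spec above:

```shell
# Create the CloudWatch Logs group ahead of time so the daemon no longer
# needs the awslogs-create-group=true option (group/region taken from the
# spec above; substitute your own values).
aws logs create-log-group --log-group-name production --region eu-west-2

# Confirm the group exists before applying the rolling update.
aws logs describe-log-groups --log-group-name-prefix production --region eu-west-2
```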

jacobwoffenden commented 6 years ago

thanks @97turbotalon - that did the trick!