docker-archive / for-aws

Nodes not Staying Up in 17.06.01 Update #100

Open RehanSaeed opened 7 years ago

RehanSaeed commented 7 years ago

Expected behavior

Run the latest 17.06.01 template with an existing VPC and expect the nodes to stay up.

Actual behavior

Nodes keep starting successfully and then terminating, over and over. I switched the auto-scaling group to the EC2 health check type to work around this problem.

Today I noticed that one of my nodes has gone down. The AWS console tells me it's running but I can no longer SSH into it and docker node ls tells me it no longer exists.
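
A node that has dropped out of the swarm can be spotted by comparing the `docker node ls` listing against the hostnames you expect. The following is a minimal sketch, not part of the template: it assumes a Docker CLI whose `docker node ls` accepts `--format`, and the function names and the tab-separated layout are made up for illustration.

```python
import subprocess

# Format string for `docker node ls`. The field names are Docker CLI template
# placeholders; the tab-separated layout is a choice made for this sketch.
NODE_FORMAT = "{{.Hostname}}\t{{.Status}}\t{{.ManagerStatus}}"


def parse_node_listing(listing):
    """Turn tab-separated listing text into {hostname: (status, manager_status)}."""
    nodes = {}
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        hostname = fields[0].strip()
        status = fields[1].strip() if len(fields) > 1 else ""
        manager = fields[2].strip() if len(fields) > 2 else ""  # empty for workers
        nodes[hostname] = (status, manager)
    return nodes


def find_problem_nodes(listing, expected_hostnames):
    """Return (missing, unhealthy) hostname lists, each sorted alphabetically.

    missing:   expected hostnames that do not appear in the listing at all.
    unhealthy: listed nodes whose status is not "Ready" or whose manager
               status is "Unreachable".
    """
    nodes = parse_node_listing(listing)
    missing = sorted(set(expected_hostnames) - set(nodes))
    unhealthy = sorted(
        name
        for name, (status, manager) in nodes.items()
        if status != "Ready" or manager == "Unreachable"
    )
    return missing, unhealthy


def list_nodes():
    """Run `docker node ls` on a swarm manager and return its raw output."""
    result = subprocess.run(
        ["docker", "node", "ls", "--format", NODE_FORMAT],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a manager, `find_problem_nodes(list_nodes(), expected)` returns the expected hostnames that are absent from the listing, plus the listed nodes that are not `Ready` or whose manager status is `Unreachable`.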

Information

OK hostname=ip-10-2-1-141-bridgeinternationalacademies-com session=1505401254-T55CubZ8DlX43fK2j48s67qb7sTcVn6y
OK hostname=ip-10-2-2-46-bridgeinternationalacademies-com session=1505401254-T55CubZ8DlX43fK2j48s67qb7sTcVn6y
Done requesting diagnostics.
Your diagnostics session ID is 1505401254-T55CubZ8DlX43fK2j48s67qb7sTcVn6y
Please provide this session ID to the maintainer debugging your issue.

Steps to reproduce the behavior

  1. Start a new cluster with an existing VPC.
  2. Fail

kencochrane commented 7 years ago

@RehanSaeed Since you are using your existing VPC, have you made sure that your security groups allow the load balancer to connect to the backend nodes? It seems to be failing because the ELB health check can't reach the nodes.

Can you log in to the nodes after you switch to the EC2 health check, and if so, do they seem stable?

RehanSaeed commented 7 years ago

I'm using my own ALB and don't use the ELB v1, but that's the only thing I'm doing differently from the default instantiation of the template. I added 'Target Groups' for the ALB to the auto-scaling group, and that might be what is causing the first issue. (If an auto-scaling group has both a classic load balancer and security groups set up, which load balancer health check does it use?)

The second issue still stands after I switched to the EC2 health check type. I was not able to log in to one of my nodes, even though it was running perfectly fine two days ago when I last logged in to it.

kencochrane commented 7 years ago

@RehanSaeed I'm not sure which load balancer health check it would use, because you made changes to the template; it would depend on what you have put in the template.
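
For reference, the AWS Auto Scaling documentation describes the rule roughly as follows: with the health check type set to ELB, an instance must pass the EC2 status checks and be reported healthy by every attached Classic Load Balancer and target group; with the type set to EC2, load balancer reports are ignored. The sketch below only models that rule in plain Python. The function name and the shape of `lb_reports` are assumptions for illustration, and it does not call any AWS API.

```python
def asg_considers_healthy(health_check_type, ec2_status_ok, lb_reports):
    """Model whether an Auto Scaling group would treat an instance as healthy.

    health_check_type: "EC2" or "ELB", as configured on the group.
    ec2_status_ok:     True if the EC2 status checks pass.
    lb_reports:        hypothetical dict mapping each attached load balancer
                       or target group name to True (healthy) or False.
    """
    if not ec2_status_ok:
        # EC2 status checks apply regardless of the health check type.
        return False
    if health_check_type == "EC2":
        # Load balancer health is not consulted at all.
        return True
    if health_check_type == "ELB":
        # Every attached load balancer / target group must report healthy.
        return all(lb_reports.values())
    raise ValueError("unknown health check type: %r" % (health_check_type,))
```

Under this model a single failing target group is enough to get an instance replaced when the type is ELB, which would be consistent with nodes being terminated shortly after they start.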

As for the second issue, is it just one node, or more than one? If you check the EC2 console logs in the AWS dashboard, do they show anything?
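
One way to go through a long console log is to scan it for messages that often accompany a node becoming unresponsive. This is a rough sketch: the pattern list is an assumption chosen for illustration, not an official or complete set, and an empty result does not mean the node is healthy.

```python
import re

# Substrings that commonly show up when a Linux host runs out of memory or
# disk, or when the kernel crashes. Assumed for this sketch; extend as needed.
TROUBLE_PATTERNS = [
    r"Out of memory",
    r"oom-killer",
    r"Kernel panic",
    r"I/O error",
    r"No space left on device",
]
_TROUBLE_RE = re.compile("|".join(TROUBLE_PATTERNS), re.IGNORECASE)


def find_trouble_lines(console_log):
    """Return (line_number, text) pairs for lines matching a trouble pattern.

    Line numbers are 1-based and refer to the lines of `console_log`.
    """
    return [
        (number, line.rstrip())
        for number, line in enumerate(console_log.splitlines(), start=1)
        if _TROUBLE_RE.search(line)
    ]
```

Pasting the console output into a string and passing it to `find_trouble_lines` gives a short list of lines to look at first.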

RehanSaeed commented 7 years ago

Another node went down this morning. I seem to get one going down every day. docker node ls tells me it's unreachable. Trying to SSH into it gives me a connection refused error. I've already posted the output from docker-diagnose. Here are the AWS console logs from that node:

Linux version 4.9.41-moby (root@11fbdc1f630f) (gcc version 6.2.1 20160822 (Alpine 6.2.1) ) #1 SMP Sun Aug 20 10:48:18 UTC 2017
Command line: BOOT_IMAGE=/vmlinuz64 root=/dev/xvdb1 console=tty0 console=tty1 console=ttyS0 mobyplatform=aws vsyscall=emulate page_poison=1 initrd=/initrd.img
x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
x86/fpu: Using 'eager' FPU context switches.
e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
NX (Execute Disable) protection: active
SMBIOS 2.4 present.
Hypervisor detected: Xen
Xen version 4.2.
Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
You might have to change the root device
from /dev/hd[a-d] to /dev/xvd[a-d]
in your root= kernel command line option
e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
found SMP MP-table at [mem 0x000fbc20-0x000fbc2f] mapped at [ffff8c53c00fbc20]
RAMDISK: [mem 0x7c2f3000-0x7fffffff]
ACPI: Early table checksum verification disabled
ACPI: RSDP 0x00000000000EA020 000024 (v02 Xen   )
ACPI: XSDT 0x00000000FC00DDC0 000054 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: FACP 0x00000000FC00DA80 0000F4 (v04 Xen    HVM      00000000 HVML 00000000)
ACPI: DSDT 0x00000000FC001CE0 00BD19 (v02 Xen    HVM      00000000 INTL 20090123)
ACPI: FACS 0x00000000FC001CA0 000040
ACPI: FACS 0x00000000FC001CA0 000040
ACPI: APIC 0x00000000FC00DB80 0000D8 (v02 Xen    HVM      00000000 HVML 00000000)
ACPI: HPET 0x00000000FC00DCD0 000038 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: WAET 0x00000000FC00DD10 000028 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: SSDT 0x00000000FC00DD40 000031 (v02 Xen    HVM      00000000 INTL 20090123)
ACPI: SSDT 0x00000000FC00DD80 000031 (v02 Xen    HVM      00000000 INTL 20090123)
Zone ranges:
  DMA      [mem 0x0000000000001000-0x0000000000ffffff]
  DMA32    [mem 0x0000000001000000-0x000000007fffffff]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000001000-0x000000000009dfff]
  node   0: [mem 0x0000000000100000-0x000000007fffffff]
Initmem setup node 0 [mem 0x0000000000001000-0x000000007fffffff]
ACPI: PM-Timer IO Port: 0xb008
IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 low level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 low level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
smpboot: Allowing 15 CPUs, 14 hotplug CPUs
e820: [mem 0x80000000-0xfbffffff] available for PCI devices
Booting paravirtualized kernel on Xen HVM
clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:15 nr_node_ids:1
percpu: Embedded 35 pages/cpu @ffff8c543be00000 s105240 r8192 d29928 u262144
PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 515976
Kernel command line: BOOT_IMAGE=/vmlinuz64 root=/dev/xvdb1 console=tty0 console=tty1 console=ttyS0 mobyplatform=aws vsyscall=emulate page_poison=1 initrd=/initrd.img
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 1978332K/2096756K available (8381K kernel code, 1408K rwdata, 2836K rodata, 1388K init, 600K bss, 118424K reserved, 0K cma-reserved)
Hierarchical RCU implementation.
    Build-time adjustment of leaf fanout to 64.
    RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=15.
RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=15
NR_IRQS:8448 nr_irqs:952 16
xen:events: Using 2-level ABI
xen:events: Xen HVM callback vector for event delivery is enabled
Console: colour VGA+ 80x25
console [tty0] enabled
Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
console [ttyS0] enabled
allocated 4194304 bytes of page_ext
clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 30580167144 ns
tsc: Fast TSC calibration using PIT
tsc: Detected 2400.108 MHz processor
Calibrating delay loop (skipped), value calculated using timer frequency.. 4800.10 BogoMIPS (lpj=24000500)
pid_max: default: 32768 minimum: 301
ACPI: Core revision 20160831
ACPI: 3 ACPI AML tables successfully acquired and loaded
Security Framework initialized
Yama: becoming mindful.
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
CPU: Physical Processor ID: 0
Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
ftrace: allocating 37150 entries in 146 pages
smpboot: Max logical packages: 15
Switched APIC routing to physical flat.
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
installing Xen timer for CPU 0
smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (family: 0x6, model: 0x3f, stepping: 0x2)
cpu 0 spinlock event irq 53
Performance Events: unsupported p6 CPU model 63 no PMU driver, software events only.
NMI watchdog: disabled (cpu0): hardware events not enabled
NMI watchdog: Shutting down hard lockup detector on all cpus
x86: Booted up 1 node, 1 CPUs
smpboot: Total of 1 processors activated (4800.10 BogoMIPS)
devtmpfs: initialized
x86/mm: Memory block size: 128MB
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 4096 (order: 6, 262144 bytes)
NET: Registered protocol family 16
cpuidle: using governor ladder
cpuidle: using governor menu
ACPI: bus type PCI registered
PCI: Using configuration type 1 for base access
HugeTLB registered 2 MB page size, pre-allocated 0 pages
ACPI: Added _OSI(Module Device)
ACPI: Added _OSI(Processor Device)
ACPI: Added _OSI(3.0 _SCP Extensions)
ACPI: Added _OSI(Processor Aggregator Device)
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
pci_bus 0000:00: root bus resource [mem 0xf0000000-0xfbffffff window]
pci_bus 0000:00: root bus resource [bus 00-ff]
pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4 ACPI
ACPI: PCI Interrupt Link [LNKA] (IRQs *5 10 11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs *5 10 11)
ACPI: Enabled 2 GPEs in block 00 to 0F
xen:balloon: Initialising balloon driver
xen_balloon: Initialising balloon driver
SCSI subsystem initialized
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
wmi: Mapper loaded
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
HPET: 3 timers in total, 0 timers will be used for per-cpu timer
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 comparators, 64-bit 62.500000 MHz counter
clocksource: Switched to clocksource xen
FS-Cache: Loaded
CacheFiles: Loaded
pnp: PnP ACPI init
system 00:00: [mem 0x00000000-0x0009ffff] could not be reserved
system 00:01: [io  0x08a0-0x08a3] has been reserved
system 00:01: [io  0x0cc0-0x0ccf] has been reserved
system 00:01: [io  0x04d0-0x04d1] has been reserved
system 00:07: [io  0x10c0-0x1141] has been reserved
system 00:07: [io  0xb044-0xb047] has been reserved
pnp: PnP ACPI: found 8 devices
clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
NET: Registered protocol family 2
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
UDP hash table entries: 1024 (order: 3, 32768 bytes)
UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
pci 0000:00:00.0: Limiting direct PCI/PCI transfers
pci 0000:00:01.0: PIIX3: Enabling Passive Release
pci 0000:00:01.0: Activating ISA DMA hang workarounds
pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
Unpacking initramfs...
Freeing initrd memory: 62516K (ffff8c543c2f3000 - ffff8c5440000000)
RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
RAPL PMU: hw unit of domain package 2^-14 Joules
RAPL PMU: hw unit of domain dram 2^-14 Joules
RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(1505227265.277:1): initialized
workingset: timestamp_bits=46 max_order=19 bucket_order=0
FS-Cache: Netfs 'nfs' registered for caching
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
FS-Cache: Netfs 'cifs' registered for caching
ntfs: driver 2.1.32 [Flags: R/O].
fuse init (API version 7.26)
9p: Installing v9fs 9p2000 file system support
FS-Cache: Netfs '9p' registered for caching
NET: Registered protocol family 38
Key type asymmetric registered
Asymmetric key parser 'x509' registered
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
io scheduler noop registered
io scheduler deadline registered (default)
io scheduler cfq registered
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
hv_vmbus: registering driver hyperv_fb
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
ACPI: Power Button [PWRF]
input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input1
ACPI: Sleep Button [SLPF]
GHES: HEST is not enabled!
xen:xen_evtchn: Event-channel device installed
xen:grant_table: Grant tables using version 1 layout
Grant table initialized
Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
00:06: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
Initializing Nozomi driver 2.1d
Non-volatile memory driver v1.3
Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
loop: module loaded
nbd: registered device at major 43
Invalid max_queues (4), will use default max: 1.
random: fast init done
VMware PVSCSI driver - version 1.0.7.0-k
hv_vmbus: registering driver hv_storvsc
scsi host0: ata_piix
scsi host1: ata_piix
ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc100 irq 14
ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc108 irq 15
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 3.2.2-k
ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
PPP generic driver version 2.4.2
PPP BSD Compression module registered
PPP Deflate Compression module registered
PPP MPPE Compression module registered
NET: Registered protocol family 24
PPTP driver version 0.8.5
VMware vmxnet3 virtual NIC driver - version 1.4.a.0-k-NAPI
xen_netfront: Initialising Xen virtual ethernet driver
blkfront: xvdb: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
hv_vmbus: registering driver hv_netvsc
Fusion MPT base driver 3.04.20
Copyright (c) 1999-2008 LSI Corporation
Fusion MPT SPI Host driver 3.04.20
aoe: AoE v85 initialised.
i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
 xvdb: xvdb1
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
hv_vmbus: registering driver hyperv_keyboard
mousedev: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: PC Speaker as /devices/platform/pcspkr/input/input3
rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
rtc_cmos 00:02: alarms up to one day, 114 bytes nvram, hpet irqs
i2c /dev entries driver
hv_utils: Registering HyperV Utility Driver
hv_vmbus: registering driver hv_util
hv_vmbus: registering driver hv_balloon
oprofile: using timer interrupt.
GACT probability on
Mirror/redirect action on
Simple TC action Loaded
netem: version 1.3
u32 classifier
    Performance counters on
    input device check on
    Actions configured
Netfilter messages via NETLINK v0.30.
nfnl_acct: registering with nfnetlink.
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ctnetlink v0.93: registering with nfnetlink.
nf_tables: (c) 2007-2009 Patrick McHardy <kaber@trash.net>
nf_tables_compat: (c) 2012 Pablo Neira Ayuso <pablo@netfilter.org>
xt_time: kernel timezone is -0000
ip_set: protocol 6
IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
IPVS: Creating netns size=2104 id=0
IPVS: ipvs loaded.
IPVS: [rr] scheduler registered.
IPVS: [wrr] scheduler registered.
IPVS: [lc] scheduler registered.
IPVS: [wlc] scheduler registered.
IPVS: [fo] scheduler registered.
IPVS: [ovf] scheduler registered.
IPVS: [lblc] scheduler registered.
IPVS: [lblcr] scheduler registered.
IPVS: [dh] scheduler registered.
IPVS: [sh] scheduler registered.
IPVS: [sed] scheduler registered.
IPVS: [nq] scheduler registered.
IPVS: ftp: loaded support on port[0] = 21
ipip: IPv4 and MPLS over IPv4 tunneling driver
gre: GRE over IPv4 demultiplexor driver
ip_gre: GRE over IPv4 tunneling driver
IPv4 over IPsec tunneling driver
ip_tables: (C) 2000-2006 Netfilter Core Team
ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
arp_tables: arp_tables: (C) 2002 David S. Miller
Initializing XFRM netlink socket
NET: Registered protocol family 10
mip6: Mobile IPv6
ip6_tables: (C) 2000-2006 Netfilter Core Team
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
ip6_gre: GRE over IPv6 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
Bridge firewalling registered
Ebtables v2.0 registered
l2tp_core: L2TP core driver, V2.0
l2tp_ppp: PPPoL2TP kernel driver, V2.0
8021q: 802.1Q VLAN Support v1.8
9pnet: Installing 9P2000 support
Key type dns_resolver registered
openvswitch: Open vSwitch switching datapath
mpls_gso: MPLS GSO support
microcode: sig=0x306f2, pf=0x1, revision=0x3a
microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
AVX2 version of gcm_enc/dec engaged.
AES CTR mode by8 optimization enabled
registered taskstats version 1
Key type big_key registered
Key type encrypted registered
rtc_cmos 00:02: setting system clock to 2017-09-12 14:41:05 UTC (1505227265)
Freeing unused kernel memory: 1388K (ffffffff89f62000 - ffffffff8a0bd000)
Write protecting the kernel read-only data: 14336k
Freeing unused kernel memory: 1844K (ffff8c5404833000 - ffff8c5404a00000)
Freeing unused kernel memory: 1260K (ffff8c5404cc5000 - ffff8c5404e00000)

   OpenRC 0.21.7.e3f10ac is starting up Linux 4.9.41-moby (x86_64)

 * Mounting /proc ... [ ok ]
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Mounting xenfs ... [ ok ]
 * Caching service dependencies ... [ ok ]
 * Mounting /sys ... [ ok ]
 * Mounting security filesystem ... [ ok ]
 * Mounting debug filesystem ... [ ok ]
 * Mounting fuse control filesystem ... [ ok ]
 * Mounting persistent storage (pstore) filesystem ... [ ok ]
 * Mounting cgroup filesystem ... [ ok ]
 * Mounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting /dev/pts ... [ ok ]
 * Mounting /dev/shm ... [ ok ]
 * Starting busybox mdev ... [ ok ]
 * Configuring host block device .../dev/xvdb1: clean, 16/65280 files, 26184/261048 blocks
Resizing disk partition: Unpartitioned space /dev/xvdb: 19 GiB, 20405288960 bytes, 39854080 sectors
The bootable flag on partition 1 is enabled now.

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
/dev/xvdb1: 16/65280 files (0.0% non-contiguous), 26184/261048 blocks
resize2fs 1.43.3 (04-Sep-2016)
Resizing the filesystem on /dev/xvdb1 to 5242872 (4k) blocks.
The filesystem on /dev/xvdb1 is now 5242872 (4k) blocks long.

/dev/xvdb1: clean, 16/1305600 files, 104782/5242872 blocks
 [ ok ]
 * Loading hardware drivers ...modprobe: module fbcon not found in modules.dep
 [ ok ]
 * Remounting filesystems ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Setting hostname ... [ ok ]
 * Creating user login records ... [ ok ]
 * sysklogd -> start: syslogd ... [ ok ]
 * sysklogd -> start: klogd ... [ ok ]
 * Starting busybox crond ... [ ok ]
 * Mounting misc binary format filesystem ... [ ok ]
 * Setting sysfs variables ... [ ok ]
 * Starting local ... [ ok ]
 * Configuring kernel parameters ... [ ok ]
 * Starting DHCP Client Daemon ...eth0: eth0: MTU set to 9001
 [ ok ]
 * Starting diagnostics server ... [ ok ]
 * Starting networking ... *   lo ... [ ok ]
 * Starting busybox acpid ... [ ok ]
 * Running system containerd ... [ ok ]
 * Running system containers ... binfmt rngd [ ok ]
 * Setting up database (AWS) ... [ ok ]
 * Configuring host settings from database ... [ ok ]
 * Starting Docker ... [ ok ]
 * Running AWS-specific initialization ...passwd: password for docker changed by root
 * Pre-load docker images ...Loading image /dockerimages/shell-aws.tar


040fd7841192: Loading layer  65.54kB/4.234MB

040fd7841192: Loading layer  4.234MB/4.234MB


ea9e7b7e033b: Loading layer  131.1kB/12.6MB

ea9e7b7e033b: Loading layer  8.782MB/12.6MB

ea9e7b7e033b: Loading layer  9.961MB/12.6MB

ea9e7b7e033b: Loading layer  11.01MB/12.6MB

ea9e7b7e033b: Loading layer  12.06MB/12.6MB

ea9e7b7e033b: Loading layer   12.6MB/12.6MB


cb95f37e4c4f: Loading layer  5.632kB/5.632kB

cb95f37e4c4f: Loading layer  5.632kB/5.632kB


404eabde7586: Loading layer  3.072kB/3.072kB

404eabde7586: Loading layer  3.072kB/3.072kB


b9a58796eb02: Loading layer   2.56kB/2.56kB

b9a58796eb02: Loading layer   2.56kB/2.56kB


647ecf8d498b: Loading layer  4.608kB/4.608kB

647ecf8d498b: Loading layer  4.608kB/4.608kB


f87e509bd97c: Loading layer  3.584kB/3.584kB

f87e509bd97c: Loading layer  3.584kB/3.584kB


177e8eb623a1: Loading layer   2.56kB/2.56kB

177e8eb623a1: Loading layer   2.56kB/2.56kB


1742135c4fb0: Loading layer  3.584kB/3.584kB

1742135c4fb0: Loading layer  3.584kB/3.584kB


9e3196da42e9: Loading layer  4.608kB/4.608kB

9e3196da42e9: Loading layer  4.608kB/4.608kB
Loaded image: docker4x/shell-aws:17.06.1-ce-aws1
 [ ok ]
 * Setup SSH ...sshkey
ssh-keygen: generating new host keys: RSA DSA ECDSA ED25519 
9b1697ab42c18071c7f813459a457d6808daba164b595402f4bdcd5435563909
 [ ok ]
 * docker: waiting for aws (50 seconds)
 * docker: waiting for aws (41 seconds)
 * docker: waiting for aws (32 seconds)
 * docker: waiting for aws (23 seconds)
 * docker: waiting for aws (14 seconds)
 * docker: waiting for aws (5 seconds)
 * Stopping docker
 * Starting Docker ... [ ok ]
Unable to find image 'docker4x/init-aws:17.06.1-ce-aws1' locally
17.06.1-ce-aws1: Pulling from docker4x/init-aws


019300c8a437: Pulling fs layer 


80d61f2126eb: Pulling fs layer 


b97447dfe678: Pulling fs layer 


7057c11ec110: Pulling fs layer 


7e2e60e666da: Pulling fs layer 


440757970fa6: Pulling fs layer 

7057c11ec110: Waiting 

7e2e60e666da: Waiting 

440757970fa6: Waiting 

80d61f2126eb: Downloading  311.3kB/30.31MB

b97447dfe678: Downloading     131B/131B

b97447dfe678: Verifying Checksum 

b97447dfe678: Download complete 

019300c8a437: Downloading  32.26kB/1.97MB

80d61f2126eb: Downloading  4.669MB/30.31MB

019300c8a437: Downloading  457.9kB/1.97MB

80d61f2126eb: Downloading  9.961MB/30.31MB

019300c8a437: Downloading  916.6kB/1.97MB

80d61f2126eb: Downloading  14.94MB/30.31MB

019300c8a437: Downloading  1.408MB/1.97MB

80d61f2126eb: Downloading  19.92MB/30.31MB

7057c11ec110: Downloading  2.673kB/2.673kB

7057c11ec110: Verifying Checksum 

7057c11ec110: Download complete 

019300c8a437: Downloading  1.867MB/1.97MB

80d61f2126eb: Downloading  24.28MB/30.31MB

019300c8a437: Verifying Checksum 

019300c8a437: Download complete 

019300c8a437: Extracting  32.77kB/1.97MB

80d61f2126eb: Downloading  28.02MB/30.31MB

80d61f2126eb: Verifying Checksum 

80d61f2126eb: Download complete 

019300c8a437: Extracting  360.4kB/1.97MB

019300c8a437: Extracting  1.966MB/1.97MB

019300c8a437: Extracting   1.97MB/1.97MB

019300c8a437: Pull complete 

80d61f2126eb: Extracting  327.7kB/30.31MB

80d61f2126eb: Extracting  4.915MB/30.31MB

80d61f2126eb: Extracting   8.52MB/30.31MB

80d61f2126eb: Extracting  10.49MB/30.31MB

80d61f2126eb: Extracting   11.8MB/30.31MB

80d61f2126eb: Extracting  13.11MB/30.31MB

80d61f2126eb: Extracting  14.42MB/30.31MB

80d61f2126eb: Extracting  16.06MB/30.31MB

80d61f2126eb: Extracting  17.69MB/30.31MB

80d61f2126eb: Extracting  18.68MB/30.31MB

80d61f2126eb: Extracting  19.99MB/30.31MB

440757970fa6: Downloading  32.77kB/2.054MB

7e2e60e666da: Downloading  32.77kB/2.051MB

80d61f2126eb: Extracting  20.32MB/30.31MB

80d61f2126eb: Extracting   21.3MB/30.31MB

80d61f2126eb: Extracting  22.61MB/30.31MB

7e2e60e666da: Downloading  196.6kB/2.051MB

440757970fa6: Downloading  195.5kB/2.054MB

440757970fa6: Downloading  2.054MB/2.054MB

7e2e60e666da: Verifying Checksum 

7e2e60e666da: Download complete 

440757970fa6: Verifying Checksum 

440757970fa6: Download complete 

80d61f2126eb: Extracting  23.27MB/30.31MB

80d61f2126eb: Extracting  24.25MB/30.31MB

80d61f2126eb: Extracting  25.56MB/30.31MB

80d61f2126eb: Extracting  26.54MB/30.31MB

80d61f2126eb: Extracting   27.2MB/30.31MB

80d61f2126eb: Extracting  28.18MB/30.31MB

80d61f2126eb: Extracting  29.16MB/30.31MB

80d61f2126eb: Extracting  29.82MB/30.31MB

80d61f2126eb: Extracting  30.15MB/30.31MB

80d61f2126eb: Extracting  30.31MB/30.31MB

80d61f2126eb: Pull complete 

b97447dfe678: Extracting     131B/131B

b97447dfe678: Extracting     131B/131B

b97447dfe678: Pull complete 

7057c11ec110: Extracting  2.673kB/2.673kB

7057c11ec110: Extracting  2.673kB/2.673kB

7057c11ec110: Pull complete 

7e2e60e666da: Extracting  32.77kB/2.051MB

7e2e60e666da: Extracting  1.671MB/2.051MB

7e2e60e666da: Extracting  2.051MB/2.051MB

7e2e60e666da: Pull complete 

440757970fa6: Extracting  32.77kB/2.054MB

440757970fa6: Extracting  1.573MB/2.054MB

440757970fa6: Extracting  2.054MB/2.054MB

440757970fa6: Extracting  2.054MB/2.054MB

440757970fa6: Pull complete 
Digest: sha256:20094658a6e20552de80c0832588befa01f22ce4b78edda9032f697699270334
Status: Downloaded newer image for docker4x/init-aws:17.06.1-ce-aws1
59ea92dd8568de5d84ea1083b3d67f152e33d89c0e7fc64fada06f777cae0d47
Unable to find image 'docker4x/guide-aws:17.06.1-ce-aws1' locally
17.06.1-ce-aws1: Pulling from docker4x/guide-aws


019300c8a437: Already exists 


d8a5426540a1: Pulling fs layer 


1c1a8034f59d: Pulling fs layer 


46decec61d7f: Pulling fs layer 


9959606dca33: Pulling fs layer 


b8531578aa13: Pulling fs layer 


0740adce4a32: Pulling fs layer 


f48230638940: Pulling fs layer 


05745acfffca: Pulling fs layer 


4bc4dcaa78d0: Pulling fs layer 


42f520538782: Pulling fs layer 


35fc73cf630a: Pulling fs layer 


c95bf13721a6: Pulling fs layer 


2be84f664045: Pulling fs layer 

9959606dca33: Waiting 

b8531578aa13: Waiting 

0740adce4a32: Waiting 

f48230638940: Waiting 

05745acfffca: Waiting 

4bc4dcaa78d0: Waiting 

42f520538782: Waiting 

35fc73cf630a: Waiting 

c95bf13721a6: Waiting 

2be84f664045: Waiting 

1c1a8034f59d: Downloading     155B/155B

1c1a8034f59d: Verifying Checksum 

1c1a8034f59d: Download complete 

46decec61d7f: Downloading     274B/274B

46decec61d7f: Verifying Checksum 

46decec61d7f: Download complete 

d8a5426540a1: Downloading  311.3kB/30.62MB

d8a5426540a1: Downloading  5.915MB/30.62MB

d8a5426540a1: Downloading  11.52MB/30.62MB

d8a5426540a1: Downloading   16.5MB/30.62MB

9959606dca33: Downloading  3.596kB/5.024kB

9959606dca33: Downloading  5.024kB/5.024kB

9959606dca33: Verifying Checksum 

9959606dca33: Download complete 

d8a5426540a1: Downloading  21.17MB/30.62MB

b8531578aa13: Downloading  1.187kB/1.187kB

b8531578aa13: Verifying Checksum 

b8531578aa13: Download complete 

d8a5426540a1: Downloading  28.02MB/30.62MB

d8a5426540a1: Verifying Checksum 

d8a5426540a1: Download complete 

d8a5426540a1: Extracting  327.7kB/30.62MB

d8a5426540a1: Extracting  5.243MB/30.62MB

d8a5426540a1: Extracting   8.52MB/30.62MB

d8a5426540a1: Extracting  10.49MB/30.62MB

d8a5426540a1: Extracting  12.12MB/30.62MB

d8a5426540a1: Extracting  13.43MB/30.62MB

d8a5426540a1: Extracting  14.09MB/30.62MB

d8a5426540a1: Extracting  16.06MB/30.62MB

d8a5426540a1: Extracting  17.04MB/30.62MB

d8a5426540a1: Extracting  18.35MB/30.62MB

f48230638940: Downloading     717B/717B

f48230638940: Verifying Checksum 

f48230638940: Download complete 

d8a5426540a1: Extracting  19.33MB/30.62MB

d8a5426540a1: Extracting  20.32MB/30.62MB

0740adce4a32: Downloading     700B/700B

0740adce4a32: Verifying Checksum 

0740adce4a32: Download complete 

d8a5426540a1: Extracting  20.64MB/30.62MB

d8a5426540a1: Extracting  21.63MB/30.62MB

d8a5426540a1: Extracting  22.94MB/30.62MB

d8a5426540a1: Extracting  23.92MB/30.62MB

d8a5426540a1: Extracting  25.23MB/30.62MB

d8a5426540a1: Extracting  26.21MB/30.62MB

d8a5426540a1: Extracting   27.2MB/30.62MB

d8a5426540a1: Extracting  27.85MB/30.62MB

d8a5426540a1: Extracting  28.84MB/30.62MB

d8a5426540a1: Extracting  29.49MB/30.62MB

05745acfffca: Downloading     357B/357B

05745acfffca: Verifying Checksum 

05745acfffca: Download complete 

d8a5426540a1: Extracting  30.47MB/30.62MB

d8a5426540a1: Extracting  30.62MB/30.62MB

4bc4dcaa78d0: Downloading  1.258kB/1.258kB

4bc4dcaa78d0: Verifying Checksum 

4bc4dcaa78d0: Download complete 

42f520538782: Downloading  32.77kB/2.051MB

d8a5426540a1: Pull complete 

1c1a8034f59d: Extracting     155B/155B

1c1a8034f59d: Extracting     155B/155B

42f520538782: Verifying Checksum 

42f520538782: Download complete 

1c1a8034f59d: Pull complete 

46decec61d7f: Extracting     274B/274B

46decec61d7f: Extracting     274B/274B

46decec61d7f: Pull complete 

9959606dca33: Extracting  5.024kB/5.024kB

9959606dca33: Extracting  5.024kB/5.024kB

35fc73cf630a: Downloading     354B/354B

35fc73cf630a: Verifying Checksum 

35fc73cf630a: Download complete 

9959606dca33: Pull complete 

b8531578aa13: Extracting  1.187kB/1.187kB

b8531578aa13: Extracting  1.187kB/1.187kB

b8531578aa13: Pull complete 

0740adce4a32: Extracting     700B/700B

0740adce4a32: Extracting     700B/700B

0740adce4a32: Pull complete 

f48230638940: Extracting     717B/717B

f48230638940: Extracting     717B/717B

c95bf13721a6: Downloading  32.77kB/2.059MB

f48230638940: Pull complete 

c95bf13721a6: Verifying Checksum 

c95bf13721a6: Download complete 

05745acfffca: Extracting     357B/357B

05745acfffca: Extracting     357B/357B

05745acfffca: Pull complete 

4bc4dcaa78d0: Extracting  1.258kB/1.258kB

4bc4dcaa78d0: Extracting  1.258kB/1.258kB

2be84f664045: Downloading     303B/303B

2be84f664045: Verifying Checksum 

2be84f664045: Download complete 

4bc4dcaa78d0: Pull complete 

42f520538782: Extracting  32.77kB/2.051MB

42f520538782: Extracting  1.606MB/2.051MB

42f520538782: Extracting  2.051MB/2.051MB

42f520538782: Pull complete 

35fc73cf630a: Extracting     354B/354B

35fc73cf630a: Extracting     354B/354B

35fc73cf630a: Pull complete 

c95bf13721a6: Extracting  32.77kB/2.059MB

c95bf13721a6: Extracting  1.606MB/2.059MB

c95bf13721a6: Extracting  2.059MB/2.059MB

c95bf13721a6: Extracting  2.059MB/2.059MB

c95bf13721a6: Pull complete 

2be84f664045: Extracting     303B/303B

2be84f664045: Extracting     303B/303B

2be84f664045: Pull complete 
Digest: sha256:c0f0b28d5f4ac3da3db03f1addb0e15171a0a33435529570fac759600338a40a
Status: Downloaded newer image for docker4x/guide-aws:17.06.1-ce-aws1
2941965d78ea4b2e1be29fba6fecf34fd162b0ad9364f04d97b50cdd22682535
Unable to find image 'docker4x/meta-aws:17.06.1-ce-aws1' locally
17.06.1-ce-aws1: Pulling from docker4x/meta-aws


019300c8a437: Already exists 


8835955ffc8b: Pulling fs layer 


4420fa30b71d: Pulling fs layer 


21084da2cbed: Pulling fs layer 

21084da2cbed: Downloading   45.8kB/3.104MB

21084da2cbed: Verifying Checksum 

21084da2cbed: Download complete 

4420fa30b71d: Downloading  32.31kB/3.104MB

8835955ffc8b: Downloading  16.38kB/1.379MB

4420fa30b71d: Verifying Checksum 

4420fa30b71d: Download complete 

8835955ffc8b: Verifying Checksum 

8835955ffc8b: Download complete 

8835955ffc8b: Extracting  32.77kB/1.379MB

8835955ffc8b: Extracting  393.2kB/1.379MB

8835955ffc8b: Extracting  1.379MB/1.379MB

8835955ffc8b: Pull complete 

4420fa30b71d: Extracting  32.77kB/3.104MB

4420fa30b71d: Extracting   1.54MB/3.104MB

4420fa30b71d: Extracting  3.104MB/3.104MB

4420fa30b71d: Pull complete 

21084da2cbed: Extracting  32.77kB/3.104MB

21084da2cbed: Extracting  1.311MB/3.104MB

21084da2cbed: Extracting  2.884MB/3.104MB

21084da2cbed: Extracting  3.104MB/3.104MB

21084da2cbed: Pull complete 
Digest: sha256:c15cb3556dd55343238eb5603c77908bb6ad5ca1804c271e2c6916b33cec0bc1
Status: Downloaded newer image for docker4x/meta-aws:17.06.1-ce-aws1
30e4e55415c66d49c31449164376f3265746fd6da1062844a724e5fddc98727d
Unable to find image 'docker4x/l4controller-aws:17.06.1-ce-aws1' locally
17.06.1-ce-aws1: Pulling from docker4x/l4controller-aws


019300c8a437: Already exists 


d33a51ac2f71: Pulling fs layer 


7a9898cfa178: Pulling fs layer 

7a9898cfa178: Downloading  51.14kB/3.616MB

d33a51ac2f71: Downloading  16.38kB/1.379MB

7a9898cfa178: Downloading  3.616MB/3.616MB

7a9898cfa178: Verifying Checksum 

7a9898cfa178: Download complete 

d33a51ac2f71: Verifying Checksum 

d33a51ac2f71: Download complete 

d33a51ac2f71: Extracting  32.77kB/1.379MB

d33a51ac2f71: Extracting  1.379MB/1.379MB

d33a51ac2f71: Pull complete 

7a9898cfa178: Extracting  65.54kB/3.616MB

7a9898cfa178: Extracting  1.769MB/3.616MB

7a9898cfa178: Extracting  3.539MB/3.616MB

7a9898cfa178: Extracting  3.616MB/3.616MB

7a9898cfa178: Pull complete 
Digest: sha256:d9e5879d0fce5540c3ec8b488c6a3e9855fbfd71945badd9119a9a4dd372d4ba
Status: Downloaded newer image for docker4x/l4controller-aws:17.06.1-ce-aws1
591ee4bef9174c3feb575f66af07bc06f316453853c01f3ac3f6dac2b2995943
17.06.1-ce-aws1: Pulling from docker4x/cloudstor


d96356d3bebc: Downloading  179.1kB/17.69MB

d96356d3bebc: Downloading  6.667MB/17.69MB

d96356d3bebc: Downloading  13.34MB/17.69MB

d96356d3bebc: Downloading  17.69MB/17.69MB

d96356d3bebc: Verifying Checksum 

d96356d3bebc: Download complete 
Digest: sha256:e68672569d7c6c916bddc84eb12c15d2ca3a590b1ba5d590bc16aa26755eef09
Status: Downloaded newer image for docker4x/cloudstor:17.06.1-ce-aws1
Installed plugin docker4x/cloudstor:17.06.1-ce-aws1
 [ ok ]
 * Starting chronyd ... [ ok ]
✓ Drive found: xvdb
✓ Drive mounted: /dev/xvdb1 on /var type ext4 (rw,relatime,data=ordered)
✓ Network connected:           inet addr:10.2.1.141  Bcast:10.2.1.255  Mask:255.255.255.0
✓ Process dockerd running: dockerd --pidfile=/run/docker.pid -H unix:///var/run/docker.sock --debug --storage-driver overlay2
✓ Process containerd running: docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc --debug
docker-containerd-shim 9b1697ab42c18071c7f813459a457d6808daba164b595402f4bdcd5435563909 /var/run/docker/libcontainerd/9b1697ab42c18071c7f813459a457d6808daba164b595402f4bdcd5435563909 docker-runc
docker-containerd-shim 59ea92dd8568de5d84ea1083b3d67f152e33d89c0e7fc64fada06f777cae0d47 /var/run/docker/libcontainerd/59ea92dd8568de5d84ea1083b3d67f152e33d89c0e7fc64fada06f777cae0d47 docker-runc
docker-containerd-shim 2941965d78ea4b2e1be29fba6fecf34fd162b0ad9364f04d97b50cdd22682535 /var/run/docker/libcontainerd/2941965d78ea4b2e1be29fba6fecf34fd162b0ad9364f04d97b50cdd22682535 docker-runc
docker-containerd-shim 30e4e55415c66d49c31449164376f3265746fd6da1062844a724e5fddc98727d /var/run/docker/libcontainerd/30e4e55415c66d49c31449164376f3265746fd6da1062844a724e5fddc98727d docker-runc
docker-containerd-shim 591ee4bef9174c3feb575f66af07bc06f316453853c01f3ac3f6dac2b2995943 /var/run/docker/libcontainerd/591ee4bef9174c3feb575f66af07bc06f316453853c01f3ac3f6dac2b2995943 docker-runc
docker-containerd-shim 2c4c0e8f0336310375af789412c492a688e52f662a8621a72df2be88ec8e66fa /var/run/docker/libcontainerd/2c4c0e8f0336310375af789412c492a688e52f662a8621a72df2be88ec8e66fa docker-runc
✓ Docker daemon working
✓ Docker daemon version: 17.06.1-ce
✓ Diagnostics server running: /usr/bin/diagnostics-server -http
✓ System containerd server running: /usr/bin/containerd
✓ System containerd working
 * Starting Hyper-V daemon: hv_kvp_daemon ... [ ok ]
 * Starting Hyper-V daemon: hv_vss_daemon ... [ ok ]
 * Adjusting oom killer settings ... [ ok ]

Welcome to Moby

                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""\___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/

/ # 
RehanSaeed commented 7 years ago

It turns out that this was caused by a container consuming too many node resources: when a node is under too much load, it appears to be offline, so Docker for AWS terminates the EC2 instance. I have since increased the size of my nodes. Are there any timeout values that could be increased to avoid this problem?
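
For anyone hitting the same thing, one possible mitigation besides larger nodes is to cap the offending service so a single container cannot starve the node. The sketch below is only an illustration: the `limit_service` helper, the `RUN` switch, and the service name `web` are hypothetical and not part of Docker for AWS. It relies on the `--limit-cpu` and `--limit-memory` flags of `docker service update`, and it only prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical helper: build a `docker service update` command that caps
# the CPU and memory a swarm service may use on a node.
# By default the command is only printed; set RUN=1 to execute it
# (this has to happen on a swarm manager node).
limit_service() {
  service="$1"   # service name (assumed, e.g. "web")
  cpus="$2"      # CPU cap in cores, e.g. 0.5
  memory="$3"    # memory cap, e.g. 512M
  cmd="docker service update --limit-cpu $cpus --limit-memory $memory $service"
  if [ "${RUN:-0}" = "1" ]; then
    $cmd
  else
    echo "$cmd"
  fi
}
```

Calling `limit_service web 0.5 512M` prints the command for review, and `RUN=1 limit_service web 0.5 512M` applies it. The right limits depend on the workload and instance size, so the values here are placeholders.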