docker-archive / for-aws


Swarm Mode Failing to Initialize & Nodes Failing to Join Cluster #92

Closed RehanSaeed closed 7 years ago

RehanSaeed commented 7 years ago

Expected behavior

Start or upgrade a new swarm, all nodes should join the cluster. Also, docker-diagnose should return a session ID.

Actual behavior

Randomly, 1 or 2 of my 3 nodes fail to initialize swarm mode. Also, docker-diagnose returns nothing due to a timeout. I think both issues may be related.

Information

Copied from https://github.com/docker/for-aws/issues/85:

I confirmed that the meta-aws container is running on all three of my nodes. When I curl {IP Address}, I do get a response outputting /token. When I curl {IP Address}:9024/token/manager/, I get an error:

~ $ curl 10.2.0.209:9024/token/manager/
curl: (52) Empty reply from server
~ $ curl 10.2.1.232:9024/token/manager/
curl: (52) Empty reply from server
~ $ curl 10.2.2.187:9024/token/manager/
curl: (52) Empty reply from server
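The same check can be scripted so that each node reports a one-line result. This is only a rough sketch: the function names `describe_curl_exit` and `probe_node` are made up for illustration, and the 5-second timeout is an arbitrary choice. The port and path are the ones used in the commands above, and exit status 52 is curl's code for an empty reply.

```shell
#!/bin/sh
# Translate a curl exit status into a short description.
# 52 is the status behind "curl: (52) Empty reply from server".
describe_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        7)  echo "failed to connect" ;;
        28) echo "timed out" ;;
        52) echo "empty reply from server" ;;
        *)  echo "curl exit code $1" ;;
    esac
}

# Probe the manager token endpoint on one node and print "<ip>: <result>".
# (hypothetical helper; the 5 second timeout is an assumption)
probe_node() {
    ip="$1"
    curl --silent --max-time 5 --output /dev/null "http://${ip}:9024/token/manager/"
    status=$?
    echo "${ip}: $(describe_curl_exit "$status")"
}
```

Usage, from a node inside the VPC: `for ip in 10.2.0.209 10.2.1.232 10.2.2.187; do probe_node "$ip"; done`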

Unfortunately, for the swarm initialization failure issue, 10.2.0.209 got blown away due to issue https://github.com/docker/for-aws/issues/52. I have three new nodes which joined the swarm correctly but still return curl: (52) Empty reply from server when I curl that endpoint. Here are the system logs for one of my nodes (I have also attached the syslog, syslog.txt):

Linux version 4.9.36-moby (root@11fbdc1f630f) (gcc version 6.2.1 20160822 (Alpine 6.2.1) ) #1 SMP Tue Jul 11 02:00:07 UTC 2017
Command line: BOOT_IMAGE=/vmlinuz64 root=/dev/xvdb1 console=tty0 console=tty1 console=ttyS0 mobyplatform=aws vsyscall=emulate page_poison=1 initrd=/initrd.img
x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
x86/fpu: Using 'eager' FPU context switches.
e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007fffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
NX (Execute Disable) protection: active
SMBIOS 2.4 present.
Hypervisor detected: Xen
Xen version 4.2.
Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
You might have to change the root device
from /dev/hd[a-d] to /dev/xvd[a-d]
in your root= kernel command line option
e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
found SMP MP-table at [mem 0x000fbc20-0x000fbc2f] mapped at [ffff9f76400fbc20]
RAMDISK: [mem 0x7c876000-0x7fffffff]
ACPI: Early table checksum verification disabled
ACPI: RSDP 0x00000000000EA020 000024 (v02 Xen   )
ACPI: XSDT 0x00000000FC00DDC0 000054 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: FACP 0x00000000FC00DA80 0000F4 (v04 Xen    HVM      00000000 HVML 00000000)
ACPI: DSDT 0x00000000FC001CE0 00BD19 (v02 Xen    HVM      00000000 INTL 20090123)
ACPI: FACS 0x00000000FC001CA0 000040
ACPI: FACS 0x00000000FC001CA0 000040
ACPI: APIC 0x00000000FC00DB80 0000D8 (v02 Xen    HVM      00000000 HVML 00000000)
ACPI: HPET 0x00000000FC00DCD0 000038 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: WAET 0x00000000FC00DD10 000028 (v01 Xen    HVM      00000000 HVML 00000000)
ACPI: SSDT 0x00000000FC00DD40 000031 (v02 Xen    HVM      00000000 INTL 20090123)
ACPI: SSDT 0x00000000FC00DD80 000031 (v02 Xen    HVM      00000000 INTL 20090123)
Zone ranges:
  DMA      [mem 0x0000000000001000-0x0000000000ffffff]
  DMA32    [mem 0x0000000001000000-0x000000007fffffff]
  Normal   empty
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x0000000000001000-0x000000000009dfff]
  node   0: [mem 0x0000000000100000-0x000000007fffffff]
Initmem setup node 0 [mem 0x0000000000001000-0x000000007fffffff]
ACPI: PM-Timer IO Port: 0xb008
IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-47
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 low level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 low level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
smpboot: Allowing 15 CPUs, 14 hotplug CPUs
e820: [mem 0x80000000-0xfbffffff] available for PCI devices
Booting paravirtualized kernel on Xen HVM
clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:15 nr_node_ids:1
percpu: Embedded 35 pages/cpu @ffff9f76bc400000 s105176 r8192 d29992 u262144
PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 515976
Kernel command line: BOOT_IMAGE=/vmlinuz64 root=/dev/xvdb1 console=tty0 console=tty1 console=ttyS0 mobyplatform=aws vsyscall=emulate page_poison=1 initrd=/initrd.img
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Memory: 1983976K/2096756K available (837K kernel code, 1409K rwdata, 2836K rodata, 1388K init, 600K bss, 112780K reserved, 0K cma-reserved)
Hierarchical RCU implementation.
    Build-time adjustment of leaf fanout to 64.
    RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=15.
RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=15
NR_IRQS:8448 nr_irqs:952 16
xen:events: Using 2-level ABI
xen:events: Xen HVM callback vector for event delivery is enabled
Console: colour VGA+ 80x25
console [tty0] enabled
Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
console [ttyS0] enabled
allocated 4194304 bytes of page_ext
clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 30580167144 ns
tsc: Fast TSC calibration using PIT
tsc: Detected 2400.142 MHz processor
Calibrating delay loop (skipped), value calculated using timer frequency.. 4800.08 BogoMIPS (lpj=24000420)
pid_max: default: 32768 minimum: 301
ACPI: Core revision 20160831
ACPI: 3 ACPI AML tables successfully acquired and loaded
Security Framework initialized
Yama: becoming mindful.
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
CPU: Physical Processor ID: 0
Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
ftrace: allocating 37157 entries in 146 pages
smpboot: Max logical packages: 15
Switched APIC routing to physical flat.
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
installing Xen timer for CPU 0
smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (family: 0x6, model: 0x3f, stepping: 0x2)
cpu 0 spinlock event irq 53
Performance Events: unsupported p6 CPU model 63 no PMU driver, software events only.
NMI watchdog: disabled (cpu0): hardware events not enabled
NMI watchdog: Shutting down hard lockup detector on all cpus
x86: Booted up 1 node, 1 CPUs
smpboot: Total of 1 processors activated (4800.08 BogoMIPS)
devtmpfs: initialized
x86/mm: Memory block size: 128MB
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 4096 (order: 6, 262144 bytes)
NET: Registered protocol family 16
cpuidle: using governor ladder
cpuidle: using governor menu
ACPI: bus type PCI registered
PCI: Using configuration type 1 for base access
HugeTLB registered 2 MB page size, pre-allocated 0 pages
ACPI: Added _OSI(Module Device)
ACPI: Added _OSI(Processor Device)
ACPI: Added _OSI(3.0 _SCP Extensions)
ACPI: Added _OSI(Processor Aggregator Device)
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
acpi PNP0A03:00 fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
pci_bus 0000:00: root bus resource [mem 0xf0000000-0xfbffffff window]
pci_bus 0000:00: root bus resource [bus 00-ff]
pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4 ACPI
ACPI: PCI Interrupt Link [LNKA] (IRQs *5 10 11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs *5 10 11)
ACPI: Enabled 2 GPEs in block 00 to 0F
xen:balloon: Initialising balloon driver
xen_balloon: Initialising balloon driver
SCSI subsystem initialized
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
wmi: Mapper loaded
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
HPET: 3 timers in total, 0 timers will be used for per-cpu timer
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 comparators, 64-bit 62.500000 MHz counter
clocksource: Switched to clocksource xen
FS-Cache: Loaded
CacheFiles: Loaded
pnp: PnP ACPI init
system 00:00: [mem 0x00000000-0x0009ffff] could not be reserved
system 00:01: [io  0x08a0-0x08a3] has been reserved
system 00:01: [io  0x0cc0-0x0ccf] has been reserved
system 00:01: [io  0x04d0-0x04d1] has been reserved
system 00:07: [io  0x10c0-0x1141] has been reserved
system 00:07: [io  0xb044-0xb047] has been reserved
pnp: PnP ACPI: found 8 devices
clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
NET: Registered protocol family 2
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
UDP hash table entries: 1024 (order: 3, 32768 bytes)
UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
pci 0000:00:00.0: Limiting direct PCI/PCI transfers
pci 0000:00:01.0: PIIX3: Enabling Passive Release
pci 0000:00:01.0: Activating ISA DMA hang workarounds
pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
Unpacking initramfs...
Freeing initrd memory: 56872K (ffff9f76bc876000 - ffff9f76c0000000)
RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
RAPL PMU: hw unit of domain package 2^-14 Joules
RAPL PMU: hw unit of domain dram 2^-14 Joules
RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(1502888853.299:1): initialized
workingset: timestamp_bits=46 max_order=19 bucket_order=0
FS-Cache: Netfs 'nfs' registered for caching
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
FS-Cache: Netfs 'cifs' registered for caching
ntfs: driver 2.1.32 [Flags: R/O].
fuse init (API version 7.26)
9p: Installing v9fs 9p2000 file system support
FS-Cache: Netfs '9p' registered for caching
NET: Registered protocol family 38
Key type asymmetric registered
Asymmetric key parser 'x509' registered
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
io scheduler noop registered
io scheduler deadline registered (default)
io scheduler cfq registered
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
hv_vmbus: registering driver hyperv_fb
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
ACPI: Power Button [PWRF]
input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input1
ACPI: Sleep Button [SLPF]
GHES: HEST is not enabled!
xen:xen_evtchn: Event-channel device installed
xen:grant_table: Grant tables using version 1 layout
Grant table initialized
Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
00:06: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
Initializing Nozomi driver 2.1d
Non-volatile memory driver v1.3
Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, margin is 60 seconds).
loop: module loaded
nbd: registered device at major 43
Invalid max_queues (4), will use default max: 1.
VMware PVSCSI driver - version 1.0.7.0-k
hv_vmbus: registering driver hv_storvsc
random: fast init done
scsi host0: ata_piix
scsi host1: ata_piix
ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc100 irq 14
ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc108 irq 15
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 3.2.2-k
ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
PPP generic driver version 2.4.2
PPP BSD Compression module registered
PPP Deflate Compression module registered
PPP MPPE Compression module registered
NET: Registered protocol family 24
PPTP driver version 0.8.5
VMware vmxnet3 virtual NIC driver - version 1.4.a.0-k-NAPI
xen_netfront: Initialising Xen virtual ethernet driver
blkfront: xvdb: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
hv_vmbus: registering driver hv_netvsc
Fusion MPT base driver 3.04.20
Copyright (c) 1999-2008 LSI Corporation
Fusion MPT SPI Host driver 3.04.20
aoe: AoE v85 initialised.
i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
 xvdb: xvdb1
hv_vmbus: registering driver hyperv_keyboard
mousedev: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
input: PC Speaker as /devices/platform/pcspkr/input/input3
rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
rtc_cmos 00:02: alarms up to one day, 114 bytes nvram, hpet irqs
i2c /dev entries driver
hv_utils: Registering HyperV Utility Driver
hv_vmbus: registering driver hv_util
hv_vmbus: registering driver hv_balloon
oprofile: using timer interrupt.
GACT probability on
Mirror/redirect action on
Simple TC action Loaded
netem: version 1.3
u32 classifier
    Performance counters on
    input device check on
    Actions configured
Netfilter messages via NETLINK v0.30.
nfnl_acct: registering with nfnetlink.
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ctnetlink v0.93: registering with nfnetlink.
nf_tables: (c) 2007-2009 Patrick McHardy <kaber@trash.net>
nf_tables_compat: (c) 2012 Pablo Neira Ayuso <pablo@netfilter.org>
xt_time: kernel timezone is -0000
ip_set: protocol 6
IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
IPVS: Creating netns size=2104 id=0
IPVS: ipvs loaded.
IPVS: [rr] scheduler registered.
IPVS: [wrr] scheduler registered.
IPVS: [lc] scheduler registered.
IPVS: [wlc] scheduler registered.
IPVS: [fo] scheduler registered.
IPVS: [ovf] scheduler registered.
IPVS: [lblc] scheduler registered.
IPVS: [lblcr] scheduler registered.
IPVS: [dh] scheduler registered.
IPVS: [sh] scheduler registered.
IPVS: [sed] scheduler registered.
IPVS: [nq] scheduler registered.
IPVS: ftp: loaded support on port[0] = 21
ipip: IPv4 and MPLS over IPv4 tunneling driver
gre: GRE over IPv4 demultiplexor driver
ip_gre: GRE over IPv4 tunneling driver
IPv4 over IPsec tunneling driver
ip_tables: (C) 2000-2006 Netfilter Core Team
ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
arp_tables: arp_tables: (C) 2002 David S. Miller
Initializing XFRM netlink socket
NET: Registered protocol family 10
mip6: Mobile IPv6
ip6_tables: (C) 2000-2006 Netfilter Core Team
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
ip6_gre: GRE over IPv6 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
Bridge firewalling registered
Ebtables v2.0 registered
l2tp_core: L2TP core driver, V2.0
l2tp_ppp: PPPoL2TP kernel driver, V2.0
8021q: 802.1Q VLAN Support v1.8
9pnet: Installing 9P2000 support
Key type dns_resolver registered
openvswitch: Open vSwitch switching datapath
mpls_gso: MPLS GSO support
microcode: sig=0x306f2, pf=0x1, revision=0x36
microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
AVX2 version of gcm_enc/dec engaged.
AES CTR mode by8 optimization enabled
registered taskstats version 1
Key type big_key registered
Key type encrypted registered
rtc_cmos 00:02: setting system clock to 2017-08-16 13:07:33 UTC (1502888853)
Freeing unused kernel memory: 1388K (ffffffff82f62000 - ffffffff830bd000)
Write protecting the kernel read-only data: 14336k
Freeing unused kernel memory: 1848K (ffff9f7657832000 - ffff9f7657a00000)
Freeing unused kernel memory: 1260K (ffff9f7657cc5000 - ffff9f7657e00000)

   OpenRC 0.21.7.e3f10ac is starting up Linux 4.9.36-moby (x86_64)

 * Mounting /proc ... [ ok ]
 * Mounting /run ... * /run/openrc: creating directory
 * /run/lock: creating directory
 * /run/lock: correcting owner
 * Mounting xenfs ... [ ok ]
 * Caching service dependencies ... [ ok ]
 * Mounting /sys ... [ ok ]
 * Mounting security filesystem ... [ ok ]
 * Mounting debug filesystem ... [ ok ]
 * Mounting fuse control filesystem ... [ ok ]
 * Mounting persistent storage (pstore) filesystem ... [ ok ]
 * Mounting cgroup filesystem ... [ ok ]
 * Mounting devtmpfs on /dev ... [ ok ]
 * Mounting /dev/mqueue ... [ ok ]
 * Mounting /dev/pts ... [ ok ]
 * Mounting /dev/shm ... [ ok ]
 * Starting busybox mdev ... [ ok ]
 * Configuring host block device .../dev/xvdb1: clean, 16/65280 files, 24773/261048 blocks
Resizing disk partition: Unpartitioned space /dev/xvdb: 19 GiB, 20405288960 bytes, 39854080 sectors
The bootable flag on partition 1 is enabled now.

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
/dev/xvdb1: 16/65280 files (0.0% non-contiguous), 24773/261048 blocks
resize2fs 1.43.3 (04-Sep-2016)
Resizing the filesystem on /dev/xvdb1 to 5242872 (4k) blocks.
The filesystem on /dev/xvdb1 is now 5242872 (4k) blocks long.

/dev/xvdb1: clean, 16/1305600 files, 103371/5242872 blocks
 [ ok ]
 * Loading hardware drivers ...modprobe: module fbcon not found in modules.dep
 [ ok ]
 * Remounting filesystems ... [ ok ]
 * Mounting local filesystems ... [ ok ]
 * Setting hostname ... [ ok ]
 * Creating user login records ... [ ok ]
 * sysklogd -> start: syslogd ... [ ok ]
 * sysklogd -> start: klogd ... [ ok ]
 * Starting busybox crond ... [ ok ]
 * Mounting misc binary format filesystem ... [ ok ]
 * Setting sysfs variables ... [ ok ]
 * Starting local ... [ ok ]
 * Configuring kernel parameters ... [ ok ]
 * Starting DHCP Client Daemon ...eth0: eth0: MTU set to 9001
 [ ok ]
 * Starting diagnostics server ... [ ok ]
 * Starting networking ... *   lo ... [ ok ]
 * Starting busybox acpid ... [ ok ]
 * Running system containerd ... [ ok ]
 * Running system containers ... binfmt rngd [ ok ]
 * Setting up database (AWS) ... [ ok ]
 * Configuring host settings from database ... [ ok ]
 * Starting Docker ... [ ok ]
 * Running AWS-specific initialization ...passwd: password for docker changed by root
 * Pre-load docker images ...ls: /dockerimages/*.tar: No such file or directory
 [ ok ]
 * Setup SSH ...sshkey
Unable to find image 'docker4x/shell-aws:17.06.0-ce-aws2' locally
17.06.0-ce-aws2: Pulling from docker4x/shell-aws
...Omitted
3f481ff52f40: Pull complete 
Digest: sha256:bbd5085e34c496c72e6a4c731c6a4e17103a20cd7bc2f17e48039fb5ee9f016a
Status: Downloaded newer image for docker4x/shell-aws:17.06.0-ce-aws2
ssh-keygen: generating new host keys: RSA DSA ECDSA ED25519 
f35e9a66a0e432382183a4f7b217930858e36577c684a051b4168546f06eb60b
 [ ok ]
sh: yes: unknown operand
 * docker: waiting for aws (50 seconds)
 * docker: waiting for aws (41 seconds)
 * docker: waiting for aws (32 seconds)
 * docker: waiting for aws (23 seconds)
 * docker: waiting for aws (14 seconds)
 * docker: waiting for aws (5 seconds)
 * Stopping docker
 * Starting Docker ... [ ok ]

FrenchBen commented 7 years ago

If you see sh: yes: unknown operand in the boot log, it means that something didn't get loaded properly on boot. You can check by running docker ps on the managers; there's a chance that no container was deployed beyond the shell.
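One way to script that check is sketched below. It makes assumptions: `only_shell_running` is a hypothetical helper name, and it relies on the shell container's image name starting with `docker4x/shell-aws:`, as in the boot log above. It reads one image name per line on stdin, so it can be fed from `docker ps --format '{{.Image}}'`.

```shell
#!/bin/sh
# Succeeds (exit 0) only when at least one container is listed and every
# listed image is the shell image, i.e. nothing beyond the shell was deployed.
# (hypothetical helper; the image prefix is taken from the boot log)
only_shell_running() {
    images=$(cat)
    # No running containers at all is a different failure; report "false".
    [ -n "$images" ] || return 1
    # grep -v selects lines that are NOT the shell image; with -q it exits 0
    # if any such line exists, so the result is negated.
    ! printf '%s\n' "$images" | grep -qv '^docker4x/shell-aws:'
}
```

Usage on a manager: `docker ps --format '{{.Image}}' | only_shell_running && echo "only the shell container is running"`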

I believe there's a fix for it in our current build of 17.06.1-CE, which is being tested and should be pushed out to stable soon.

In the meantime, you can try the test release of 17.07 and see if you can replicate the issue: https://editions-us-east-1.s3.amazonaws.com/aws/test/Docker.tmpl

RehanSaeed commented 7 years ago

Thanks for the update, I'll wait for the 17.06.1 bug fix template.

RehanSaeed commented 7 years ago

The new update seems to have fixed this issue but raised others. I've raised https://github.com/docker/for-aws/issues/100