brikis98 / infrastructure-as-code-talk

Sample code for the talk "Infrastructure-as-code: running microservices on AWS with Docker, ECS, and Terraform"
http://www.ybrikman.com/writing/2016/03/31/infrastructure-as-code-microservices-aws-docker-terraform-ecs/

issue when running the default code #6

Closed grm closed 7 years ago

grm commented 7 years ago

Hello,

I adapted your code, and also tried yours unchanged, and I still have an issue I don't understand. The site is not working, and I see a problem in the EC2 container management system. In the ECS cluster, the desired count is what I configured, but the running count and the pending count are both 0. In the autoscaling tab, I have this message: "Scalable Target No Auto Scaling resources configured for this service. Click the update button to configure Auto Scaling for tasks"

Also, when looking at the event tab for the service, there is a message "service rails-frontend was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster. For more information, see the Troubleshooting section."

Do you have any idea how this issue can be solved ?

Thanks, Grm

brikis98 commented 7 years ago

Sounds like no EC2 Instances have registered in your ECS Cluster. The most likely cause if you haven't run any of this before is that you haven't accepted the license agreement for the Amazon ECS Optimized AMI, so if you check the activity tab for the Auto Scaling Group, you'll see that each EC2 Instance it tries to launch fails because of this license issue. To fix it, accept the agreement on the Amazon ECS Optimized AMI page.

If that's not the issue, some other things to check:

  1. Is there an Auto Scaling Group running several EC2 Instances?
  2. Do the EC2 Instances launch successfully?
  3. Do the EC2 Instances configure the proper ECS Cluster to use in User Data?
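
The first two checks can also be run from a terminal instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and configured with credentials; the function names and the example cluster and Auto Scaling Group names are hypothetical placeholders, not values from this repo.

```shell
#!/bin/bash
# Sketch only: assumes the AWS CLI is installed and has credentials.
# Function names and example arguments are hypothetical.

# Print how many container instances have registered in the ECS Cluster.
# A result of 0 matches the "No Container Instances were found" event.
registered_instance_count() {
  aws ecs list-container-instances \
    --cluster "$1" \
    --query 'length(containerInstanceArns)' \
    --output text
}

# Print up to five scaling activities for the Auto Scaling Group.
# Failed launches show up here with their status and description.
recent_scaling_activities() {
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$1" \
    --query 'Activities[:5].[StartTime,StatusCode,Description]' \
    --output text
}

# Example usage (placeholder names):
#   registered_instance_count my-ecs-cluster
#   recent_scaling_activities my-ecs-asg
```
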

grm commented 7 years ago

Hello,

Thank you for your help !

I can see that instances are launched in the activity tab for the autoscaling group :+1:

At 2016-10-29T23:09:24Z a user request created an AutoScalingGroup changing the desired capacity from 0 to 1.
At 2016-10-29T23:09:35Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1.

And the task is successful. I have already accepted the license agreement: the AMI page now only shows me the usage instructions.

Regarding the other points :

  1. There is only one instance inside the autoscaling group, but I think that should still work, no? (I tried with 5 earlier and the result was the same.) But when I look at the "Instances" tab in the load balancer configuration, it seems there are no instances, as if they were not mapped automatically. Maybe this is normal.
  2. It seems that they are launched successfully: I can see one instance running in the console ("Instances" on the left), and its ID matches the one running in the autoscaling group.
  3. How can I check that? When looking at the cluster, I can see that there are no instances defined in the "ECS Instances" tab. All I can see is the message "Add additional ECS Instances using Auto Scaling or Amazon EC2." and "no results" printed in the table at the bottom of the ECS Instances tab.

Grm

brikis98 commented 7 years ago

> There is only one instance inside the autoscaling group, but I think that should still work, no?

One instance is fine for a quick test. Obviously, you'll want more for high availability if you're going to use this for a production project in the future.

> But when I look at the "Instances" tab in the load balancer configuration, it seems there are no instances, as if they were not mapped automatically.

Nothing will show up in the ELB until the ECS Services are deployed. And that won't happen until the EC2 Instances register in the ECS Cluster, so that's the real problem we need to solve.

> How can I check that?

The Terraform code should configure each EC2 Instance in the ASG to register in the ECS Cluster as follows: https://github.com/brikis98/infrastructure-as-code-talk/blob/master/terraform-templates/ecs-cluster.tf#L49

To see if this is happening, go to the EC2 Console, click the checkbox to the left of the EC2 Instance, click the gray "Actions" button, and select "Instance Settings -> View/Change User Data". Let me know what you see.

If the User Data contents look fine, then it's worth checking syslog for any errors. That process is the same, except when you click the "Actions" button, select "Instance Settings -> Get System Log".
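
If you prefer the command line, both checks can be sketched with the AWS CLI. This is a rough example, assuming the AWS CLI is installed and configured; the function names and the instance ID in the usage comment are hypothetical, and the grep patterns are only a guess at what a problem line looks like.

```shell
#!/bin/bash
# Sketch only: assumes the AWS CLI is installed and has credentials.
# Function names are hypothetical; pass your own EC2 Instance ID.

# Print the decoded User Data script of an EC2 Instance.
# The API returns the value base64-encoded, so decode it here.
show_user_data() {
  aws ec2 describe-instance-attribute \
    --instance-id "$1" \
    --attribute userData \
    --query 'UserData.Value' \
    --output text | base64 --decode
}

# Print the system log (console output) of an EC2 Instance.
show_system_log() {
  aws ec2 get-console-output \
    --instance-id "$1" \
    --query 'Output' \
    --output text
}

# Read a log on stdin and print only lines that look like problems.
# The patterns are a rough heuristic, not an exhaustive list.
filter_log_problems() {
  grep -E -i 'failed|timed out|timeout|error|warning' || true
}

# Example usage (placeholder instance ID):
#   show_user_data i-0123456789abcdef0
#   show_system_log i-0123456789abcdef0 | filter_log_problems
```

The filter only narrows down where to look; a line it prints is not necessarily the cause of the registration failure.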

grm commented 7 years ago

Hello,

Yes, I am currently only experimenting, and as I don't want to spend too much money, I am spawning only one instance ;)

The user data seems fine to me:

#!/bin/bash
echo "ECS_CLUSTER=nextshop-cluster" >> /etc/ecs/ecs.config
echo "ECS_ENGINE_AUTH_TYPE=dockercfg" >> /etc/ecs/ecs.config
echo "{ auths: { https://index.docker.io/v1/: { auth: XXXXX } } }" >> /etc/ecs/ecs.config

The cluster is indeed called "nextshop-cluster".
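
The same comparison can be made on the instance itself by reading the value back from the config file. The sketch below is a hypothetical helper, not part of the repo, and assumes SSH access to the EC2 Instance; it only confirms what was written to `/etc/ecs/ecs.config`, not that the ECS agent actually registered.

```shell
#!/bin/bash
# Sketch only: the helper name is hypothetical. Intended to be run on the
# EC2 Instance, against the file the User Data script appends to.

# Print the last ECS_CLUSTER value set in the given config file.
# Later lines win because the User Data appends with ">>".
ecs_cluster_from_config() {
  sed -n 's/^ECS_CLUSTER=//p' "$1" | tail -n 1
}

# Example usage:
#   ecs_cluster_from_config /etc/ecs/ecs.config
#
# If the ECS agent is running, its introspection endpoint also reports
# the cluster it joined:
#   curl -s http://localhost:51678/v1/metadata
```
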

As for the logs, I don't see anything that could be wrong (except some errors, but they might not interfere with the process):

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct
[    0.000000] Linux version 4.4.23-31.54.amzn1.x86_64 (mockbuild@gobi-build-64012) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #1 SMP Tue Oct 18 22:02:09 UTC 2016
[    0.000000] Command line: root=LABEL=/ console=tty1 console=ttyS0 
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] x86/fpu: Using 'eager' FPU context switches.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000003fffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.4 present.
[    0.000000] Hypervisor detected: Xen
[    0.000000] Xen version 4.2.
[    0.000000] Netfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated NICs.
[    0.000000] Blkfront and the Xen platform PCI driver have been compiled for this kernel: unplug emulated disks.
[    0.000000] You might have to change the root device
[    0.000000] from /dev/hd[a-d] to /dev/xvd[a-d]
[    0.000000] in your root= kernel command line option
[    0.000000] e820: last_pfn = 0x40000 max_arch_pfn = 0x400000000
[    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[    0.000000] found SMP MP-table at [mem 0x000fbc80-0x000fbc8f] mapped at [ffff8800000fbc80]
[    0.000000] RAMDISK: [mem 0x36ff0000-0x37feffff]
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000000EA020 000024 (v02 Xen   )
[    0.000000] ACPI: XSDT 0x00000000FC00DDC0 000054 (v01 Xen    HVM      00000000 HVML 00000000)
[    0.000000] ACPI: FACP 0x00000000FC00DA80 0000F4 (v04 Xen    HVM      00000000 HVML 00000000)
[    0.000000] ACPI: DSDT 0x00000000FC001CE0 00BD19 (v02 Xen    HVM      00000000 INTL 20090123)
[    0.000000] ACPI: FACS 0x00000000FC001CA0 000040
[    0.000000] ACPI: FACS 0x00000000FC001CA0 000040
[    0.000000] ACPI: APIC 0x00000000FC00DB80 0000D8 (v02 Xen    HVM      00000000 HVML 00000000)
[    0.000000] ACPI: HPET 0x00000000FC00DCD0 000038 (v01 Xen    HVM      00000000 HVML 00000000)
[    0.000000] ACPI: WAET 0x00000000FC00DD10 000028 (v01 Xen    HVM      00000000 HVML 00000000)
[    0.000000] ACPI: SSDT 0x00000000FC00DD40 000031 (v02 Xen    HVM      00000000 INTL 20090123)
[    0.000000] ACPI: SSDT 0x00000000FC00DD80 000031 (v02 Xen    HVM      00000000 INTL 20090123)
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000] NODE_DATA(0) allocated [mem 0x3ffde000-0x3fffffff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x000000003fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009dfff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000003fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000003fffffff]
[    0.000000] ACPI: PM-Timer IO Port: 0xb008
[    0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-47
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 low level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 low level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 low level)
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000000] smpboot: Allowing 15 CPUs, 14 hotplug CPUs
[    0.000000] e820: [mem 0x40000000-0xfbffffff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on Xen HVM
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[    0.000000] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:15 nr_node_ids:1
[    0.000000] PERCPU: Embedded 32 pages/cpu @ffff88003fc00000 s91416 r8192 d31464 u131072
[    0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 257928
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: root=LABEL=/ console=tty1 console=ttyS0 
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Memory: 998964K/1048180K available (5004K kernel code, 985K rwdata, 2432K rodata, 1188K init, 1580K bss, 49216K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=15, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000]  Build-time adjustment of leaf fanout to 64.
[    0.000000]  RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=15.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=15
[    0.000000] NR_IRQS:8448 nr_irqs:952 16
[    0.000000] xen:events: Using 2-level ABI
[    0.000000] xen:events: Xen HVM callback vector for event delivery is enabled
[    0.000000] Console: colour VGA+ 80x25
[    0.000000] console [tty1] enabled
[    0.000000] Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
[    0.000000] console [ttyS0] enabled
[    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 30580167144 ns
[    0.000000] tsc: Detected 2400.074 MHz processor
[    0.008000] Calibrating delay loop (skipped), value calculated using timer frequency.. 4800.14 BogoMIPS (lpj=9600296)
[    0.014022] pid_max: default: 32768 minimum: 301
[    0.016012] ACPI: Core revision 20150930
[    0.025927] ACPI: 3 ACPI AML tables successfully acquired and loaded
[    0.028833] Security Framework initialized
[    0.032088] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.036186] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.040088] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.044005] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.048180] Initializing cgroup subsys io
[    0.052007] Initializing cgroup subsys memory
[    0.056011] Initializing cgroup subsys devices
[    0.059202] Initializing cgroup subsys freezer
[    0.060005] Initializing cgroup subsys net_cls
[    0.064003] Initializing cgroup subsys perf_event
[    0.068004] Initializing cgroup subsys net_prio
[    0.070961] Initializing cgroup subsys hugetlb
[    0.072004] Initializing cgroup subsys pids
[    0.076060] CPU: Physical Processor ID: 0
[    0.080737] mce: CPU supports 2 MCE banks
[    0.084023] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
[    0.087357] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
[    0.102744] ftrace: allocating 20564 entries in 81 pages
[    0.120647] x2apic: IRQ remapping doesn't support X2APIC mode
[    0.124003] Switched APIC routing to physical flat.
[    0.130032] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
[    0.174257] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.176012] installing Xen timer for CPU 0
[    0.180046] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz (family: 0x6, model: 0x3f, stepping: 0x2)
[    0.185446] cpu 0 spinlock event irq 53
[    0.187824] Performance Events: unsupported p6 CPU model 63 no PMU driver, software events only.
[    0.192029] x86: Booted up 1 node, 1 CPUs
[    0.194277] smpboot: Total of 1 processors activated (4800.14 BogoMIPS)
[    0.196328] devtmpfs: initialized
[    0.199884] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.200203] NET: Registered protocol family 16
[    0.202762] cpuidle: using governor ladder
[    0.204011] cpuidle: using governor menu
[    0.206367] ACPI: bus type PCI registered
[    0.208005] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.211738] PCI: Using configuration type 1 for base access
[    0.213657] ACPI: Added _OSI(Module Device)
[    0.216008] ACPI: Added _OSI(Processor Device)
[    0.218633] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.220003] ACPI: Added _OSI(Processor Aggregator Device)
[    0.226206] ACPI: Interpreter enabled
[    0.228007] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)
[    0.232004] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)
[    0.236014] ACPI: (supports S0 S3 S5)
[    0.238241] ACPI: Using IOAPIC for interrupt routing
[    0.240027] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.281206] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.284010] acpi PNP0A03:00: _OSC: OS supports [Segments MSI]
[    0.287096] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[    0.288012] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[    0.292813] acpiphp: Slot [0] registered
[    0.295712] acpiphp: Slot [3] registered
[    0.296222] acpiphp: Slot [4] registered
[    0.298765] acpiphp: Slot [5] registered
[    0.300259] acpiphp: Slot [6] registered
[    0.302652] acpiphp: Slot [7] registered
[    0.304216] acpiphp: Slot [8] registered
[    0.308225] acpiphp: Slot [9] registered
[    0.310722] acpiphp: Slot [10] registered
[    0.312308] acpiphp: Slot [11] registered
[    0.314824] acpiphp: Slot [12] registered
[    0.316224] acpiphp: Slot [13] registered
[    0.318807] acpiphp: Slot [14] registered
[    0.320219] acpiphp: Slot [15] registered
[    0.322674] acpiphp: Slot [16] registered
[    0.324234] acpiphp: Slot [17] registered
[    0.326934] acpiphp: Slot [18] registered
[    0.328229] acpiphp: Slot [19] registered
[    0.330763] acpiphp: Slot [20] registered
[    0.332223] acpiphp: Slot [21] registered
[    0.334718] acpiphp: Slot [22] registered
[    0.336218] acpiphp: Slot [23] registered
[    0.338643] acpiphp: Slot [24] registered
[    0.340221] acpiphp: Slot [25] registered
[    0.342714] acpiphp: Slot [26] registered
[    0.344225] acpiphp: Slot [27] registered
[    0.346930] acpiphp: Slot [28] registered
[    0.348220] acpiphp: Slot [29] registered
[    0.350710] acpiphp: Slot [30] registered
[    0.352219] acpiphp: Slot [31] registered
[    0.354681] PCI host bridge to bus 0000:00
[    0.356009] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.359512] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.360008] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.364006] pci_bus 0000:00: root bus resource [mem 0xf0000000-0xfbffffff window]
[    0.368011] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.375672] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.376008] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.379386] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.380009] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.383879] * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
[    0.383879] * this clock source is slow. Consider trying other clock sources
[    0.385211] pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4 ACPI
[    0.396216] ACPI: PCI Interrupt Link [LNKA] (IRQs *5 10 11)
[    0.404256] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[    0.410474] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[    0.415337] ACPI: PCI Interrupt Link [LNKD] (IRQs *5 10 11)
[    0.436240] ACPI: Enabled 2 GPEs in block 00 to 0F
[    0.440062] xen:balloon: Initialising balloon driver
[    0.448321] vgaarb: setting as boot device: PCI:0000:00:02.0
[    0.452000] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
[    0.452022] vgaarb: loaded
[    0.454531] vgaarb: bridge control possible 0000:00:02.0
[    0.456828] PCI: Using ACPI for IRQ routing
[    0.460533] NetLabel: Initializing
[    0.463368] NetLabel:  domain hash size = 128
[    0.464005] NetLabel:  protocols = UNLABELED CIPSOv4
[    0.467383] NetLabel:  unlabeled traffic allowed by default
[    0.468038] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[    0.472018] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    0.477589] hpet0: 3 comparators, 64-bit 62.500000 MHz counter
[    0.483171] amd_nb: Cannot enumerate AMD northbridges
[    0.484100] clocksource: Switched to clocksource xen
[    0.495944] pnp: PnP ACPI init
[    0.498640] system 00:00: [mem 0x00000000-0x0009ffff] could not be reserved
[    0.503010] system 00:01: [io  0x08a0-0x08a3] has been reserved
[    0.506697] system 00:01: [io  0x0cc0-0x0ccf] has been reserved
[    0.515935] system 00:01: [io  0x04d0-0x04d1] has been reserved
[    0.519966] system 00:07: [io  0x10c0-0x1141] has been reserved
[    0.523643] system 00:07: [io  0xb044-0xb047] has been reserved
[    0.544452] pnp: PnP ACPI: found 8 devices
[    0.553656] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.560007] NET: Registered protocol family 2
[    0.563015] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[    0.567169] TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
[    0.571168] TCP: Hash tables configured (established 8192 bind 8192)
[    0.575002] UDP hash table entries: 512 (order: 2, 16384 bytes)
[    0.578666] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
[    0.582506] NET: Registered protocol family 1
[    0.585339] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[    0.589172] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[    0.593108] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[    0.597877] Unpacking initramfs...
[    0.841454] Freeing initrd memory: 16384K (ffff880036ff0000 - ffff880037ff0000)
[    0.846312] RAPL PMU detected, API unit is 2^-32 Joules, 3 fixed counters 655360 ms ovfl timer
[    0.850828] hw unit of domain pp0-core 2^-14 Joules
[    0.853792] hw unit of domain package 2^-14 Joules
[    0.857974] hw unit of domain dram 2^-16 Joules
[    0.861852] futex hash table entries: 4096 (order: 6, 262144 bytes)
[    0.866694] audit: initializing netlink subsys (disabled)
[    0.871163] audit: type=2000 audit(1477782599.969:1): initialized
[    0.875680] Initialise system trusted keyring
[    0.880636] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.885792] VFS: Disk quotas dquot_6.6.0
[    0.888421] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.893092] Key type asymmetric registered
[    0.896303] Asymmetric key parser 'x509' registered
[    0.899347] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
[    0.904199] io scheduler noop registered (default)
[    0.907393] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[    0.911100] xen:grant_table: Grant tables using version 1 layout
[    0.914930] Grant table initialized
[    0.917528] Cannot get hvm parameter CONSOLE_EVTCHN (18): -22!
[    0.920922] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.950850] 00:06: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    0.955717] xen_netfront: Initialising Xen virtual ethernet driver
[    0.959941] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
[    0.965837] serio: i8042 KBD port at 0x60,0x64 irq 1
[    0.968822] serio: i8042 AUX port at 0x60,0x64 irq 12
[    0.971861] hidraw: raw HID events driver (C) Jiri Kosina
[    0.975810] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    0.982026] NET: Registered protocol family 17
[    0.984862] registered taskstats version 1
[    0.987596] Loading compiled-in X.509 certificates
[    0.990841] Loaded X.509 cert 'Build time autogenerated kernel key: 054f0c1b128bf74c3f19178fbc44f02713572a16'
[    0.997203] zswap: default zpool zbud not available
[    1.001259] zswap: pool creation failed
[    1.220869] blkfront: xvda: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[    1.280109]  xvda: xvda1
[    1.350427] blkfront: xvdcz: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[    1.405667] Freeing unused kernel memory: 1188K (ffffffff81af8000 - ffffffff81c21000)
[    1.412272] Write protecting the kernel read-only data: 10240k
[    1.416606] Freeing unused kernel memory: 1128K (ffff8800014e6000 - ffff880001600000)
[    1.426591] Freeing unused kernel memory: 1664K (ffff880001860000 - ffff880001a00000)
[    1.451952] dm_mod: module verification failed: signature and/or required key missing - tainting kernel
[    1.465727] device-mapper: uevent: version 1.0.3
[    1.472325] device-mapper: ioctl: 4.34.0-ioctl (2015-10-28) initialised: dm-devel@redhat.com
/init: 140: cannot create /sys/class/firmware/timeout: Directory nonexistent
[    1.495548] udevd[792]: starting version 173
[    1.523589] SCSI subsystem initialized
[    1.598865] scsi host0: ata_piix
[    1.628035] scsi host1: ata_piix
[    1.631547] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc100 irq 14
[    1.637085] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc108 irq 15
growroot: NOCHANGE: disk=/dev/xvda partition=1: size=16773086, it cannot be grown
[    1.860086] tsc: Refined TSC clocksource calibration: 2400.001 MHz
[    1.864817] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x22983858435, max_idle_ns: 440795258295 ns
[    1.875742] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
[    2.152080] dracut: Remounting /dev/disk/by-label/\x2f with -o noatime,ro
[    2.186004] EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts: (null)
[    2.222872] dracut: Mounted root filesystem /dev/xvda1
[    2.853733] dracut: Loading SELinux policy
[    4.178669] random: nonblocking pool is initialized
[    4.362090] dracut: /sbin/load_policy: Can't load policy: No such device
[    4.422147] dracut: Switching root
image_name="amzn-ami-ecs-hvm"
image_version="2016.09"
image_arch="x86_64"
image_file="amzn-ami-ecs-hvm-2016.09.a.x86_64.ext4.gpt"
image_stamp="9f44-4a8d"
image_date="20161020002636"
recipe_name="amzn ami"
recipe_id="0204e849-ebe2-8eb8-192b-9cdc-d93e-661d-b2475c18"
        Welcome to Amazon Linux AMI
Starting udev: [    9.180100] udevd[1518]: starting version 173
[    9.842381] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[    9.849674] ACPI: Power Button [PWRF]
[    9.853856] input: Sleep Button as /devices/LNXSYSTM:00/LNXSLPBN:00/input/input4
[    9.861304] ACPI: Sleep Button [SLPF]
[    9.881778] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input5
[    9.906662] mousedev: PS/2 mouse device common for all mice
[  OK  ]

Setting hostname localhost.localdomain:  [  OK  ]

Setting up Logical Volume Management: [  OK  ]

Checking filesystems
Checking all file systems.
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/xvda1 
/: clean, 17141/524288 files, 226156/2096635 blocks
[  OK  ]

Remounting root filesystem in read-write mode:  [   11.993193] EXT4-fs (xvda1): re-mounted. Opts: (null)
[  OK  ]

Mounting local filesystems:  [  OK  ]

Enabling /etc/fstab swaps:  [  OK  ]

Entering non-interactive startup
[   13.564634] NET: Registered protocol family 10
Bringing up loopback interface:  [  OK  ]

Bringing up interface eth0:  
Determining IP information for eth0... done.

Determining IPv6 information for eth0... done.
[  OK  ]

Starting auditd: [   15.895895] audit: type=1305 audit(1477782614.993:2): audit_pid=2163 old=0 auid=4294967295 ses=4294967295 res=1
[  OK  ]

Starting system logger: [  OK  ]

Starting rngd: [  OK  ]

Mounting filesystems:  [  OK  ]

Retrigger failed udev events--type=failed is deprecated and will be removed from a future udev release.
[  OK  ]

Starting cloud-init: Cloud-init v. 0.7.6 running 'init-local' at Sat, 29 Oct 2016 23:10:20 +0000. Up 21.69 seconds.
Starting cloud-init: Cloud-init v. 0.7.6 running 'init' at Sat, 29 Oct 2016 23:10:21 +0000. Up 22.50 seconds.
ci-info: ++++++++++++++++++++++Net device info+++++++++++++++++++++++
ci-info:  Device   Up    Address         Mask          Hw-Address    
ci-info:    lo    True  127.0.0.1     255.0.0.0            .         
ci-info:   eth0   True  200.0.1.81  255.255.255.0  02:4f:1a:f4:0a:3f 
ci-info: ++++++++++++++++++++++++++++++Route info++++++++++++++++++++++++++++++
ci-info:  Route    Destination     Gateway       Genmask      Interface  Flags 
ci-info:    0        0.0.0.0      200.0.1.1      0.0.0.0         eth0      UG  
ci-info:    1    169.254.169.254   0.0.0.0   255.255.255.255     eth0      UH  
ci-info:    2       200.0.1.0      0.0.0.0    255.255.255.0      eth0      U   
INFO: Volume group backing root filesystem could not be determined
File descriptor 6 (/var/log/cloud-init.log) leaked on vgs invocation. Parent PID 2258: /bin/bash
Checking that no-one is using this disk right now ...
OK
sfdisk:  /dev/xvdcz: unrecognized partition table type

sfdisk: No partitions found

Disk /dev/xvdcz: 2871 cylinders, 255 heads, 63 sectors/track
Old situation:
New situation:
Units: sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/xvdcz1          2048  46137343   46135296  8e  Linux LVM
/dev/xvdcz2             0         -          0   0  Empty
/dev/xvdcz3             0         -          0   0  Empty
/dev/xvdcz4             0         -          0   0  Empty
Warning: partition 1 does not end at a cylinder boundary
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table

[   23.744505]  xvdcz: xvdcz1
Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
File descriptor 6 (/var/log/cloud-init.log) leaked on pvcreate invocation. Parent PID 2258: /bin/bash
  Physical volume "/dev/xvdcz1" successfully created
File descriptor 6 (/var/log/cloud-init.log) leaked on vgcreate invocation. Parent PID 2258: /bin/bash
  Volume group "docker" successfully created
File descriptor 6 (/var/log/cloud-init.log) leaked on lvs invocation. Parent PID 2303: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on vgs invocation. Parent PID 2258: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on lvcreate invocation. Parent PID 2258: /bin/bash
  Rounding up size to full physical extent 24.00 MiB
  Logical volume "docker-poolmeta" created.
File descriptor 6 (/var/log/cloud-init.log) leaked on vgs invocation. Parent PID 2258: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on vgs invocation. Parent PID 2326: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on lvcreate invocation. Parent PID 2258: /bin/bash
  Logical volume "docker-pool" created.
File descriptor 6 (/var/log/cloud-init.log) leaked on lvconvert invocation. Parent PID 2258: /bin/bash
  WARNING: Converting logical volume docker/docker-pool and docker/docker-poolmeta to pool's data and metadata volumes.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
[   26.038587] device-mapper: thin: Data device (dm-1) discard unsupported: Disabling discard passdown.
  Converted docker/docker-pool to thin pool.
File descriptor 6 (/var/log/cloud-init.log) leaked on lvchange invocation. Parent PID 2258: /bin/bash
[   26.297412] device-mapper: thin: Data device (dm-1) discard unsupported: Disabling discard passdown.
  Logical volume "docker-pool" changed.
File descriptor 6 (/var/log/cloud-init.log) leaked on lvs invocation. Parent PID 2412: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on lvm invocation. Parent PID 2454: /bin/bash
File descriptor 6 (/var/log/cloud-init.log) leaked on lvchange invocation. Parent PID 2258: /bin/bash
  Logical volume "docker-pool" changed.
Generating public/private rsa key pair.
Your identification has been saved in /etc/ssh/ssh_host_rsa_key.
Your public key has been saved in /etc/ssh/ssh_host_rsa_key.pub.
***Removed key generation***
Starting cloud-init: Cloud-init v. 0.7.6 running 'modules:config' at Sat, 29 Oct 2016 23:10:31 +0000. Up 32.44 seconds.
Loaded plugins: priorities, update-motd, upgrade-helper

 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Disable the repository, so yum won't use it by default. Yum will then
        just ignore the repository until you permanently enable it again or use
        --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>

     4. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn-main/latest
Could not retrieve mirrorlist http://repo.eu-west-1.amazonaws.com/latest/main/mirror.list error was
12: Timeout on http://repo.eu-west-1.amazonaws.com/latest/main/mirror.list: (28, 'Connection timed out after 10001 milliseconds')
Oct 29 23:10:48 cloud-init[2528]: util.py[WARNING]: Package upgrade failed
Oct 29 23:10:48 cloud-init[2528]: cc_package_update_upgrade_install.py[WARNING]: 1 failed with exceptions, re-raising the last one
Oct 29 23:10:48 cloud-init[2528]: util.py[WARNING]: Running module package-update-upgrade-install (<module 'cloudinit.config.cc_package_update_upgrade_install' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_package_update_upgrade_install.pyc'>) failed
Generating SSH2 ED25519 host key: [  OK  ]

Starting sshd: [  OK  ]

ntpdate: Synchronizing with time server: [FAILED]

Starting ntpd: [  OK  ]

Starting sendmail: [  OK  ]

Starting sm-client: [  OK  ]

Starting crond: [  OK  ]

Starting cgconfig service: [  OK  ]

Starting docker:    .[  OK  ]

Starting cloud-init: Cloud-init v. 0.7.6 running 'modules:final' at Sat, 29 Oct 2016 23:11:34 +0000. Up 94.94 seconds.
***Removed SSH KEYS***
Cloud-init v. 0.7.6 finished at Sat, 29 Oct 2016 23:11:34 +0000. Datasource DataSourceEc2.  Up 95.09 seconds

Amazon Linux AMI release 2016.09
Kernel 4.4.23-31.54.amzn1.x86_64 on an x86_64

ip-200-0-1-81 login: [  103.651807] bridge: automatic filtering via arp/ip/ip6tables has been deprecated. Update your scripts to load br_netfilter if you need this.
[  103.659588] Bridge firewalling registered
[  103.751032] nf_conntrack version 0.5.0 (7963 buckets, 31852 max)
[  105.219353] ip_tables: (C) 2000-2006 Netfilter Core Team
[  105.412265] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[  107.346271] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  107.434968] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  107.678261] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  107.758828] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  107.974748] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  108.054801] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  108.117794] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  110.144071] cgroup: docker-runc (2946) created nested cgroup for controller "memory" which has incomplete hierarchy support. Nested cgroups may change behavior in the future.
[  110.165931] cgroup: "memory" requires setting use_hierarchy to 1 on the root
[  130.702565] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  130.779325] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  130.830256] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  152.185866] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  152.267865] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  152.330366] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  172.973130] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  173.050653] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[  173.102037] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16668.359343] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16668.437616] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16668.490141] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16688.935697] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16689.017478] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16689.074235] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16709.584907] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16709.661713] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16709.720292] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16730.169752] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16730.237310] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16730.310058] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16750.847298] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16750.917911] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16750.970554] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16771.435976] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16771.503553] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16771.560675] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16792.038840] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16792.113931] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16792.165412] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16812.686028] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16812.773236] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16812.829217] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16833.255573] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16833.340099] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16833.411080] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16853.936477] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16854.012866] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16854.065137] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16874.519482] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[16874.581568] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)

I don't see the Docker image being pulled anywhere in this log. Maybe this is related?

Thanks again for your time and support, Grm

brikis98 commented 7 years ago

There are some odd errors about yum in that syslog. Do you have any custom network ACLs or security groups set up?

To troubleshoot further, you can SSH to the EC2 Instance and look at the ECS Logs.

grm commented 7 years ago

I have no default ACL or security group. I created the VPC with Terraform rather than through the web interface. I can give you the Terraform file I used to create the VPC.

How can I connect to the private instance? My EC2 instance only has a private IP.

brikis98 commented 7 years ago

Does your VPC have route tables? Subnets? If not, then perhaps this is failing because nothing can be routed within that VPC.

As a test, could you try deploying this ECS cluster in your default VPC instead?

grm commented 7 years ago

Yes, my VPC has a route table and subnets. I tried to keep it simple to start with and then see how to evolve it.

Here is the file I use to create it:

# Define a vpc
resource "aws_vpc" "simple_VPC" {
  cidr_block = "200.0.0.0/23"
  tags {
    Name = "simple-VPC"
  }
}

# Internet gateway for the public subnet
resource "aws_internet_gateway" "simple_IG" {
  vpc_id = "${aws_vpc.simple_VPC.id}"
  tags {
    Name = "simple_IG"
  }
}

# Public subnet
resource "aws_subnet" "simple_public_subnet" {
  vpc_id = "${aws_vpc.simple_VPC.id}"
  cidr_block = "200.0.0.0/24"
  availability_zone = "${var.zone}"
  tags {
    Name = "simple_public_subnet"
  }
}

# Routing table for public subnet
resource "aws_route_table" "simple_routing_table" {
  vpc_id = "${aws_vpc.simple_VPC.id}"
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.simple_IG.id}"
  }
  tags {
    Name = "simple-routing-table"
  }
}

# Associate the routing table to public subnet
resource "aws_route_table_association" "simple_public_subnet_route_association" {
  subnet_id = "${aws_subnet.simple_public_subnet.id}"
  route_table_id = "${aws_route_table.simple_routing_table.id}"
}

# Private subnet
resource "aws_subnet" "simple_private_subnet" {
  vpc_id = "${aws_vpc.simple_VPC.id}"
  cidr_block = "200.0.1.0/24"
  availability_zone = "${var.zone}"
  tags {
    Name = "simple_private_subnet"
  }
}

# Associate the routing table to private subnet
resource "aws_route_table_association" "simple_private_subnet_route_association" {
  subnet_id = "${aws_subnet.simple_private_subnet.id}"
  route_table_id = "${aws_route_table.simple_routing_table.id}"
}

Maybe I did something wrong. My EC2 instances are in the private subnet, and my ELB is in the public one.

In the meantime, I will try replacing my VPC with the default one and see if it changes anything.

grm commented 7 years ago

I may have an issue with my VPC. Everything works with the default VPC, but then my container is exposed to the Internet with a public IP.

Do you have any idea what is wrong with my VPC configuration?

brikis98 commented 7 years ago

I'm not sure. One issue is that if you want a private subnet, you should be using a NAT Gateway and not an Internet Gateway (see here for more info).
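
As a rough sketch of the NAT Gateway suggestion, and not a tested fix: instances that only have a private IP cannot reach the Internet through an Internet Gateway, which would be consistent with the yum mirror timeout in the log. The private subnet could get its own routing table whose default route points at a NAT Gateway living in the public subnet. The names `simple_nat_eip`, `simple_NAT` and `simple_private_routing_table` are made up for illustration; everything else refers to resources from the file above, and the syntax follows the older Terraform style used there.

```hcl
# Hypothetical additions to the file above; the resource names simple_nat_eip,
# simple_NAT and simple_private_routing_table are illustrative assumptions.

# Elastic IP that the NAT Gateway will use as its public address
resource "aws_eip" "simple_nat_eip" {
  vpc = true
}

# NAT Gateway, placed in the PUBLIC subnet so its traffic can leave
# through the Internet Gateway
resource "aws_nat_gateway" "simple_NAT" {
  allocation_id = "${aws_eip.simple_nat_eip.id}"
  subnet_id = "${aws_subnet.simple_public_subnet.id}"
  depends_on = ["aws_internet_gateway.simple_IG"]
}

# Separate routing table for the private subnet: the default route goes
# through the NAT Gateway instead of the Internet Gateway
resource "aws_route_table" "simple_private_routing_table" {
  vpc_id = "${aws_vpc.simple_VPC.id}"
  route {
    cidr_block = "0.0.0.0/0"
    nat_gateway_id = "${aws_nat_gateway.simple_NAT.id}"
  }
  tags {
    Name = "simple-private-routing-table"
  }
}

# Replaces the existing simple_private_subnet_route_association, which
# pointed the private subnet at the public routing table
resource "aws_route_table_association" "simple_private_subnet_route_association" {
  subnet_id = "${aws_subnet.simple_private_subnet.id}"
  route_table_id = "${aws_route_table.simple_private_routing_table.id}"
}
```

With something like this in place, the public subnet keeps its existing route to the Internet Gateway, while instances in the private subnet get outbound-only access and stay unreachable from the Internet.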

Getting route tables, subnets, ACLs, gateways, etc. correct is tricky. We created pre-assembled Infrastructure Packages, including a VPC package (you can see the public docs here), precisely so everyone wouldn't have to figure out all of these pieces from scratch :)

Since this seems to be an issue with VPC configuration, and the ECS stuff works in the default VPC, I'm going to close the bug.