vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

AWS ECS documentation #4105

Open binarylogic opened 4 years ago

binarylogic commented 4 years ago

I'm opening this issue to serve as the single, final place for AWS ECS documentation. This issue will be used to build out the website, docs, and marketing pages.

binarylogic commented 4 years ago

@ktff just pinging you on this. If you could provide setup documentation I can start to work on this. Just a simple step-by-step instruction like you did for Kubernetes.

ktff commented 4 years ago

For ECS, the simplest and most general deployment that we currently support is a sidecar container with the splunk_hec source.


Deployment guide for Fargate

This is a guide on how to collect logs from the containers in a single AWS ECS task on Fargate. This is achieved by adding Vector as a container to your task definition and redirecting the logs that Docker collects from your containers to Vector. To do that, we transport the logs using the Splunk HEC protocol over the loopback network interface (localhost).

We can do that by adding configuration in two places:

  1. In the Vector configuration file, add a splunk_hec source. This source will receive logs from the containers in the same task. So you will have:

    [sources.my_source_id]
    type = "splunk_hec"

  2. In the ECS task definition, two things need to be configured:

      • All your application containers should have:

        • logConfiguration parameter with the following content:

          • logDriver should be splunk.
          • options should have at least the following content:

            • splunk-url should be http://0.0.0.0:8088
            • splunk-token can have any value.

          This will configure your containers to use Docker's splunk log driver, which will send each container's logs to the Vector container.

        • dependsOn parameter with the following content:

          • containerName should be vector.
          • condition should be HEALTHY.

          This will postpone starting your containers until Vector is ready to accept logs.

        As JSON this would look like:

          "logConfiguration": {
            "logDriver": "splunk",
            "options": {
              "splunk-url": "http://0.0.0.0:8088",
              "splunk-token": ""
            }
          },
          "dependsOn": [
            {
              "containerName": "vector",
              "condition": "HEALTHY"
            }
          ]

      • The container with Vector should have:

        • name should be vector.
        • healthCheck parameter with the following content:

          • command array should have two items in the given order:
            • CMD-SHELL
            • curl -f http://0.0.0.0:8088/services/collector/health || exit 1

          This command will check whether the splunk_hec source is running.

        As JSON this would look like:

          "name": "vector",
          "healthCheck": {
            "command": [
              "CMD-SHELL",
              "curl -f http://0.0.0.0:8088/services/collector/health || exit 1"
            ]
          }

That's all of the necessary configuration.

All that remains is to deploy your task however you see fit.
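
The fragments in this guide can be combined into one containerDefinitions list. The following is a minimal Python sketch, not an official tool. The app container name, the my-app:latest image, and the any-value token are hypothetical placeholders, the Vector image tag is the one mentioned later in this thread, and the health check assumes that curl is available inside the Vector image.

```python
import json

# Port 8088 is the splunk_hec default named in this guide.
HEC_URL = "http://0.0.0.0:8088"


def app_container(name, image):
    """Application container that ships its logs to the Vector sidecar."""
    return {
        "name": name,
        "image": image,
        "logConfiguration": {
            "logDriver": "splunk",
            # The guide says the token can have any value; this one is a placeholder.
            "options": {"splunk-url": HEC_URL, "splunk-token": "any-value"},
        },
        # Wait until the Vector container reports HEALTHY before starting.
        "dependsOn": [{"containerName": "vector", "condition": "HEALTHY"}],
    }


def vector_container(image):
    """Vector sidecar with the health check described in this guide."""
    return {
        "name": "vector",
        "image": image,
        "healthCheck": {
            "command": [
                "CMD-SHELL",
                # Assumption: curl exists in the Vector image.
                f"curl -f {HEC_URL}/services/collector/health || exit 1",
            ]
        },
    }


def container_definitions():
    """Hypothetical two-container task: Vector plus one application container."""
    return [
        vector_container("timberio/vector:0.10.0-debian"),
        app_container("app", "my-app:latest"),
    ]


if __name__ == "__main__":
    print(json.dumps({"containerDefinitions": container_definitions()}, indent=2))
```

Running the script prints a JSON object whose containerDefinitions value can be merged into a task definition. Fields such as memory, cpu, and essential are left out on purpose and still have to be added for a real task.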


Other sections

Memory and CPU

Vector's memory and processor usage, along with recommended limits, can be found at https://vector.dev/docs/setup/deployment/roles/agent#system-configuration.

Passing Vector configuration file

The configuration file needs to be accessible to Vector. One of the easier ways to achieve that is to build an image containing Vector with its configuration file at the default config path.

Debugging

For debugging purposes, one option is to send Vector's own logs to CloudWatch Logs with the awslogs logDriver. With that, you will be able to debug configuration errors.
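
As a rough illustration of the debugging setup, the sketch below builds a logConfiguration for the Vector container that uses the awslogs driver. The log group /ecs/vector, the region us-east-1, and the vector stream prefix are hypothetical values, and the log group is assumed to exist already.

```python
def vector_awslogs_config(group, region, stream_prefix="vector"):
    """logConfiguration for the Vector container itself, using awslogs."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": group,  # assumed to be an existing CloudWatch log group
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


# Hypothetical values, for illustration only.
debug_log_configuration = vector_awslogs_config("/ecs/vector", "us-east-1")
```

The resulting dict goes under the logConfiguration key of the vector container only. The application containers keep the splunk driver so that their logs still reach Vector.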

ktff commented 4 years ago

@binarylogic This guide covers both EC2 and Fargate. It focuses only on what needs to be configured, since there are two main ways of deploying on ECS: through the console and through the website. Each of them could be covered in a separate guide or section. This guide is their shared part, and should be enough for those who already know how to deploy containers on ECS.

The guide requires timberio/vector#1784.

binarylogic commented 4 years ago

Thanks! At first glance, this looks good.

ktff commented 4 years ago

Container dependency configuration has been added.


This guide covers both EC2 and Fargate

Scratch that. The guides could be simpler if they are specialized, so the original one is for Fargate, and I'll add a separate one for EC2.

LucioFranco commented 4 years ago

@ktff does vector get deployed as a sort of daemonset? How do we know that it will always exist on 0.0.0.0:8080?

ktff commented 4 years ago

@LucioFranco

does vector get deployed as a sort of daemonset?

Vector is deployed as a regular container.

How do we know that it will always exist on 0.0.0.0:8080?

User containers are configured to wait for splunk_hec source to become available on 0.0.0.0:8080 before they start running.

awangc commented 3 years ago

@ktff Is there any networking mode that needs to be configured? I tested the setup you described (just used port 8080 instead of 8088), and I am seeing:

ResourceInitializationError: failed to validate logger args: Options http://0.0.0.0:8080/services/collector/event/1.0: dial tcp 0.0.0.0:8080: connect: connection refused : exit status 1

I am using timberio/vector:0.10.0-debian as vector image

ktff commented 3 years ago

@awangc did you add

  address = "0.0.0.0:8080"

to the configuration of splunk_hec? If that address isn't specified, the source will use port 8088 by default.
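
Since a mismatch between the source port and the log driver URL produces exactly this kind of connection error, a small check can catch it before deploying. This is only a sketch: the helper names are made up for illustration, and the source is assumed to be given as a plain dict mirroring its table in the Vector configuration file.

```python
from urllib.parse import urlparse

# Default port of the splunk_hec source when `address` is omitted, as stated above.
DEFAULT_HEC_PORT = 8088


def source_port(source):
    """Port the splunk_hec source listens on, given its config table as a dict."""
    address = source.get("address")
    if address is None:
        return DEFAULT_HEC_PORT
    # `address` has the form "host:port"; take the part after the last colon.
    return int(address.rsplit(":", 1)[1])


def ports_match(source, splunk_url):
    """True when the log driver's splunk-url targets the port of the source."""
    return urlparse(splunk_url).port == source_port(source)
```

For example, ports_match({"type": "splunk_hec"}, "http://0.0.0.0:8080") is False, because the source falls back to port 8088 when no address is set.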

Regarding networking mode, on Fargate the only supported mode is awsvpc, so the guide assumes that. EC2 is a slightly different story.

For EC2 with a different networking mode, you will need to ensure that the port of the Vector container is accessible from your containers, which may require changing the contents of Vector's portMappings. The container would then end up with something like this:

"portMappings": [
  {
    "hostPort": 8080,
    "protocol": "tcp",
    "containerPort": 8080
  }
]

awangc commented 3 years ago

@ktff Yes, I have that part in the vector.toml file:

[sources.app_log]
  type = "splunk_hec"
  address = "0.0.0.0:8080"
  token = "aabbccddeeff"

Also I have EXPOSE 8080 in my Dockerfile and I'm testing in Fargate, thanks

awangc commented 3 years ago

@ktff I found out the setup does not work for Fargate platform 1.4 (which is the one I had been trying) but works for LATEST (1.3 if I'm not mistaken). Possible reason is that container start order is not being respected https://github.com/aws/containers-roadmap/issues/849 ?

ktff commented 3 years ago

Possible reason is that container start order is not being respected

@awangc Yes, that explains the error message. The app container's logger tries to establish a connection before the app container starts, and if the Vector container isn't running at that time, nothing is listening. It also seems that it tries to connect only once, at start time. In that case there are two options:

leobudima commented 3 years ago

@ktff I found out the setup does not work for Fargate platform 1.4 (which is the one I had been trying) but works for LATEST (1.3 if I'm not mistaken). Possible reason is that container start order is not being respected aws/containers-roadmap#849 ?

Thank you for your insight, I have the same issue - did you manage to work around this on 1.4?

gtorre commented 2 years ago

Any updates here? Thanks!

thedannywilcox commented 2 years ago

Found this coming from the internet while working on getting it set up. Not sure if everything works, but I was able to get Vector set up on the LATEST Fargate (1.4) without issue using the file source and a CloudWatch sink.

edeno1 commented 2 years ago

When I try to set this up, the container stops working.

madan-wego commented 2 years ago

Adding "splunk-verify-connection": "false" made it work for me on AWS Fargate and EC2.

"logConfiguration": {
  "logDriver": "splunk",
  "options": {
    "splunk-url": "http://0.0.0.0:8088",
    "splunk-verify-connection": "false",
    "splunk-token": "abc1234567890"
  }
}