aws / aws-for-fluent-bit

The source of the amazon/aws-for-fluent-bit container image

Fluent bit sidecar container unable to push logs to cloudwatch in EKS Fargate #251

Open nitin194 opened 3 years ago

nitin194 commented 3 years ago
### Describe the question/issue

Fluent bit sidecar container unable to push logs to the cloudwatch in EKS Fargate. Desired policy has been attached to the Fargate pod execution role.

### Configuration

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: eretail
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        # HTTP_Server   On
        # HTTP_Listen   0.0.0.0
        # HTTP_Port     2020

    @INCLUDE application-log.conf

  application-log.conf: |
    [INPUT]
        Name              tail
        Path              /logs/boot/*.log
        Tag               boot.*
        Parser            docker
        Mem_Buf_Limit     5MB
        Refresh_Interval  10

    [INPUT]
        Name              tail
        Path              /logs/access/*.log
        Tag               access.*
        Parser            docker
        Mem_Buf_Limit     5MB
        Refresh_Interval  10

    [OUTPUT]
        Name               cloudwatch
        Match              *boot*
        region             ap-southeast-1
        log_group_name     eks-fluent-bit
        log_stream_prefix  ${HOSTNAME}-boot-log-
        auto_create_group  true
        # workers 1

    [OUTPUT]
        Name               cloudwatch
        Match              *access*
        region             ap-southeast-1
        log_group_name     eks-fluent-bit
        log_stream_prefix  ${HOSTNAME}-access-log-
        auto_create_group  true
        # workers 1

  parsers.conf: |
    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%LZv
```

### Fluent Bit Log Output

Fluent Bit v1.7.4
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/10/04 10:36:01] [ info] [engine] started (pid=1)
[2021/10/04 10:36:01] [ info] [storage] version=1.1.1, initializing...
[2021/10/04 10:36:01] [ info] [storage] in-memory
[2021/10/04 10:36:01] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_group_name = 'eks-fluent-bit'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter default_log_group_name = 'fluentbit-default'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ip-17-225-20-45.ap-southeast-1.compute.internal-boot-log-'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_name = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter default_log_stream_name = '/fluentbit-default'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter region = 'ap-southeast-1'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_key = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'true'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter new_log_group_tags = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_retention_days = '0'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter sts_endpoint = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = "
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_group_name = 'eks-fluent-bit'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter default_log_group_name = 'fluentbit-default'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_stream_prefix = 'ip-17-225-20-45.ap-southeast-1.compute.internal-access-log-'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_stream_name = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter default_log_stream_name = '/fluentbit-default'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter region = 'ap-southeast-1'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_key = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter role_arn = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter auto_create_group = 'true'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter new_log_group_tags = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_retention_days = '0'"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter endpoint = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter sts_endpoint = ''"
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter credentials_endpoint = "
time="2021-10-04T10:36:01Z" level=info msg="[cloudwatch 1] plugin parameter log_format = ''"
[2021/10/04 10:36:01] [ info] [sp] stream processor started
[2021/10/04 10:36:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1460720 watch_fd=1 name=/logs/boot/INTEGRATOR.log
[2021/10/04 10:36:01] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1460719 watch_fd=2 name=/logs/boot/server.log

### Fluent Bit Version Info

Fluent Bit v1.7.4

### Cluster Details

- AWS EKS Fargate cluster
- App Mesh is being used
- VPC is not network restricted

### Application Details

### Steps to reproduce issue

Please note main application docker image is not public

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vinintegrator
  namespace: eretail
  labels:
    app: vinintegrator
    pod: fargate
spec:
  selector:
    matchLabels:
      app: vinintegrator
      pod: fargate
  replicas: 1
  template:
    metadata:
      labels:
        app: vinintegrator
        pod: fargate
    spec:
      securityContext:
        fsGroup: 0
      serviceAccount: eretail
      initContainers:    # Setup your log directory here
      - name: setup
        image: busybox
        command: ["bin/ash", "-c"]
        args:
        - >
          mkdir -p /logs/boot /logs/access;
          chmod -R 777 /logs
        volumeMounts:
        - name: logs
          mountPath: /logs
      containers:
      - name: vinintegrator
        imagePullPolicy: IfNotPresent
        image: 657281443710.dkr.ecr.ap-southeast-1.amazonaws.com/vinintegrator-service:latest
        resources:
          limits:
            memory: "7629Mi"
            cpu: "1.5"
          requests:
            memory: "5435Mi"
            cpu: "750m"
        ports:
        - containerPort: 8177
          protocol: TCP
        - containerPort: 80
          protocol: TCP
        # securityContext:
          # runAsUser: 506
          # runAsGroup: 506
        volumeMounts:
          - mountPath: /jboss-eap-6.4-integration/bin
            name: bin
          - mountPath: /logs
            name: logs
      - name: fluent-bit
        image: amazon/aws-for-fluent-bit:2.14.0
        imagePullPolicy: IfNotPresent
        env:
          - name: HOST_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 200m
            memory: 100Mi
        # ports:
        # - containerPort: 80
          # protocol: TCP
        volumeMounts:
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        - name: logs
          mountPath: /logs
          readOnly: true
      volumes:
        - name: fluent-bit-config
          configMap:
            name: fluent-bit-config
        - name: logs
          emptyDir: {}
        - name: bin
          persistentVolumeClaim:
            claimName: vinintegrator-pvc

### Related Issues

https://stackoverflow.com/q/69404530/9548311

hossain-rayhan commented 3 years ago

Hi @nitin194, the logs look clean to me. Can you please wait for a while and send more debug logs if possible with some error and warn messages?

Also, can you please try another output plugin (maybe the stdout output plugin) instead of cloudwatch, just to make sure it's a real issue with the CloudWatch output plugin?

PettitWesley commented 3 years ago

@nitin194 Also, I am curious how you got the Fluent Bit log output? Since Fluent Bit is run as a hidden process, last time I checked, they don't support allowing customers to see its logs.

nitin194 commented 3 years ago

> Hi @nitin194, the logs look clean to me. Can you please wait for a while and send more debug logs if possible with some error and warn messages?
>
> Also, can you please try another output plugin (maybe the stdout output plugin) instead of cloudwatch, just to make sure it's a real issue with the CloudWatch output plugin?

Thank you for your suggestions. However, I am not sure where the log output will go when using stdout. Will it be shown in the same log file?

nitin194 commented 3 years ago

> @nitin194 Also, I am curious how you got the Fluent Bit log output? Since Fluent Bit is run as a hidden process, last time I checked, they don't support allowing customers to see its logs.

We are running Fluent Bit as a sidecar for customized logging rather than the usual AWS Fluent Bit config setup. That's why we are able to see the container-specific logs.

matthewfala commented 3 years ago

Hi, @nitin194, If you use stdout, your logs will appear in the same place as the fluent bit info logs.

Could you also change your log level to debug, and show the Fluent Bit error logs when you encounter this problem?
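As a rough sketch of the two suggestions above, the ConfigMap from the original report might be adjusted along these lines for a debugging run. The catch-all `Match *` on the stdout output and the choice to keep only one input are illustrative assumptions, not something stated in the thread; `parsers.conf` is assumed to stay as in the original ConfigMap and is omitted here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: eretail
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        # Raised from info so the plugins log more detail
        Log_Level     debug
        Daemon        off
        Parsers_File  parsers.conf

    @INCLUDE application-log.conf

  application-log.conf: |
    [INPUT]
        Name              tail
        Path              /logs/boot/*.log
        Tag               boot.*
        Parser            docker
        Mem_Buf_Limit     5MB
        Refresh_Interval  10

    # Temporary debugging output (assumption: Match * catches every tag)
    [OUTPUT]
        Name   stdout
        Match  *

  # parsers.conf: unchanged from the original ConfigMap, omitted in this sketch
```

With this in place, the records and the Fluent Bit info/debug messages should both show up in the sidecar's container log, for example via `kubectl logs <pod-name> -c fluent-bit -n eretail`. If records appear there, the tail input is reading the files and the problem is more likely on the CloudWatch side; if nothing appears, the input or parser is the more likely suspect.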

sufiyanghori commented 2 years ago

@nitin194 did you figure it out?

nitin194 commented 2 years ago

> @nitin194 did you figure it out?

Sadly not @sufiyanghori

sufiyanghori commented 2 years ago

> > @nitin194 did you figure it out?
>
> Sadly not @sufiyanghori

For me it worked after I mounted the volume in both the application container and the Fluent Bit sidecar.
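A minimal sketch of what mounting one volume into both containers can look like is shown below. This is not the commenter's actual manifest, which was not shared in the thread; the pod name, application container name, and application image are hypothetical placeholders, while the sidecar image, mount paths, and ConfigMap name follow the original report.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-fluent-bit          # hypothetical name
spec:
  containers:
    - name: app                      # hypothetical application container
      image: example.com/app:latest  # placeholder image
      volumeMounts:
        - name: logs
          mountPath: /logs           # the application writes its log files here
    - name: fluent-bit
      image: amazon/aws-for-fluent-bit:2.14.0
      volumeMounts:
        - name: logs
          mountPath: /logs           # same volume, so the tail input can see the files
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
  volumes:
    - name: logs
      emptyDir: {}                   # shared scratch volume that lives as long as the pod
    - name: fluent-bit-config
      configMap:
        name: fluent-bit-config
```

The point of the pattern is that both `volumeMounts` entries refer to the same volume name (`logs` here), so files written by one container are visible to the other.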

nitin194 commented 2 years ago

> > > @nitin194 did you figure it out?
> >
> > Sadly not @sufiyanghori
>
> For me it worked after I mounted the volume in both the application container and the Fluent Bit sidecar.

That sounds awesome ... would you be able to share the sample manifest for the benefit of others too?

radhey86 commented 2 years ago

Ensure you have a service account attached to your deployment.

The Amazon EC2 instance metadata service (IMDS) isn't available to pods that are deployed to Fargate nodes. If you have pods that are deployed to Fargate that need IAM credentials, assign them to your pods using IAM roles for service accounts. If your pods need access to other information available through IMDS, then you must hard code this information into your pod spec. This includes the AWS Region or Availability Zone that a pod is deployed to.
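The IAM roles for service accounts setup described above could be sketched as follows. The account ID and role name in the ARN are placeholders; the role itself is assumed to already exist, to trust the cluster's OIDC provider, and to allow the CloudWatch Logs actions the output plugin needs (typically `logs:CreateLogGroup`, `logs:CreateLogStream`, `logs:DescribeLogStreams`, and `logs:PutLogEvents`). The service account name `eretail` is taken from the Deployment in the original report.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eretail
  namespace: eretail
  annotations:
    # Placeholder ARN: substitute the role created for this workload
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/fluent-bit-cloudwatch
---
# Fragment of the Deployment pod template, not a complete manifest:
# every container in the pod, including the fluent-bit sidecar,
# gets credentials for the annotated role through this service account.
spec:
  template:
    spec:
      serviceAccountName: eretail
```

Because IMDS is not available on Fargate, the Region also has to be given explicitly; the `region ap-southeast-1` setting in the `cloudwatch` outputs of the original ConfigMap already does this.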