humio / humio-helm-charts

Helm Charts for Humio Components
Apache License 2.0

The pod keeps on restarting with the error: http liveness probe return 404, unhealthy #107

Open SidGrundfos opened 4 years ago

SidGrundfos commented 4 years ago

Hello Team, I have this deployed to the dev, Test, QA and prod environments, and it was running fine until a few days ago. Suddenly, in one of the environments (Test), the pods started restarting and failing continuously with this error:

image

SidGrundfos commented 4 years ago

This is how I deployed it:

helm install humio humio/humio-helm-charts --namespace logging --set humio-fluentbit.token=*** -f humio-agent.yaml

`
humio-agent.yaml

humio-fluentbit:
  enabled: true
  humioHostname: cloud.humio.com
  humioRepoName: gic
  es:
    tls: true
    tls_verify: true
    autodiscovery: false 
  nameOverride: ""
  fullnameOverride: ""

  tokenSecretName: ""
  tokenSecretKeyName: token

  resources:
    limits:
      cpu: 100m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 128Mi
  serviceConfig: |-
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers.conf

  inputConfig: |-
    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Refresh_Interval 5
        Mem_Buf_Limit    20MB
        Skip_Long_Lines  On  

    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              audit.*
        Refresh_Interval 5
        Mem_Buf_Limit    20MB
        Skip_Long_Lines  On     

  filterConfig: |-
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_Tag_Prefix     kube.var.log.containers.
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [FILTER]
        Name         nest
        Match        kube.*
        Operation    lift
        Nested_under log_processed

    [FILTER]
        Name         nest
        Match        kube.*
        Operation    lift
        Nested_under Properties

    [FILTER]
        Name     grep
        Match    kube.*
        Exclude  Category Audit

    [FILTER]
        Name                kubernetes
        Match               audit.*
        Kube_Tag_Prefix     kube.var.log.containers.
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
        Merge_Log_Key       log_audits
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [FILTER]
        Name         nest
        Match        audit.*
        Operation    lift
        Nested_under log_audits

    [FILTER]
        Name         nest
        Match        audit.*
        Operation    lift
        Nested_under Properties

    [FILTER]
        Name     grep
        Match    audit.*
        Regex    Category Audit    

  outputConfig: |-
    [OUTPUT]
        Name  es
        Match kube.*
        Host ${FLUENT_ELASTICSEARCH_HOST}
        Port ${FLUENT_ELASTICSEARCH_PORT}
        tls ${FLUENT_ELASTICSEARCH_TLS}
        tls.verify ${FLUENT_ELASTICSEARCH_TLS_VERIFY}
        HTTP_User ${HUMIO_REPO_NAME}
        HTTP_Passwd ${HUMIO_INGEST_TOKEN}
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Time_Key @timestamp
        Replace_Dots On
        Logstash_Prefix FluentBitHelmChart
        Buffer_Size 20MB

    [OUTPUT]
        Name  es
        Match audit.*
        Host ${FLUENT_ELASTICSEARCH_HOST}
        Port ${FLUENT_ELASTICSEARCH_PORT}
        tls ${FLUENT_ELASTICSEARCH_TLS}
        tls.verify ${FLUENT_ELASTICSEARCH_TLS_VERIFY}
        HTTP_User ***
        HTTP_Passwd None
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Time_Key @timestamp
        Replace_Dots On
        Logstash_Prefix FluentBitHelmChart
        Buffer_Size 20MB

  parserConfig: |-
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?"              (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   Off

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S

  customFluentBitConfig:
    custom.conf: |-

  nodeSelector: {}
  tolerations: []
  affinity: {}    

`

SaaldjorMike commented 4 years ago

Hi @SidGrundfos

Could you share the logs from the Fluent Bit containers? That might help us understand why it returns 404s.

SidGrundfos commented 4 years ago

image

image

There are 2 pods image

5 restarts in 5 mins

SaaldjorMike commented 4 years ago

Hm. Not much to go on here. Perhaps you could try changing the fluentbit image to be a more recent version just to validate if it was already fixed upstream? Our helm chart currently installs fluentbit 1.4.2 but the latest 1.4 build is 1.4.6. If that also fails, you could also give fluentbit 1.5.2 a try, but I'm not sure if that would require any configuration changes.

SidGrundfos commented 4 years ago

Thanks @SaaldjorMike, but I am guessing the issue is with the configuration. When I install with the default configuration it works fine, but when I use the configuration above it fails. That still does not make much sense, though, because the configuration is the same across all environments and the issue shows up in only one of them.

SidGrundfos commented 4 years ago

For the upgrade, how do I install 1.4.6 or 1.5.2? This repo contains charts with v1.4.2.

SaaldjorMike commented 4 years ago

You can override the fluentbit image by setting the image for humio-fluentbit like this:

humio-fluentbit:
  enabled: true
  image: fluent/fluent-bit:1.4.6
  ...

SidGrundfos commented 4 years ago

Now with 1.4.6 the outcome is the same, but at least there are some more logs:

Fluent Bit v1.4.6

[2020/08/03 08:24:54] [ info] [storage] version=1.0.3, initializing...
[2020/08/03 08:24:54] [ info] [storage] in-memory
[2020/08/03 08:24:54] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/08/03 08:24:54] [ info] [engine] started (pid=1)
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc port=443
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.0] API server connectivity OK
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.4] https=1 host=kubernetes.default.svc port=443
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.4] local POD info OK
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.4] testing connectivity with API server...
[2020/08/03 08:24:55] [ info] [filter:kubernetes:kubernetes.4] API server connectivity OK
[2020/08/03 08:24:55] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2020/08/03 08:24:55] [ info] [sp] stream processor started
[2020/08/03 08:24:58] [ info] inotify_fs_add(): inode=825773 watch_fd=1 name=/var/log/containers/calico-node-n9lph_kube-system_install-cni-3f104d9d4ee1c4e18ed344691f022776d2af46b80909e2c2ab92178fbea42637.log
[2020/08/03 08:25:06] [ info] inotify_fs_add(): inode=3619134 watch_fd=2 name=/var/log/containers/nginx-nginx-ingress-default-backend-766d57499b-2kmvg_ingress_nginx-ingress-default-backend-82d896297c525d946c6cec157196556e26170a3042b100c21dcfd4fbb676f34e.log
[2020/08/03 08:25:08] [ info] inotify_fs_add(): inode=1313691 watch_fd=3 name=/var/log/containers/tunnelfront-77df88464c-rfln5_kube-system_tunnel-probe-b98359e608005abbdd5dc212069cc55e1daded884c0e646e70ac63cba036fa58.log
[2020/08/03 08:25:13] [ info] inotify_fs_add(): inode=825773 watch_fd=1 name=/var/log/containers/calico-node-n9lph_kube-system_install-cni-3f104d9d4ee1c4e18ed344691f022776d2af46b80909e2c2ab92178fbea42637.log
[2020/08/03 08:25:20] [ info] inotify_fs_add(): inode=3619134 watch_fd=2 name=/var/log/containers/nginx-nginx-ingress-default-backend-766d57499b-2kmvg_ingress_nginx-ingress-default-backend-82d896297c525d946c6cec157196556e26170a3042b100c21dcfd4fbb676f34e.log
[2020/08/03 08:25:22] [ info] inotify_fs_add(): inode=1313691 watch_fd=3 name=/var/log/containers/tunnelfront-77df88464c-rfln5_kube-system_tunnel-probe-b98359e608005abbdd5dc212069cc55e1daded884c0e646e70ac63cba036fa58.log
[engine] caught signal (SIGTERM)
[2020/08/03 08:25:22] [ info] [input] pausing tail.0
[2020/08/03 08:25:22] [ info] [input] pausing tail.1
[2020/08/03 08:25:24] [ warn] [engine] service will stop in 5 seconds
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=823199 watch_fd=4 name=/var/log/containers/alertmanager-6b88cc865c-hdl4f_openfaas_alertmanager-8aee22477622e5efa26e2240f0ab8c23315f301ad1489ded26a28612ecea1907.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=802020 watch_fd=5 name=/var/log/containers/alertnatsadapter-85c7d5db7f-rnbvn_openfaas-fn_alertnatsadapter-ff42866984cff0f4cee7b02af1753e8f0c2102eaaa3985ed3725b2a4b615a532.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=822955 watch_fd=6 name=/var/log/containers/alertnatsadapter-85c7d5db7f-rnbvn_openfaas-fn_linkerd-init-c31cff5e2c535b0d21f9f90e6ecafe746527446bab54499ffb9a296094059ec6.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=796510 watch_fd=7 name=/var/log/containers/alertnatsadapter-85c7d5db7f-rnbvn_openfaas-fn_linkerd-proxy-76a6244466a56a7483df30ad17f3f091ff17515db077932a9c3ebeb94cd863c2.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=849965 watch_fd=8 name=/var/log/containers/alertservice-74f8649cdc-4rhl2_openfaas-fn_alertservice-1c76a76fe3e743ef390e3e229edeeabddeb0a701dc95267ea751f973696815a6.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=822173 watch_fd=9 name=/var/log/containers/alertservice-74f8649cdc-4rhl2_openfaas-fn_alertservice-7b59a6486d21ec6be0c23e575039612002afa99f932b433dd3b1017ed324be5c.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=799688 watch_fd=10 name=/var/log/containers/alertservice-74f8649cdc-4rhl2_openfaas-fn_linkerd-init-5999ad9a7075a3145052d10baf30be31c349ce4147e31ced9a9df60dd7dfeb4a.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=822542 watch_fd=11 name=/var/log/containers/alertservice-74f8649cdc-4rhl2_openfaas-fn_linkerd-proxy-fbd06fc70164b34ec7fd9faf92b3bd79b60e0e6e362ca8c46d2c2b0a0d2f5313.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=779864 watch_fd=12 name=/var/log/containers/alertservice-74f8649cdc-b724j_openfaas-fn_alertservice-8c22239912cfa8b692e546210b0df7f433d0e580a0672bed90ea0e9214a13966.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=821851 watch_fd=13 name=/var/log/containers/alertservice-74f8649cdc-b724j_openfaas-fn_linkerd-init-f15f24da6e32545cf0f5cb028d61d80a709c6079d7f55236e47515807a6d5643.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=821191 watch_fd=14 name=/var/log/containers/alertservice-74f8649cdc-b724j_openfaas-fn_linkerd-proxy-24d529ea70d21d6045699ee1ef298001d150df32a4237bb322b4993b342a4466.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=779392 watch_fd=15 name=/var/log/containers/alertservice-74f8649cdc-xzspt_openfaas-fn_alertservice-ab0230cca5ce4f2ab8999c5691aa06553761711153598760d56d9c1c57a234df.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=803867 watch_fd=16 name=/var/log/containers/alertservice-74f8649cdc-xzspt_openfaas-fn_linkerd-init-b3691a7786fb517a0370e7e0a6d059c280ebe885f91a910023574b133732848e.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=801952 watch_fd=17 name=/var/log/containers/alertservice-74f8649cdc-xzspt_openfaas-fn_linkerd-proxy-763a956ac4b8a7fb30e0dfb80352aa791765976bc96dc60e303ed25661363087.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=823912 watch_fd=18 name=/var/log/containers/assetfacade-865d7799fb-44tfk_openfaas-fn_assetfacade-6268e045c3446a6f702c37b9f78aa8d2ac00608b8bd98e6a0a7cec70a82f3103.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=822086 watch_fd=19 name=/var/log/containers/assetfacade-865d7799fb-44tfk_openfaas-fn_linkerd-init-e6f59ba234f2fb99f1b22231cd51a953dd5468013769fa652fefd2d98d6af781.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=823940 watch_fd=20 name=/var/log/containers/assetfacade-865d7799fb-44tfk_openfaas-fn_linkerd-proxy-83974585b76416d15ce21a4590d66e79400c0f765a54e5e85e88a27e56aeeb23.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=857132 watch_fd=21 name=/var/log/containers/basic-auth-plugin-569c94df8-lsrx7_openfaas_linkerd-init-2a0683569b7369d922843097e763620b84bcec8bf92e01bf2951d5421324dce7.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=858244 watch_fd=22 name=/var/log/containers/basic-auth-plugin-569c94df8-lsrx7_openfaas_linkerd-proxy-d91e9ef47bbdfbe68fa778f787fc4a12bb13eea7b81e79a1dca91b5793c501f7.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=846092 watch_fd=23 name=/var/log/containers/calculationtriggers-689cd7d659-78xd5_openfaas-fn_calculationtriggers-a3c7fd29404abaa92658a002b28e1efec89660e28c2aa1b9ed52ea8c1db41c71.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=830912 watch_fd=24 name=/var/log/containers/calculationtriggers-689cd7d659-78xd5_openfaas-fn_linkerd-init-83143c65deb99af18bc560907b561dc0e344a3cc7af3b606fe666afd22c805b4.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=822103 watch_fd=25 name=/var/log/containers/calculationtriggers-689cd7d659-78xd5_openfaas-fn_linkerd-proxy-066212fd9bb9b31be13444ef696f7796ff8c64309dc8516df4fe22b7fa15d707.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=846097 watch_fd=26 name=/var/log/containers/calculationtriggers-689cd7d659-ddlth_openfaas-fn_calculationtriggers-aeee3a2e2d6167452675c76234af3e709d3e55195ed08489914aa7de02ef0686.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=843414 watch_fd=27 name=/var/log/containers/calculationtriggers-689cd7d659-ddlth_openfaas-fn_linkerd-init-240e014899df00dd7194f469e9028b92ffcd2a3044418842637cb11745e8adb2.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=847400 watch_fd=28 name=/var/log/containers/calculationtriggers-689cd7d659-ddlth_openfaas-fn_linkerd-proxy-381117be88be472c0d517630a943f34dc1d09b3eb8311061b10f9b2d7e4baed7.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=1313677 watch_fd=29 name=/var/log/containers/calico-typha-horizontal-autoscaler-67bf6ccd8c-n6sjb_kube-system_autoscaler-02aff464960f8e29802c7c4f7f990582c5d833c45148082efd75c6184860e0bd.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=3877856 watch_fd=30 name=/var/log/containers/cert-manager-webhook-77ccf5c8b4-kbv7m_letsencrypt_cert-manager-a202e89af17956c5c04c431daabe856a6e538563e02c73dc82f8a0612cde5df7.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=3877891 watch_fd=31 name=/var/log/containers/cert-manager-webhook-77ccf5c8b4-kbv7m_letsencrypt_cert-manager-d34921938a0ee8a26b869292bb25421f510c55dbc3e3362d3119906b852d0977.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=849540 watch_fd=32 name=/var/log/containers/cloud2cloud-6c794ff57b-n94j6_openfaas-fn_cloud2cloud-2308df36d77ffc8dd3143b1512849e15a5f0a5dd4e7ccb6434e8bb6e88afef7e.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=844659 watch_fd=33 name=/var/log/containers/cloud2cloud-6c794ff57b-n94j6_openfaas-fn_linkerd-init-2b454db9976b352ce79ce170e95f5f6a3a96c988862a5964928018c94ef551c4.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=849944 watch_fd=34 name=/var/log/containers/cloud2cloud-6c794ff57b-n94j6_openfaas-fn_linkerd-proxy-1020e89a214911fbea735c42901cac62b19509dca48707078250e28015e173ea.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=1311744 watch_fd=35 name=/var/log/containers/coredns-autoscaler-d5944b569-l476p_kube-system_autoscaler-73e02a3ddb686e6e125f263b1a072da5319aec79ff2fe6cb760794105382f8d7.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=857817 watch_fd=36 name=/var/log/containers/dataprovider-d58bf7766-td5tg_openfaas-fn_linkerd-init-910d58f4e6998ef500589085af5b204011a1102a83bad015d5a997f92f438d1b.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=1037036 watch_fd=37 name=/var/log/containers/dataprovider-d58bf7766-td5tg_openfaas-fn_linkerd-proxy-1004b679e504734f77111c27694230ee83983ad9ebfa7b2f9fc7091d89016179.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=782164 watch_fd=38 name=/var/log/containers/grundfosgicwwncapi-79df5fb6d7-2txw8_wwnc_linkerd-init-4ea63b6c61fb842a538f25f624309adba17b725ccfa3c874c74e2aa109af0a3c.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=796946 watch_fd=39 name=/var/log/containers/grundfosgicwwncapi-79df5fb6d7-2txw8_wwnc_linkerd-proxy-93941bca5ada10143dff8a9099025c8ccdb4af9f8372675cea85c59a71f6b39d.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=779408 watch_fd=40 name=/var/log/containers/grundfosgicwwncapi-79df5fb6d7-lvkrn_wwnc_linkerd-init-aa9fe1b131924cf46ef7a82f2ccde08dc4edc3fe894f03b90d245610c2a00654.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=780323 watch_fd=41 name=/var/log/containers/grundfosgicwwncapi-79df5fb6d7-lvkrn_wwnc_linkerd-proxy-480dce43211402a45b4c3847c86cce5d58105c785e87ee097c3c6f7f3a4dd6e1.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=847413 watch_fd=42 name=/var/log/containers/grundfosgicwwnccachingapi-6cdff8fc69-mwhd4_wwnc_linkerd-init-cee6686491c18521fea31488fe7a1f373c2e12bccc7697e385e002106b525fa0.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=847412 watch_fd=43 name=/var/log/containers/grundfosgicwwnccachingapi-6cdff8fc69-mwhd4_wwnc_linkerd-proxy-8f1fa1f81ca0d92471e6bd96c6f01d7f7284d2355b905e8ad507fc163e67be43.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=846315 watch_fd=44 name=/var/log/containers/grundfosgicwwncdataaesubscriber-78586c66d5-6t5zr_wwnc_linkerd-init-c49b9a2213556508a466f11fed1cf046f781e61776db2ea6598d85c922238753.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=857787 watch_fd=45 name=/var/log/containers/grundfosgicwwncdataaesubscriber-78586c66d5-6t5zr_wwnc_linkerd-proxy-2189d8c7af89441d305838a8906bd57d6eee9e6d3440808f4399ad102e1f8031.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=847450 watch_fd=46 name=/var/log/containers/grundfosgicwwncmessagehubapi-676b4fb78c-vmh9v_wwnc_linkerd-init-f224fa154c1fe2596556a412e30071c9f245ae57f8fd3c4883bbd81030346f4e.log
[2020/08/03 08:25:24] [ info] inotify_fs_add(): inode=858354 watch_fd=47 name=/var/log/containers/grundfosgicwwncmessagehubapi-676b4fb78c-vmh9v_wwnc_linkerd-proxy-d66b67c481847336c5c010d5d7f6cee89b18d77ab1d0fbb9e36793524a074ed4.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=858067 watch_fd=48 name=/var/log/containers/grundfosgicwwncnetworkapi-77f5f6b5b6-bgjl8_wwnc_linkerd-init-5528c824301021b2dd1b8e2ab1472cc309bad86bc046fbf5cec5d81aeb939942.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=865421 watch_fd=49 name=/var/log/containers/grundfosgicwwncnetworkapi-77f5f6b5b6-bgjl8_wwnc_linkerd-proxy-441d87344452b3a8c240ad95efb340f5b3793b928b5971d35fed7b838cf0b966.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=858053 watch_fd=50 name=/var/log/containers/grundfosgicwwncrawdataapi-b5548d99b-wnqpc_wwnc_linkerd-init-bf17c8a876a8c946dd2930b389bf7487bb6a60badfb7d9bd4961291075c07ab1.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=867531 watch_fd=51 name=/var/log/containers/grundfosgicwwncrawdataapi-b5548d99b-wnqpc_wwnc_linkerd-proxy-e82351592746250d60d05de1944f75e5cb81be4f01465bf2a31e6a20a9e8777d.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=857796 watch_fd=52 name=/var/log/containers/grundfosgicwwncrawdatasubscriber-854f4f89b9-2kr52_wwnc_linkerd-init-114a769f5cea00e3ae98940d1e43e91c2807fc7dbcd7fc792dc205d3e355bcfb.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=867462 watch_fd=53 name=/var/log/containers/grundfosgicwwncrawdatasubscriber-854f4f89b9-2kr52_wwnc_linkerd-proxy-8b9d38cad1312821d853b5d2973bf6c3086a93224cbff65565f48f78081c400a.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=857831 watch_fd=54 name=/var/log/containers/grundfosgicwwncrawdatasubscriber-854f4f89b9-nlbjd_wwnc_linkerd-init-ae488c21692389806ad67798714448b0b329cdea352ee65c086bcf4dd8fee1dd.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=867484 watch_fd=55 name=/var/log/containers/grundfosgicwwncrawdatasubscriber-854f4f89b9-nlbjd_wwnc_linkerd-proxy-dd5f0d48c2cba8d059398631bf63851084f3dbfa54168b97bec4a6b45b3fd2c9.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1313930 watch_fd=56 name=/var/log/containers/humio-fluentbit-bqb8l_logging_humio-54c65688ef32bbddc957de6eecd4da3dd4da842a0e8d4410f7e299d400dc7ce8.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=822287 watch_fd=57 name=/var/log/containers/linkerd-identity-6bd877f4cf-l84nz_linkerd_linkerd-init-eb9ac2909c0d2e51c036a31a3403038dfa8bd0ba15fe5389e99b2cd9b9c36605.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=857652 watch_fd=58 name=/var/log/containers/linkerd-identity-6bd877f4cf-l84nz_linkerd_linkerd-proxy-72160d6d20c6abf82c8506e3546bfef3d079d0915cf5543a6c0a9f1c36cc58e9.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=803694 watch_fd=59 name=/var/log/containers/linkerd-prometheus-69f78595b7-5cg46_linkerd_linkerd-init-687d4c3d73ba7e2ced2ac977448d35207d364e1fac3b55c07feba78a7f3cf5c0.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=818929 watch_fd=60 name=/var/log/containers/linkerd-tap-6568dcc9d4-qtxvg_linkerd_linkerd-init-c990148dde33b54a1686905c2df6fdbd4ca7dd22ae9fb5d9a9eaa32e3d5c7c54.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=821193 watch_fd=61 name=/var/log/containers/linkerd-tap-6568dcc9d4-qtxvg_linkerd_linkerd-proxy-cfb6d55dbef48720b0fe37a1ae82dd0495ec3d6ad85538a92813981adfead694.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=820274 watch_fd=62 name=/var/log/containers/linkerd-tap-6568dcc9d4-qtxvg_linkerd_tap-06fc5ed34bac1a1ffccd64df31abbc53b6374053da239c245d33e5603e106728.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=6214090 watch_fd=63 name=/var/log/containers/nats-cluster-1_nats-io_nats-e49b0776ad1e28a6e0d1e9f869bcd4cc10e9b6c9e44a5d827bdcf916a289d609.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=6214115 watch_fd=64 name=/var/log/containers/nats-cluster-3_nats-io_nats-800553d8cb506c98db4d6baeca13ad6901a4ecdbe0d7c070718dcbd8f5ce5fc5.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=858623 watch_fd=65 name=/var/log/containers/nats-connector-589b796478-4x4vk_openfaas_nats-connector-34c3ec3d447436507bc9e8cbfc954cff67181f2654e6f9c39f08b02fcf635512.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=779412 watch_fd=66 name=/var/log/containers/nats-connector-589b796478-4x4vk_openfaas_nats-connector-3f64649a77d05fcbe8da86fb7e0806de513b4732f11008fc44c4b644628cc702.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=796101 watch_fd=67 name=/var/log/containers/nats-operator-669766fdfd-dl6nj_nats-io_nats-operator-7f0445a47c90195554adb306bd7d438208011320bdfd325686369bb6b6901fc6.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=788668 watch_fd=68 name=/var/log/containers/nats-operator-669766fdfd-dl6nj_nats-io_nats-operator-b6954aab63bac7d7ecfe78c68b34abac7e61e9a6f96b75a30f57a6821875f9c8.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=822959 watch_fd=69 name=/var/log/containers/queue-worker-dcbfb56d6-22zn5_openfaas_linkerd-init-e6498f21dfda2d51ed9b169e4e42bd6141e7e65f411405f1763c312f8c3dd964.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=824599 watch_fd=70 name=/var/log/containers/queue-worker-dcbfb56d6-22zn5_openfaas_linkerd-proxy-897689fc7695164a605383b8608e72067d16732cc78bf2fe35df7c3aeb9d4fe8.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=844725 watch_fd=71 name=/var/log/containers/queue-worker-dcbfb56d6-dh8bv_openfaas_linkerd-init-e76c18f0595d2c95b321a85c8ce5fa9944f00d2b0150a5fd4c2874763bf2ce16.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=801399 watch_fd=72 name=/var/log/containers/queue-worker-dcbfb56d6-dh8bv_openfaas_linkerd-proxy-0bb88c082d1f0ad5d20bd32a41c09bfa7447de78bc2c081c927b7e25149d8e22.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=822958 watch_fd=73 name=/var/log/containers/queue-worker-dcbfb56d6-l6q25_openfaas_linkerd-init-548452a7bbfd6bf0bb3a6ba516ff14024410243b8b0c30d643aab382f44cb06a.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=824553 watch_fd=74 name=/var/log/containers/queue-worker-dcbfb56d6-l6q25_openfaas_linkerd-proxy-6a27f377fad797b7b92bd7b6fc9d0fc17867ff449317269a373e58f26bcd0e3d.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=824489 watch_fd=75 name=/var/log/containers/queue-worker-dcbfb56d6-l6q25_openfaas_queue-worker-f0a13049f62284cf8b0335499342ff9182061fcde42754137fa920501fd4e7c2.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=844685 watch_fd=76 name=/var/log/containers/secretsmapper-7d7b6fcffb-h5b74_openfaas-fn_linkerd-init-6df43091ef1e9d91648f9945c04b4d04ed821e296dcb5402499e204479b0bdd9.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=848265 watch_fd=77 name=/var/log/containers/secretsmapper-7d7b6fcffb-h5b74_openfaas-fn_linkerd-proxy-b98bf57e323d15437f2c9405c3558bb51a6c967ffe633b45eeca629cff281a64.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=847881 watch_fd=78 name=/var/log/containers/secretsmapper-7d7b6fcffb-h5b74_openfaas-fn_secretsmapper-212c75e0c9ab1522d4ca479477236613207f93a4e4f8c5ebe8237abe5a480cb4.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1154491 watch_fd=79 name=/var/log/containers/secretsmapper-7d7b6fcffb-h5b74_openfaas-fn_secretsmapper-f20e17b5f06beae1d44abe9fd7b1a42fb8640bdc4cc64c0b667f3c3c1f31fbbe.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=857856 watch_fd=80 name=/var/log/containers/wwnc-adapterapi-5df5697585-8f2n7_openfaas-fn_linkerd-init-6f13d4f4bdfb90c43e6e806e187db2ce849af3e9a9a1c14030b61427662c174e.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1065322 watch_fd=81 name=/var/log/containers/wwnc-adapterapi-5df5697585-8f2n7_openfaas-fn_linkerd-proxy-756c8d43fb0c47592725b94c2e8fb4745b272eca35289290aa6efc0d5ccb208e.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=857140 watch_fd=82 name=/var/log/containers/wwnc-calculations-6f797c8d79-r6d9f_openfaas-fn_linkerd-init-3a819b7d41eee9be34c36adadf1c531457d0a0498fa1fe320ba8bd245171a09d.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1154467 watch_fd=83 name=/var/log/containers/wwnc-calculations-6f797c8d79-r6d9f_openfaas-fn_linkerd-proxy-942734d28feb22ed0d2cbf286cebbdad6aacf5bcdcf6cae07005af18f2f846fe.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1313765 watch_fd=84 name=/var/log/containers/wwnc-calculations-streaming-67b46b979-l44rw_openfaas-fn_linkerd-init-9510deb24268c2e2231b26790c0eb121edcb713b8d584ae66bf8b4402b28a318.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=1154495 watch_fd=85 name=/var/log/containers/wwnc-calculations-streaming-67b46b979-l44rw_openfaas-fn_linkerd-proxy-7821555768bcca2bfd9598c053b8849844c5fbeb96c1a0dc1cdce48cb3807a19.log
[2020/08/03 08:25:25] [ info] inotify_fs_add(): inode=867506 watch_fd=86 name=/var/log/containers/grundfosgicwwncrawdataapi-b5548d99b-wnqpc_wwnc_grundfosgicwwncrawdataapi-aaf39ceda887176c60c72bb4d593bdf5188ab7f45f5e29eadfb285c8b775f0bc.log
[2020/08/03 08:25:26] [ info] inotify_fs_add(): inode=6214066 watch_fd=87 name=/var/log/containers/nats-streaming-operator-5659759f86-gnrfp_nats-io_nats-streaming-operator-df63def6e12a02e6b149770256de4e2cb66a1cb2ef2c2c0afc423bd195eb1501.log
[2020/08/03 08:25:26] [ info] inotify_fs_add(): inode=3618913 watch_fd=88 name=/var/log/containers/tiller-deploy-94685958f-8dwf7_kube-system_tiller-f32e286fb96daef63640b038f6d4f46fc4eec3461e975c7fc54e9908abe39e25.log
[2020/08/03 08:25:27] [ info] inotify_fs_add(): inode=780293 watch_fd=89 name=/var/log/containers/calico-node-n9lph_kube-system_calico-node-f1b8a5021b7846db0e0ad357ef432371139314c2835ded1f04574fe3ce7144df.log

SidGrundfos commented 4 years ago

image

SaaldjorMike commented 4 years ago

@SidGrundfos Do you see any logs being shipped each time it restarts, or is it completely stuck? The one thing I do notice is the log lines containing this:

[2020/08/03 08:25:22] [ info] [input] pausing tail.0
[2020/08/03 08:25:22] [ info] [input] pausing tail.1

Not sure why we see this here, but it is typically related to the fact that fluentbit is unable to fully process the log entries. By default, it is configured to keep retrying (notice the Retry_Limit option in the outputs). My assumption is that somehow fluentbit is unable to ship log entries, causing it to fill up the tail input plugin buffer (defined by Mem_Buf_Limit), and when that buffer is full, it is unable to read new messages and ship those.
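For reference, the two settings mentioned here sit in the `inputConfig` and `outputConfig` blocks of the values file posted earlier in this thread. The fragment below is only a sketch of where they would be changed: the values `50MB` and `5` are illustrative assumptions, not tested recommendations, and only the `kube.*` sections are shown (the `audit.*` sections would need the same treatment).

```yaml
humio-fluentbit:
  # Assumption: 50MB is an illustrative buffer size, not a recommendation.
  inputConfig: |-
    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Refresh_Interval 5
        Mem_Buf_Limit    50MB
        Skip_Long_Lines  On

  # Assumption: a finite retry count (5) instead of "False" (retry without limit).
  outputConfig: |-
    [OUTPUT]
        Name  es
        Match kube.*
        Host ${FLUENT_ELASTICSEARCH_HOST}
        Port ${FLUENT_ELASTICSEARCH_PORT}
        tls ${FLUENT_ELASTICSEARCH_TLS}
        tls.verify ${FLUENT_ELASTICSEARCH_TLS_VERIFY}
        HTTP_User ${HUMIO_REPO_NAME}
        HTTP_Passwd ${HUMIO_INGEST_TOKEN}
        Logstash_Format On
        Retry_Limit 5
        Type  flb_type
        Time_Key @timestamp
        Replace_Dots On
        Logstash_Prefix FluentBitHelmChart
        Buffer_Size 20MB
```

Note the trade-off: with a finite `Retry_Limit`, Fluent Bit discards a chunk once its retries are exhausted, so this bounds the buffer at the cost of possibly losing log entries.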

Also, did you turn off debug logging again? I don't see any debug level logs in those logs you shared. Maybe it just didn't log any?
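If debug output is needed again, one way to get it is through the `Log_Level` key in the `[SERVICE]` section. The fragment below is a minimal sketch that reuses the `serviceConfig` block from the values file posted above and changes only `Log_Level`; it assumes the chart passes this block through to Fluent Bit unchanged.

```yaml
humio-fluentbit:
  serviceConfig: |-
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    debug
        Parsers_File parsers.conf
```

Debug logging is verbose, so it is probably worth switching back to `info` once the restart cause has been captured.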

SidGrundfos commented 4 years ago

Yes, I stopped debug logging. I have also tried increasing the buffer to 50 MB, but it still doesn't work.

keatinle commented 4 years ago

I had the same issue and managed to get it working again by increasing the resources:

humio-fluentbit:
  enabled: true
  humioHostname: HOSTNAME
  es:
    tls: true
  resources:
    limits:
      cpu: 2
      memory: 1024Mi
    requests:
      cpu: 1
      memory: 512Mi