logstash-plugins / logstash-input-azure_event_hubs

Logstash input for consuming events from Azure Event Hubs
Apache License 2.0

Logstash logs bloated with errors/warnings #65

Open maris-jurgenbergs opened 3 years ago

maris-jurgenbergs commented 3 years ago

We get various errors and warnings that bloat our Logstash logs. Getting rid of them would be best, but I am troubleshooting their cause here, because the Logstash plugin is the one handling these errors and warnings.

I understand this comes from the Microsoft library, but does it mean that the Azure Event Hubs plugin is not executed or set up correctly?

[2021-03-12T07:36:33,625][ERROR][com.microsoft.azure.eventprocessorhost.PumpManager][....-pipeline][719a950a821fba7e0946da2859ca0b00f6a05393f387cb6d34063dbca9f45466] host logstash-1f1d8eb8-824e-4c31-b669-21ce265682e0: 0: throwing away zombie pump
[2021-03-12T07:36:33,625][ERROR][com.microsoft.azure.eventprocessorhost.PumpManager][....-pipeline][719a950a821fba7e0946da2859ca0b00f6a05393f387cb6d34063dbca9f45466] host logstash-1f1d8eb8-824e-4c31-b669-21ce265682e0: 1: throwing away zombie pump

We get transient storage failures, and Logstash reports them as errors, but it should ignore them since they are only INFO level.

[2021-03-12T07:57:44,097][ERROR][logstash.inputs.azure.errornotificationhandler][....-pipeline][719a950a821fba7e0946da2859ca0b00f6a05393f387cb6d34063dbca9f45466] Error with Event Processor Host.  {:host_name=>"logstash-1f1d8eb8-824e-4c31-b669-21ce265682e0", :action=>"Renewing Lease", :exception=>"com.microsoft.azure.storage.StorageException: The client could not finish the operation within specified maximum execution timeout."}
[2021-03-12T07:57:44,097][INFO ][com.microsoft.azure.eventprocessorhost.PartitionPump][....-pipeline][719a950a821fba7e0946da2859ca0b00f6a05393f387cb6d34063dbca9f45466] host logstash-1f1d8eb8-824e-4c31-b669-21ce265682e0: 0: Transient failure renewing lease
com.microsoft.azure.storage.StorageException: The client could not finish the operation within specified maximum execution timeout.
        at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:243) ~[azure-storage-8.0.0.jar:?]
        at com.microsoft.azure.storage.blob.CloudBlob.renewLease(CloudBlob.java:2682) ~[azure-storage-8.0.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager.renewLeaseInternal(AzureStorageCheckpointLeaseManager.java:514) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager.renewLease(AzureStorageCheckpointLeaseManager.java:497) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.PartitionPump.leaseRenewer(PartitionPump.java:418) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.PartitionPump.lambda$scheduleLeaseRenewer$11(PartitionPump.java:167) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.util.concurrent.TimeoutException: The client could not finish the operation within specified maximum execution timeout.
        at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:242) ~[azure-storage-8.0.0.jar:?]
        ... 11 more

We get these periodically. It would be nice either to increase the timeout or to skip logs like this. They really pollute the log files.

[2021-03-12T02:19:05,992][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][....-pipeline][18c4ca558b08b3b9aa48d566c294be6f1dfa7b464da81c7150942bc50be35373] host logstash-9e4cdcf4-2a76-44fe-a046-e09b75713118: 0: Failure updating checkpoint
com.microsoft.azure.storage.StorageException: The client could not finish the operation within specified maximum execution timeout.
        at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:243) ~[azure-storage-8.0.0.jar:?]
        at com.microsoft.azure.storage.blob.CloudBlob.renewLease(CloudBlob.java:2682) ~[azure-storage-8.0.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager.renewLeaseInternal(AzureStorageCheckpointLeaseManager.java:514) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager.updateLeaseInternal(AzureStorageCheckpointLeaseManager.java:591) ~[azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager.updateCheckpoint(AzureStorageCheckpointLeaseManager.java:176) [azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.PartitionContext.checkpoint(PartitionContext.java:198) [azure-eventhubs-eph-2.4.0.jar:?]
        at com.microsoft.azure.eventprocessorhost.PartitionContext.checkpoint(PartitionContext.java:177) [azure-eventhubs-eph-2.4.0.jar:?]
        at jdk.internal.reflect.GeneratedMethodAccessor91.invoke(Unknown Source) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:456) [jruby-complete-9.2.13.0.jar:?]
        at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:317) [jruby-complete-9.2.13.0.jar:?]
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:42) [jruby-complete-9.2.13.0.jar:?]
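
Until the log level or timeout can be changed at the source, one stopgap is to filter the log stream after the fact. The following is a minimal sketch, not a plugin feature: it assumes the `[timestamp][LEVEL][logger]` entry layout shown above, and the names `NOISY_PREFIXES`, `DROP_LEVELS`, and `filter_entries` are hypothetical choices made for illustration. It drops INFO and WARN entries from the Event Processor Host loggers together with their stack-trace lines, and keeps everything else.

```python
import re

# Matches the start of a log entry: [timestamp][LEVEL][logger]...
# The level field may be padded with a space, as in "[INFO ]" or "[WARN ]".
ENTRY_START = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>[A-Z]+)\s*\]\[(?P<logger>[^\]]+)\]"
)

# Assumption: logger prefixes treated as noisy, taken from the entries above.
NOISY_PREFIXES = ("com.microsoft.azure.eventprocessorhost",)
# Assumption: only these levels are dropped; ERROR entries are always kept.
DROP_LEVELS = {"INFO", "WARN"}


def filter_entries(lines):
    """Yield log lines, skipping whole entries (header plus stack trace)
    that come from a noisy logger at a dropped level."""
    skipping = False
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            # A new entry starts here: decide whether to drop it.
            skipping = (
                match["level"] in DROP_LEVELS
                and match["logger"].startswith(NOISY_PREFIXES)
            )
        # Lines without a header (stack frames, "Caused by:") inherit
        # the decision made for the entry they belong to.
        if not skipping:
            yield line
```

It could be applied to a saved log with something like `for line in filter_entries(open("logstash.log")): print(line, end="")`, where the file name is a placeholder.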

We also get these epoch errors. Is this error somehow caused by multiple threads in the Azure Event Hubs plugin?

[2021-03-09T20:37:50,554][ERROR][logstash.inputs.azure.processor][....-pipeline][ba2387301f3557a552d831c2e777dae8f69c245695332ed7c788bdc3c28769be] Event Hub: some-hub, Partition: 1 experienced an error com.microsoft.azure.eventhubs.ReceiverDisconnectedException: Receiver 'nil' with a higher epoch '79' already exists. Receiver 'nil' with epoch 78 cannot be created.
Make sure you are creating receiver with increasing epoch value to ensure connectivity, or ensure all old epoch receivers are closed or disconnected.
TrackingId:4eca56f7-0d3d-435f-a5e8-d37d3f18ddc2_B43, SystemTracker:some-hub:eventhub:some-hub~32766, Timestamp:2021-03-09T20:37:50 Reference:961669b3-7b1b-4a3f-b90c-9f76e0fc1b54, TrackingId:625575eb-b666-4ac9-ac33-e292ec77a9f9_B43, SystemTracker:some-hub:eventhub:some-hub~32766|$default, Timestamp:2021-03-09T20:37:50 TrackingId:bf8ef3cf525b45d98af89b561e240988_G25, SystemTracker:gateway5, Timestamp:2021-03-09T20:37:50, errorContext[NS: some-hub.servicebus.windows.net, PATH: some-hub/ConsumerGroups/$Default/Partitions/1, REFERENCE_ID: ab89da_988_G25_1615322270154, PREFETCH_COUNT: 300, LINK_CREDIT: 0, PREFETCH_Q_LEN: 0])
robbavey commented 3 years ago

@maris-jurgenbergs I'd like to dive a little deeper into this. Can you give us some more information:

maris-jurgenbergs commented 3 years ago

@robbavey Regarding regularity: I checked the logs, and some errors no longer appear while others still occur regularly (the logs cover 29th March ~20:00 to 30th March ~07:12).

The "throwing away zombie pump" error was not found anymore, so it does not seem to be a regular error.

The transient exceptions error was not found anymore, so it does not seem to be a regular error.

The "Failure updating checkpoint" warning is regular:

Line 61: 2021-03-29T20:04:19.474444995Z [2021-03-29T20:04:19,473][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 209: 2021-03-30T00:26:10.226013250Z [2021-03-30T00:26:10,225][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 357: 2021-03-30T00:26:13.250360156Z [2021-03-30T00:26:13,249][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 505: 2021-03-30T00:26:16.254747600Z [2021-03-30T00:26:16,253][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 653: 2021-03-30T03:00:21.106969931Z [2021-03-30T03:00:21,106][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 801: 2021-03-30T03:00:24.145414175Z [2021-03-30T03:00:24,144][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 949: 2021-03-30T03:00:27.150838231Z [2021-03-30T03:00:27,149][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 1097: 2021-03-30T03:10:52.876518821Z [2021-03-30T03:10:52,875][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 1393: 2021-03-30T03:10:55.921582021Z [2021-03-30T03:10:55,920][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 1541: 2021-03-30T03:10:58.926171658Z [2021-03-30T03:10:58,925][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 1689: 2021-03-30T03:31:20.932470064Z [2021-03-30T03:31:20,930][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 1837: 2021-03-30T03:31:23.957576919Z [2021-03-30T03:31:23,956][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 1985: 2021-03-30T03:31:26.961625942Z [2021-03-30T03:31:26,960][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 2133: 2021-03-30T04:01:20.913197990Z [2021-03-30T04:01:20,912][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 2281: 2021-03-30T04:01:23.938102725Z [2021-03-30T04:01:23,936][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 2429: 2021-03-30T04:01:26.941896427Z [2021-03-30T04:01:26,941][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 2577: 2021-03-30T04:17:50.999614850Z [2021-03-30T04:17:50,984][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 2725: 2021-03-30T04:17:54.045863465Z [2021-03-30T04:17:54,045][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 3021: 2021-03-30T04:17:57.051245297Z [2021-03-30T04:17:57,050][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 3169: 2021-03-30T04:45:13.444860276Z [2021-03-30T04:45:13,444][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 3317: 2021-03-30T04:45:16.509612584Z [2021-03-30T04:45:16,508][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 3465: 2021-03-30T04:45:19.534149947Z [2021-03-30T04:45:19,533][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint
Line 3613: 2021-03-30T06:54:13.524280222Z [2021-03-30T06:54:13,523][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 3761: 2021-03-30T06:54:16.568499221Z [2021-03-30T06:54:16,567][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 0: Failure updating checkpoint
Line 3909: 2021-03-30T06:54:19.572635984Z [2021-03-30T06:54:19,571][WARN ][com.microsoft.azure.eventprocessorhost.AzureStorageCheckpointLeaseManager][some-pipeline][be5beb4e66e8dfc17d1aa030b49d1865b73ad30bc4db671ddfd5119aa2125959] host logstash-some-guid: 1: Failure updating checkpoint

The update error is caused by StorageException timeouts: Caused by: com.microsoft.azure.eventprocessorhost.ExceptionWithAction: com.microsoft.azure.storage.StorageException: The client could not finish the operation within specified maximum execution timeout.
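
In the listing above the warnings are not evenly spread: there is one single entry, followed by eight groups of three entries that are roughly three seconds apart. A small script can make that pattern visible in a longer log. This is only a sketch under stated assumptions: the function name `group_bursts` and the 10-second `max_gap` threshold are arbitrary choices for illustration, and the script relies on the bracketed `[YYYY-MM-DDTHH:MM:SS,mmm]` timestamp format seen in these entries.

```python
import re
from datetime import datetime, timedelta

# Extracts the bracketed Logstash timestamp, e.g. [2021-03-30T00:26:10,225].
# The unbracketed container timestamp earlier in the line is ignored.
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),\d{3}\]")


def group_bursts(lines, max_gap=timedelta(seconds=10)):
    """Group log entries into bursts: runs of consecutive entries
    whose timestamps are at most max_gap apart."""
    bursts = []
    previous = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # stack-trace or other continuation line
        current = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if previous is None or current - previous > max_gap:
            bursts.append([])  # gap too large: start a new burst
        bursts[-1].append(current)
        previous = current
    return bursts
```

Printing `burst[0]` and `len(burst)` for each returned burst gives the start time and size of every group, which makes it easier to compare the failures against storage-side metrics for the same moments.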

The "higher epoch" error was not found anymore, so it does not seem to be a regular error.

We are using Elastic Cloud, and this is the Kubernetes YAML with the Logstash config and other parts:

replicas: 1

logstashConfig:
  logstash.yml: |
    node.name: somenodename
    http.host: 0.0.0.0
    http.port: 9600
    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.hosts: ['https://somehost.westeurope.azure.elastic-cloud.com:someport/']
    xpack.monitoring.elasticsearch.username: '${USR}'
    xpack.monitoring.elasticsearch.password: '${PW}'
    ###xpack.monitoring.elasticsearch.ssl.certificate_authority: /usr/some.crt

  pipelines.yml: |

    - pipeline.id: some-pipeline-id
      path.config: "/somepath/config.conf"
      pipeline.workers: 2
      pipeline.batch.size: 300

# Allows you to add any pipeline files in /usr/share/logstash/pipeline/
### ***warn*** there is a hardcoded logstash.conf in the image, override it first
logstashPipeline:

  some-pipeline.conf: |
    input {
      azure_event_hubs {
        event_hub_connections => ["Endpoint=sb://someeventhub.servicebus.windows.net/;SharedAccessKeyName=Listen;SharedAccessKey=${KEY};EntityPath=hub-name"]
        threads => 2
        storage_connection => "DefaultEndpointsProtocol=https;AccountName=somestorage;AccountKey=${SOMEKEY};EndpointSuffix=core.windows.net"
        checkpoint_interval => 60
        max_batch_size => 300
      }
    }
    filter {
      json {
        source => "message"
      }
      date {
        match => ["[header][timestamp]", "ISO8601"]
        remove_field => ["[header][timestamp]"]
      }
      if [header][pri][severity] == 7 {
        mutate {add_field => {"[@metadata][es_suffix]" => "-debug"}}
      }
      else {
        mutate {add_field => {"[@metadata][es_suffix]" => ""}}
      }
      if [header][pri][severity] == 8 {
        drop {}
      }
      mutate {
        remove_field => [ "message" ]
      }
    }
    output {
      elasticsearch {
        hosts => 'https://somehost.westeurope.azure.elastic-cloud.com:someport/'
        ssl => true
        user => '${USR}'
        password => '${PW}'
        index => 'indexname'
      }
    }

image: "someimagename"
imageTag: "7.10.0"
imagePullPolicy: "IfNotPresent"
imagePullSecrets: []

logstashJavaOpts: "-Xmx1g -Xms1g"

resources:
  requests:
    cpu: "100m"
    memory: "1536Mi"
  limits:
    cpu: "1000m"
    memory: "1536Mi"

volumeClaimTemplate:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 1Gi

rbac:
  create: false
  serviceAccountAnnotations: {}
  serviceAccountName: ""

podSecurityPolicy:
  create: false
  name: ""
  spec:
    privileged: true
    fsGroup:
      rule: RunAsAny
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
      - secret
      - configMap
      - persistentVolumeClaim

persistence:
  enabled: false
  annotations: {}

extraVolumes: ""
  # - name: extras
  #   emptyDir: {}

extraVolumeMounts: ""
  # - name: extras
  #   mountPath: /usr/share/extras
  #   readOnly: true

extraContainers: ""
  # - name: do-something
  #   image: busybox
  #   command: ['do', 'something']

extraInitContainers: ""
  # - name: do-something
  #   image: busybox
  #   command: ['do', 'something']

# This is the PriorityClass settings as defined in
# https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""

# By default this will make sure two pods don't end up on the same node
# Changing this to a region would allow you to spread pods across regions
antiAffinityTopologyKey: "kubernetes.io/hostname"

# Hard means that by default pods will only be scheduled if there are enough nodes for them
# and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: "hard"

# This is the node affinity settings as defined in
# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}

# The default is to deploy all pods serially. By setting this to parallel all pods are started at
# the same time when bootstrapping the cluster
podManagementPolicy: "Parallel"

httpPort: 9600

# Custom ports to add to logstash
extraPorts: []
  # - name: beats
  #   containerPort: 5001

updateStrategy: RollingUpdate

# This is the max unavailable setting for the pod disruption budget
# The default value of 1 will make sure that kubernetes won't allow more than 1
# of your pods to be unavailable during maintenance
maxUnavailable: 1

podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000

securityContext:
  capabilities:
    drop:
    - ALL
  # readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000

# How long to wait for logstash to stop gracefully
terminationGracePeriod: 120

# Probes
# Default probes are using `httpGet` which requires that `http.host: 0.0.0.0` is part of
# `logstash.yml`. If needed probes can be disabled or overridden using the following syntaxes:
#
# disable livenessProbe
# livenessProbe: null
#
# replace httpGet default readinessProbe by some exec probe
# readinessProbe:
#   httpGet: null
#   exec:
#     command:
#       - curl
#       - localhost:9600

livenessProbe:
  httpGet:
    path: /
    port: http
  initialDelaySeconds: 300
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
  successThreshold: 1

readinessProbe:
  httpGet:
    path: /
    port: http
  initialDelaySeconds: 60
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
  successThreshold: 3

## Use an alternate scheduler.
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
schedulerName: ""

nodeSelector: {}
tolerations: []

nameOverride: ""
fullnameOverride: ""

lifecycle: {}
  # preStop:
  #   exec:
  #     command: ["/bin/sh", "-c", "echo '10.162.34.205' >> /etc/hosts"]
  # postStart:
  #  exec:
  #     command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]

service:
  annotations: {}
  type: ClusterIP
  ports:
    - name: logstash-logstash
      port: 9600
      protocol: TCP
      targetPort: 9600
#    - name: http
#      port: 8080
#      protocol: TCP
#      targetPort: 8080

ingress:
  enabled: true
  annotations: 
    kubernetes.io/ingress.class: internal-nginx
  hosts:
    - host: somehost
      paths:
        - path: /
          servicePort: 9600
        #- path: /logs
        #  servicePort: 8080
  tls:
  - hosts: 
    - somehost

One instance of Logstash is running.

qaiserali commented 2 years ago

Hi,

We are also getting the same warnings with our Logstash Event Hubs setup. Any idea how to fix this?

sujithms commented 2 years ago

Hi,

We are noticing frequent link detach errors and connection inactivity timeouts in the Logstash logs. As a result, Logstash fails to read from specific partitions in Event Hub. Can you please suggest a workaround or fix for this issue?

Error Logs

  1. errorDescription[The connection was inactive for more than the allowed 60000 milliseconds and is closed by container 'LinkTracker'. TrackingId:c1bd39c330e64cc5a2489bdc72efcc6d_G5, SystemTracker:gateway7, Timestamp:2022-06-28T23:53:57]

  2. [2022-06-28T23:50:17,290][WARN ][com.microsoft.azure.eventhubs.impl.MessageReceiver][pipeline][alfred-logger] clientId[PR_339be5_1656001447373_MF_45774c_1656001447210-InternalReceiver], receiverPath[alfred-logging/ConsumerGroups/logstash/Partitions/9], linkName[LN_ec3fc7_1656458802212_b68_G21], onError: com.microsoft.azure.eventhubs.EventHubException: com.microsoft.azure.eventhubs.impl.AmqpException: The link 'LN_ec3fc7_1656458802212_b68_G21' is force detached. Code: RenewToken. Details: Unauthorized access. 'Listen' claim(s) are required to perform this operation. Resource: 'sb://ehn-test-01.servicebus.windows.net/alfred-logging/consumergroups/logstash/partitions/9'.. TrackingId:ad27470d0fc34ebaaa4f8826c23d6b68_G21, SystemTracker:gateway7, Timestamp:2022-06-28T23:51:41