resmoio / kubernetes-event-exporter

Export Kubernetes events to multiple destinations with routing and filtering
Apache License 2.0

include update events for event object #168

Open Teng-Jiao-Chen opened 5 months ago

Teng-Jiao-Chen commented 5 months ago

Summary

LinkedIn identified an issue where the exporter overlooked certain events: those whose message is identical to a previous event. In these cases, Kubernetes does not create a new event object; it updates the existing one and increments its count. Because the exporter did not track these updates, those events were never forwarded.

An event with the same message as a previous one still means that something happened in the cluster, so it should be forwarded as well.
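To illustrate the behavior described above, here is a minimal, standard-library-only sketch. It is not the exporter's actual code: `Event`, `Watcher`, `ForwardUpdates`, and `Replay` are hypothetical names invented for this example, and the check that the count has grown before forwarding is an assumption of the sketch, not something this PR is confirmed to do.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a Kubernetes Event object.
type Event struct {
	Name    string
	Message string
	Count   int
}

// Watcher mimics an informer-style handler pair (add/update).
// ForwardUpdates toggles whether update notifications are forwarded.
type Watcher struct {
	ForwardUpdates bool
	Sent           []Event // events that reached the sink
}

// OnAdd is called when a new event object is created.
func (w *Watcher) OnAdd(e Event) {
	w.Sent = append(w.Sent, e)
}

// OnUpdate is called when an existing event object is modified,
// for example when a repeated message only bumps the count.
func (w *Watcher) OnUpdate(oldEvent, newEvent Event) {
	if !w.ForwardUpdates {
		return // behavior without the change: updates are dropped
	}
	// Assumption of this sketch: forward only when the count has grown,
	// so unrelated updates to the object do not produce duplicates.
	if newEvent.Count > oldEvent.Count {
		w.Sent = append(w.Sent, newEvent)
	}
}

// Replay simulates one create followed by one count-only update
// and returns how many events reached the sink.
func Replay(w *Watcher) int {
	first := Event{Name: "techen-cmr46", Message: "TJ is testing", Count: 1}
	second := first
	second.Count = 2

	w.OnAdd(first)
	w.OnUpdate(first, second)
	return len(w.Sent)
}

func main() {
	fmt.Println(Replay(&Watcher{ForwardUpdates: false})) // prints 1
	fmt.Println(Replay(&Watcher{ForwardUpdates: true}))  // prints 2
}
```

With updates ignored, the second occurrence never reaches the sink; with updates forwarded, both occurrences do, which matches the e2e results reported below.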

Testing Done

unit tests

```
$ go test -cover -mod=mod -v ./...
? github.com/resmoio/kubernetes-event-exporter [no test files]
=== RUN TestSimpleWriter
--- PASS: TestSimpleWriter (0.00s)
=== RUN TestCorrectnessManyTimes
--- PASS: TestCorrectnessManyTimes (0.13s)
=== RUN TestLargerThanBatchSize
--- PASS: TestLargerThanBatchSize (0.00s)
=== RUN TestSimpleInterval
--- PASS: TestSimpleInterval (0.06s)
=== RUN TestIntervalComplex
--- PASS: TestIntervalComplex (0.06s)
=== RUN TestIntervalComplexAfterFlush
--- PASS: TestIntervalComplexAfterFlush (0.06s)
=== RUN TestRetry
--- PASS: TestRetry (0.20s)
PASS
github.com/resmoio/kubernetes-event-exporter/pkg/batch coverage: 100.0% of statements
ok github.com/resmoio/kubernetes-event-exporter/pkg/batch 0.660s coverage: 100.0% of statements
? github.com/resmoio/kubernetes-event-exporter/pkg/metrics [no test files]
? github.com/resmoio/kubernetes-event-exporter/pkg/version [no test files]
=== RUN Test_ParseConfig
--- PASS: Test_ParseConfig (0.00s)
=== RUN TestValidate_IsCheckingMaxEventAgeSeconds_WhenNotSet
{"level":"info","time":"2024-03-11T19:37:55-07:00","message":"setting config.maxEventAgeSeconds=5 (default)"}
{"level":"warn","time":"2024-03-11T19:37:55-07:00","message":"metrics name prefix is empty, setting config.metricsNamePrefix='event_exporter_' is recommended"}
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenNotSet (0.00s)
=== RUN TestValidate_IsCheckingMaxEventAgeSeconds_WhenThrottledPeriodSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenThrottledPeriodSet (0.00s)
=== RUN TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsSet (0.00s)
=== RUN TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsAndThrottledPeriodSet
--- PASS: TestValidate_IsCheckingMaxEventAgeSeconds_WhenMaxEventAgeSecondsAndThrottledPeriodSet (0.00s)
=== RUN TestValidate_MetricsNamePrefix_WhenEmpty
--- PASS: TestValidate_MetricsNamePrefix_WhenEmpty (0.00s)
=== RUN TestValidate_MetricsNamePrefix_WhenValid
--- PASS: TestValidate_MetricsNamePrefix_WhenValid (0.00s)
=== RUN TestValidate_MetricsNamePrefix_WhenInvalid
--- PASS: TestValidate_MetricsNamePrefix_WhenInvalid (0.00s)
=== RUN TestSetDefaults
--- PASS: TestSetDefaults (0.00s)
=== RUN TestEngineNoRoutes
--- PASS: TestEngineNoRoutes (0.00s)
=== RUN TestEngineSimple
--- PASS: TestEngineSimple (0.00s)
=== RUN TestEngineDropSimple
--- PASS: TestEngineDropSimple (0.00s)
=== RUN TestEmptyRoute
--- PASS: TestEmptyRoute (0.00s)
=== RUN TestBasicRoute
--- PASS: TestBasicRoute (0.00s)
=== RUN TestDropRule
--- PASS: TestDropRule (0.00s)
=== RUN TestSingleLevelMultipleMatchRoute
--- PASS: TestSingleLevelMultipleMatchRoute (0.00s)
=== RUN TestSubRoute
--- PASS: TestSubRoute (0.00s)
=== RUN TestSubSubRoute
--- PASS: TestSubSubRoute (0.00s)
=== RUN TestSubSubRouteWithDrop
--- PASS: TestSubSubRouteWithDrop (0.00s)
=== RUN Test_GHIssue51
--- PASS: Test_GHIssue51 (0.00s)
=== RUN TestEmptyRule
--- PASS: TestEmptyRule (0.00s)
=== RUN TestBasicRule
--- PASS: TestBasicRule (0.00s)
=== RUN TestBasicNoMatchRule
--- PASS: TestBasicNoMatchRule (0.00s)
=== RUN TestBasicRegexRule
--- PASS: TestBasicRegexRule (0.00s)
=== RUN TestLabelRegexRule
--- PASS: TestLabelRegexRule (0.00s)
=== RUN TestOneLabelMatchesRule
--- PASS: TestOneLabelMatchesRule (0.00s)
=== RUN TestOneLabelDoesNotMatchRule
--- PASS: TestOneLabelDoesNotMatchRule (0.00s)
=== RUN TestTwoLabelMatchesRule
--- PASS: TestTwoLabelMatchesRule (0.00s)
=== RUN TestTwoLabelRequiredRule
--- PASS: TestTwoLabelRequiredRule (0.00s)
=== RUN TestTwoLabelRequiredOneMissingRule
--- PASS: TestTwoLabelRequiredOneMissingRule (0.00s)
=== RUN TestOneAnnotationMatchesRule
--- PASS: TestOneAnnotationMatchesRule (0.00s)
=== RUN TestOneAnnotationDoesNotMatchRule
--- PASS: TestOneAnnotationDoesNotMatchRule (0.00s)
=== RUN TestTwoAnnotationsMatchesRule
--- PASS: TestTwoAnnotationsMatchesRule (0.00s)
=== RUN TestTwoAnnotationsRequiredOneMissingRule
--- PASS: TestTwoAnnotationsRequiredOneMissingRule (0.00s)
=== RUN TestComplexRuleNoMatch
--- PASS: TestComplexRuleNoMatch (0.00s)
=== RUN TestComplexRuleMatches
--- PASS: TestComplexRuleMatches (0.00s)
=== RUN TestComplexRuleAnnotationsNoMatch
--- PASS: TestComplexRuleAnnotationsNoMatch (0.00s)
=== RUN TestComplexRuleMatchesRegexp
--- PASS: TestComplexRuleMatchesRegexp (0.00s)
=== RUN TestComplexRuleNoMatchRegexp
--- PASS: TestComplexRuleNoMatchRegexp (0.00s)
=== RUN TestMessageRegexp
--- PASS: TestMessageRegexp (0.00s)
=== RUN TestCount
--- PASS: TestCount (0.00s)
PASS
github.com/resmoio/kubernetes-event-exporter/pkg/exporter coverage: 68.9% of statements
ok github.com/resmoio/kubernetes-event-exporter/pkg/exporter 1.060s coverage: 68.9% of statements
=== RUN TestEnhancedEvent_DeDot
=== RUN TestEnhancedEvent_DeDot/nothing
=== RUN TestEnhancedEvent_DeDot/dedot
--- PASS: TestEnhancedEvent_DeDot (0.00s)
    --- PASS: TestEnhancedEvent_DeDot/nothing (0.00s)
    --- PASS: TestEnhancedEvent_DeDot/dedot (0.00s)
=== RUN TestEnhancedEvent_DeDot_MustNotAlternateOriginal
--- PASS: TestEnhancedEvent_DeDot_MustNotAlternateOriginal (0.00s)
=== RUN TestEventWatcher_EventAge_whenEventCreatedBeforeStartup
--- PASS: TestEventWatcher_EventAge_whenEventCreatedBeforeStartup (0.00s)
=== RUN TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndBeforeMaxAge
--- PASS: TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndBeforeMaxAge (0.00s)
=== RUN TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndAfterMaxAge
--- PASS: TestEventWatcher_EventAge_whenEventCreatedAfterStartupAndAfterMaxAge (0.00s)
=== RUN TestOnEvent_WithObjectMetadata
--- PASS: TestOnEvent_WithObjectMetadata (0.00s)
=== RUN TestOnEvent_DeletedObjects
--- PASS: TestOnEvent_DeletedObjects (0.00s)
PASS
github.com/resmoio/kubernetes-event-exporter/pkg/kube coverage: 29.3% of statements
ok github.com/resmoio/kubernetes-event-exporter/pkg/kube 0.416s coverage: 29.3% of statements
=== RUN Test_ParseConfigFromBytes_ExampleConfigIsCorrect
--- PASS: Test_ParseConfigFromBytes_ExampleConfigIsCorrect (0.01s)
=== RUN Test_ParseConfigFromBytes_NoErrors
--- PASS: Test_ParseConfigFromBytes_NoErrors (0.00s)
=== RUN Test_ParseConfigFromBytes_ErrorWhenCurlyBracesNotEscaped
--- PASS: Test_ParseConfigFromBytes_ErrorWhenCurlyBracesNotEscaped (0.00s)
=== RUN Test_ParseConfigFromBytes_OkWhenCurlyBracesEscaped
--- PASS: Test_ParseConfigFromBytes_OkWhenCurlyBracesEscaped (0.00s)
=== RUN Test_ParseConfigFromBytes_ErrorErrorNotWithCurlyBraces
--- PASS: Test_ParseConfigFromBytes_ErrorErrorNotWithCurlyBraces (0.00s)
PASS
github.com/resmoio/kubernetes-event-exporter/pkg/setup coverage: 100.0% of statements
ok github.com/resmoio/kubernetes-event-exporter/pkg/setup 2.045s coverage: 100.0% of statements
=== RUN TestOpsCenterSink_Send
=== RUN TestOpsCenterSink_Send/Simple_Create
=== RUN TestOpsCenterSink_Send/Invalid_Priority:_Want_err
--- PASS: TestOpsCenterSink_Send (0.00s)
    --- PASS: TestOpsCenterSink_Send/Simple_Create (0.00s)
    --- PASS: TestOpsCenterSink_Send/Invalid_Priority:_Want_err (0.00s)
=== RUN TestTeams_Send
--- PASS: TestTeams_Send (0.00s)
=== RUN TestTeams_Send_WhenTeamsReturnsRateLimited
--- PASS: TestTeams_Send_WhenTeamsReturnsRateLimited (0.00s)
=== RUN TestLayoutConvert
--- PASS: TestLayoutConvert (0.00s)
PASS
github.com/resmoio/kubernetes-event-exporter/pkg/sinks coverage: 13.4% of statements
ok github.com/resmoio/kubernetes-event-exporter/pkg/sinks 1.448s coverage: 13.4% of statements
```

e2e tests with the change

```
### create an event
$ cat /tmp/event4
apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-03-12T05:19:59Z"
involvedObject:
  apiVersion: apps.linkedin.com/v1beta1
  kind: LiDeployment
  name: sample-java-mp-techen-2023-training-run
  namespace: techen-testing
  resourceVersion: "182587812"
  uid: 3aae86fe-768c-4506-9704-eba0bcf38593
kind: Event
lastTimestamp: "2024-03-12T05:19:59Z"
message: 'TJ is testing 2024-03-11 techen-testing ns'
metadata:
  creationTimestamp: "2024-03-12T05:19:59Z"
  name: techen-cmr46
  namespace: techen-testing
reason: CustomReason
reportingComponent: lideployment-controller
reportingInstance: ""
source:
  component: lideployment-controller
type: Normal
$ kubectl create -f /tmp/event4
event/techen-cmr46 created

### change the count to 2 and resend the event
$ kubectl replace -f /tmp/event4
event/techen-cmr46 replaced

### check that the event was sent twice
$ kubectl describe ld -n techen-testing
...
Events:
  Type    Reason        Age         From                     Message
  ----    ------        ----        ----                     -------
  Normal  CustomReason  (x2 over )  lideployment-controller  TJ is testing 2024-03-11 techen-testing ns

### the message is received twice from the events pod
{"level":"debug","msg":"TJ is testing 2024-03-11 techen-testing ns",... "message":"Received event"}
{"level":"debug","sink":"dump","event":"TJ is testing 2024-03-11 techen-testing ns","time":"2024-03-12T05:17:49Z","message":"sending event to sink"}
{"level":"debug","msg":"TJ is testing 2024-03-11 techen-testing ns",...,"message":"Received event"}
{"level":"debug","sink":"dump","event":"TJ is testing 2024-03-11 techen-testing ns","time":"2024-03-12T05:18:24Z","message":"sending event to sink"}
```

reverse e2e tests, i.e. without the change

```
### create the same event once; the log is found
$ kubectl tail -n kube-logging -l app=kubernetes-event-exporter | grep 'TJ is testing'
{"level":"debug","msg":"TJ is testing 2024-03-11 techen-testing ns","namespace":"techen-testing","reason":"CustomReason","involvedObject":"sample-java-mp-techen-2023-training-run","time":"2024-03-12T17:21:34Z","message":"Received event"}
{"level":"debug","sink":"dump","event":"TJ is testing 2024-03-11 techen-testing ns","time":"2024-03-12T17:21:34Z","message":"sending event to sink"}

### update the same event with count incremented by one
no event captured from event exporter...
```

Teng-Jiao-Chen commented 5 months ago

@mustafaakin Can I get a review for this :-)

Teng-Jiao-Chen commented 5 months ago

@mustafaakin friendly ping :-)

nhippe-ds commented 3 months ago

> @mustafaakin Can I get a review for this :-)

I see the LinkedIn fork has this merged. Any chance this repo can also be updated? @Teng-Jiao-Chen