elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats
Other
111 stars 4.93k forks source link

ManagerV2 panic on stopping output unit and TestManagerV2 bad assertion #41796

Open carsonip opened 4 days ago

carsonip commented 4 days ago

Apparently TestManagerV2 assertions are not done as expected. allStopped is called at the very beginning of the test instead of at last. When attempting to fix it,

diff --git a/x-pack/libbeat/management/managerV2_test.go b/x-pack/libbeat/management/managerV2_test.go
index f1b32b15d8..767be99b97 100644
--- a/x-pack/libbeat/management/managerV2_test.go
+++ b/x-pack/libbeat/management/managerV2_test.go
@@ -70,7 +70,7 @@ func TestManagerV2(t *testing.T) {
                        if oCfg == nil && len(iCfgs) == 0 && apmCfg == nil {
                                configsCleared.Store(true)
                        }
-                       if len(observed.Units) == 0 {
+                       if len(observed.Units) == 0 && currentIdx == 3 {
                                allStopped.Store(true)
                                t.Log("output and inputs configuration cleared (stopping)")
                        }

It generates a panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x4 pc=0xc37369]

goroutine 16 [running]:
github.com/elastic/beats/v7/x-pack/libbeat/management.(*agentUnit).UpdateState(0x0, 0x3, {0xe26514, 0x7}, 0x0)
    /home/carson/projects/beats/x-pack/libbeat/management/unit.go:252 +0x69
github.com/elastic/beats/v7/x-pack/libbeat/management.(*BeatV2Manager).reload(0xc0002f4240, 0xc000051e58)
    /home/carson/projects/beats/x-pack/libbeat/management/managerV2.go:672 +0x9f1
github.com/elastic/beats/v7/x-pack/libbeat/management.(*BeatV2Manager).unitListen(0xc0002f4240)
    /home/carson/projects/beats/x-pack/libbeat/management/managerV2.go:548 +0x7fc
created by github.com/elastic/beats/v7/x-pack/libbeat/management.(*BeatV2Manager).Start in goroutine 7
    /home/carson/projects/beats/x-pack/libbeat/management/managerV2.go:296 +0x1e5

This is due to a missing outputUnit nil check in https://github.com/elastic/beats/blob/367d94e1dc40a82713d651735eac7921bb584474/x-pack/libbeat/management/managerV2.go#L672 . outputUnit is nil because the unit is stopping. It is unclear whether this panic is possible in real scenarios, especially in cases where output can be hot reloaded, e.g. apm-server.

There are 2 parts to this issue:

elasticmachine commented 4 days ago

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)