elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

[Flaky Test]: `TestContainerCMDWithAVeryLongStatePath` - Warning: failed to walk path /usr/share/elastic-agent: lstat /usr/share/elastic-agent: no such file or directory #5328

Open pierrehilbert opened 3 weeks ago

pierrehilbert commented 3 weeks ago

Failing test case

TestContainerCMDWithAVeryLongStatePath

Error message

agent container initialisation - chown paths
Warning: failed to walk path /usr/share/elastic-agent: lstat /usr/share/elastic-agent: no such file or directory

Build

https://buildkite.com/elastic/elastic-agent-extended-testing/builds/1992#01916a91-37a7-42e3-9153-fca5895d28b9
https://buildkite.com/elastic/elastic-agent-extended-testing/builds/2018#01916d01-c849-4dc6-a145-fe5d243f3de2

OS

Linux

Stacktrace and notes

Error Trace:    /home/ubuntu/agent/testing/integration/container_cmd_test.go:261
Error:          Condition never satisfied
Test:           TestContainerCMDWithAVeryLongStatePath/no_path_set
Messages:       Elastic-Agent did not report healthy. Agent status error: "", Agent logs
                agent container initialisation - chown paths
                Warning: failed to walk path /usr/share/elastic-agent: lstat /usr/share/elastic-agent: no such file or directory

elasticmachine commented 3 weeks ago

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

ycombinator commented 3 weeks ago

Looks like this test first started failing after https://github.com/elastic/elastic-agent/pull/5263 was merged, which makes no sense, since that PR didn't touch any code or tests.

[EDIT] Thanks @belimawr for pointing out that this test failure could be due to a change in Beats, given that the logs in the test failure are Beats logs. Looking through the Beats commits from around the time the test first started failing to see if we can narrow things down...

ycombinator commented 3 weeks ago

Nothing jumps out in the logs or the Beats commits from around the time the test started failing. The test fails because Agent doesn't report as healthy, so it might be useful, as a first step, to get the test to print the output of `elastic-agent status --output full` right before the failing assertion.

ycombinator commented 3 weeks ago

> The test fails because Agent doesn't report as healthy so it might be useful, as a first step, to get the test to print out the output of elastic-agent status --output full right before the failing assertion.

Debugging PR: https://github.com/elastic/elastic-agent/pull/5340