Failing test: X-Pack API Integration Tests.x-pack/test/api_integration/apis/osquery/packs·ts - apis Osquery Endpoints Packs create route should return 200 and multi line query, but single line query in packs config

elastic / kibana

Failing test: X-Pack API Integration Tests.x-pack/test/api_integration/apis/osquery/packs·ts - apis Osquery Endpoints Packs create route should return 200 and multi line query, but single line query in packs config #133259

Open kibanamachine opened 2 years ago

kibanamachine commented 2 years ago

A test failed on a tracked branch

TypeError: Cannot read properties of undefined (reading 'id')
    at Context.<anonymous> (x-pack/test/api_integration/apis/osquery/packs.ts:104:39)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at Object.apply (node_modules/@kbn/test/target_node/functional_test_runner/lib/mocha/wrap_function.js:87:16)

First failure: CI Build - 8.3

elasticmachine commented 2 years ago

Pinging @elastic/security-asset-management (Team:Asset Management)

spalger commented 2 years ago

Looks like this test was added 13 days ago by @tomsonpl in https://github.com/elastic/kibana/pull/131224

kibanamachine commented 2 years ago

New failure: CI Build - main

tomsonpl commented 2 years ago

Ouch, I haven't seen the message from 3 weeks ago. I am on it, gonna check what is happening. Thanks for pinging me @spalger

tomsonpl commented 2 years ago

This is strange, tests are passing locally. Do you know any other way to test this than running node scripts/functional_tests --bail --config x-pack/test/api_integration/config.ts ? Thanks!

spalger commented 2 years ago

Yeah, the test isn't failing consistently, it's flaky, which likely means there is some sort of timing issue in the way the test works. This can be tricky to find without walking through the steps of the specific test and ensuring that there aren't race conditions step by step.

Based on the failure logs from buildkite it seems that sometimes /api/fleet/package_policies can respond without an item. This API might have actually responded with an error status code, but there isn't a status code assertion at https://github.com/elastic/kibana/blob/bf6cf59908d35b3cb98146a2a88fe1c1610110d6/x-pack/test/api_integration/apis/osquery/packs.ts#L84-L103.

What I would recommend is creating a PR which adds status code assertions to that and similar API calls here, throwing a .only() on the suite, and then running the api_integration config in the flaky test runner 100 or so times. This might help explain the error that's occurring and how to keep it from failing tests in the future.
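As a rough sketch of the kind of assertion described above (not the actual change to packs.ts): the idea is to check the status code before reading `body.item.id`, so a failed request reports its status and body instead of the opaque TypeError from the stack trace. The `ApiResponse` shape and the `expectItemId` helper are hypothetical names made up for this illustration; in the real test the same check would more likely be a chained `.expect(200)` on the supertest request.

```typescript
// Hypothetical, minimal shape of the response fields the test reads.
interface ApiResponse {
  status: number;
  body: { item?: { id: string }; message?: string };
}

// Hypothetical helper: fail fast with the status and body in the error
// message rather than letting `response.body.item.id` throw later.
function expectItemId(response: ApiResponse, label: string): string {
  if (response.status !== 200) {
    throw new Error(
      `${label} responded with ${response.status}: ${JSON.stringify(response.body)}`
    );
  }
  const item = response.body.item;
  if (item === undefined) {
    throw new Error(
      `${label} returned 200 without an item: ${JSON.stringify(response.body)}`
    );
  }
  return item.id;
}
```

With a helper like this, a flaky run would surface the status code and response body in the CI log, which is the debug information needed to tell an error response apart from a 200 with a missing item.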

tomsonpl commented 2 years ago

That's very helpful, thanks! I will take care of this tomorrow morning!

tomsonpl commented 2 years ago

Hey, this is still very hard to reproduce. Do you know if that FAIL happens often? Or was it just 2 times in the past month? https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/770 I tried this twice, and got good results all the time ;/

spalger commented 2 years ago

Looks like it's happening about 4 times a week across tracked branches and PRs, so it's definitely flaky enough that the next time it fails on a tracked branch it will be skipped.

spalger commented 2 years ago

Hunting for failures with the flaky test runner can be a problem with tests which aren't super flaky. I feel like you're probably better off putting in the status code assertions I was describing and all the debug info you can think of, then the next time it fails in a PR or on a tracked branch you can look into what happened. You can find the PRs which have this failure by checking out this dashboard: https://ops.kibana.dev/s/ci/app/dashboards#/view/8b7279b5-a72d-4a03-a480-fcc970f16305?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-7d%2Fd,to:now))

Just search for the name of the test to see the visualization I took a screenshot of and find out which jobs have failures you can inspect

tomsonpl commented 2 years ago

Alright, thank you @spalger :)

tomsonpl commented 2 years ago

Thank you @spalger, I merged the PR https://github.com/elastic/kibana/pull/134881 with an assertion and a console.log just in case. I will continue to investigate. Sorry if this caused you any inconvenience 👍

kibanamachine commented 2 years ago

New failure: CI Build - main

kibanamachine commented 2 years ago

New failure: CI Build - 8.4

kibanamachine commented 2 years ago

New failure: CI Build - main

kibanamachine commented 2 years ago

New failure: CI Build - main

mistic commented 2 years ago

Skipped.

main: 7cae4515534