elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.63k stars 8.22k forks source link

Failing test: X-Pack Security API Integration Tests (Session Concurrent Limit).x-pack/test/security_api_integration/tests/session_concurrent_limit/cleanup·ts - security APIs - Session Concurrent Limit Session Concurrent Limit cleanup should properly clean up sessions that exceeded concurrent session limit even for multiple providers #149091

Closed kibanamachine closed 1 month ago

kibanamachine commented 1 year ago

A test failed on a tracked branch

Error: expected 6 to equal 4
    at Assertion.assert (expect.js:100:11)
    at Assertion.apply (expect.js:227:8)
    at Assertion.be (expect.js:69:22)
    at Context.<anonymous> (cleanup.ts:214:54)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Object.apply (wrap_function.js:73:16)

First failure: CI Build - main

elasticmachine commented 1 year ago

Pinging @elastic/kibana-security (Team:Security)

kibanamachine commented 1 year ago

New failure: CI Build - main

mistic commented 1 year ago

Skipped.

main: 74d9321

azasypkin commented 1 year ago

Duplicate of https://github.com/elastic/kibana/issues/149090

kibanamachine commented 1 year ago

New failure: CI Build - 8.8

kibanamachine commented 1 year ago

New failure: CI Build - 8.8

jeramysoucy commented 1 year ago

Ran another flaky test runner just to be sure, but this looks tied to a series of CI failures on Friday.

kibanamachine commented 11 months ago

New failure: CI Build - main

kibanamachine commented 11 months ago

New failure: CI Build - main

kibanamachine commented 11 months ago

New failure: CI Build - main

kibanamachine commented 11 months ago

New failure: CI Build - main

kibanamachine commented 11 months ago

New failure: CI Build - main

delanni commented 11 months ago

/skip

kibanamachine commented 11 months ago

Skipped

main: c978ef954ea68bbd5ff0b24ee7707d72b28069ee

kibanamachine commented 10 months ago

New failure: CI Build - main

azasypkin commented 10 months ago
debg Log in as test_user and SAML user 3 times each with a 0.5s delay.
proc [kibana] [2023-12-26T07:07:55.459+00:00][INFO ][plugins.security.routes] Logging in with provider "basic1" (basic) {"service":{"node":{"roles":["background_tasks","ui"]}}}
proc [kibana] [2023-12-26T07:07:55.495+00:00][WARN ][plugins.security.session.index] Attempted to create a new session, but session index or alias was missing. {"service":{"node":{"roles":["background_tasks","ui"]}}}
info [o.e.c.m.MetadataCreateIndexService] [ftr] [.kibana_security_session_1] creating index, cause [api], templates [], shards [1]/[0]

The Attempted to create a new session, but session index or alias was missing log line is expected since we delete session index before every test. In such case Kibana re-creates index and then tries to write session document again. But looking at the next log lines:

debg Waiting for all sessions to be persisted...
debg Existing sessions: {"total":{"value":5,"relation":"eq"},"max_score":1,"hits":[{"_index":".kibana_security_session_1","_id":"Xga8j6T1jZIT/vuNU+vVb7fKrF4mHvHExCAbmjsciQI=","_score":1,"_source":{"provider":{"type":"saml","name":"saml1"},"idleTimeoutExpiration":1703578076289,"lifespanExpiration":1706166476289,"createdAt":1703574476289,"usernameHash":"b634643fc458a0aeacc625c15bc11e79dcad5ac011b2a1ba4f905e555acefb23","content":"xw8kZQVIhc4oOqvLUrXmT7q1c/Bz2xxRGnMRVIbZsqkDB8Cz1Tck2wBmjKONzNj3K+VRq2eWOMloar6dZIaK6na7op0Ohtv5dmXWP7uo2ouxpZ7jmGFXkD+yQiteYVttmge3rPReW2KWbubMa0SsfYuU5xD10Ku54/exxLaNC1IoliNrs3z+vfUAs07uxmWF7hVG+B6N3Hg1J7oz8+Xd83UyfVF+wgfeWm/MaE4DMm/Ptyemih+SLxWVWvwLXMwt1uS7zjrBNMNBBImxlVSnqW/1TIp9tXop1VcDJJ1DQRyy3MH0vsGUJBD+zYuxrrsJ5Zu5Hgpv1UlyRNhX5nfSzOa5VofPSBoyIAsYwOOLfQCEh4hy5O6IihPyUOfmTDAdvOpQ9Agxak55irtBRoCkwDO+QdKl1YM3lcPvOc2U/pmhspxVQmsZgqleN7DggO9APvftuM40qspenI1/80BE"}},{"_index":".kibana_security_session_1","_id":"aKFUIugOf9EvHKMRvCMrhb73RiqtTbvL50jpurho87U=","_score":1,"_source":{"provider":{"type":"basic","name":"basic1"},"idleTimeoutExpiration":1703578076826,"lifespanExpiration":1706166476826,"createdAt":1703574476826,"usernameHash":"e9ab99ee1daa1aa2b5cac38d446ac31f555b0a0bd0cd7a7335a3a2a065635e64","content":"hxBMUaCERA7MbiMqRUV4UPxwDxAHn1vCch9RoakjEtsJvv0OECfauZvWYVx4ROQY3nhGuOkZAVKm8wiCOen8zEGIDMVCtDG/4tNxI5Cr8Yim6Ep8w9GupCUSIjKml7l5lCQxVSM6rFu13q0MRk0fj5ZlQf+zVl4SL8icEjnD6zsvqi7JfQihME2O6FGBMtV67Whls8RRGC3vJiUKyVDgnEmTTbQ1bLywu/Ekw1xVwoqMc2oZO/atJpheIzIKouU/DZ7ppTO2Aca1pq99yEbf7c7Xe3XuUlc2IY6Kr3l9/1dhCTQBntyRIRvUlN/e8/MSWhgRudNy7AAEMwQDX2NR3+Q="}},{"_index":".kibana_security_session_1","_id":"1BbgcAGNoKBcQk7ac1F+9Wrm+q+zZ7RF6VI0TVQ4GJ8=","_score":1,"_source":{"provider":{"type":"saml","name":"saml1"},"idleTimeoutExpiration":1703578076930,"lifespanExpiration":1706166476930,"createdAt":1703574476930,"usernameHash":"b634643fc458a0aeacc625c15bc11e79dcad5ac011b2a1ba4f905e555acefb23","content":"tBZzVRMYYDX4V3p6yjEMURMwi1i36iF313e5RiiSLRJjLujjC/hv9wIpfsFwbAhyBp6cOlMorfrFJu/H1vbZ/tyYO77MbYqNFX3AHeIV+Vo52RJNtb4QoTsmhV7o6gG/uqMBu+piwg+Uz19Mw85lY3ddO1T4G7NrT/rw8HmzYMhm5VIYK+QhBn/c1fasnHAAzlbyUWq7hgWC0XPOk/GtGpKcdAEq12Y1017dXqStu/1+a7XiYlzPhPGEwn7IFhuzrX3CAK0KcZZb+faUomFTAQ84y3MTtyRwGgtIlr/d08gtzXMmyIH0Kl12yML41wJTv46eFzecpzyEuMeoqEPW5M3/tgDDpb4gnF8vULpzc+AA53wBUL1W1GBr60k/WrxXDHajsu1kMFwsXcaWQIAHo/kUkkI/4RpJDBYB0HU2XpOT68brbMVd2tcdkN/gmXGicsrk/3Rjr6T2TGaPSEBJ"}},{"_index":".kibana_security_session_1","_id":"kZHYAH2xZ+v2Z4r2U/rkmGN1gOOatPqsldmfnIvZhU0=","_score":1,"_source":{"provider":{"type":"basic","name":"basic1"},"idleTimeoutExpiration":1703578077465,"lifespanExpiration":1706166477465,"createdAt":1703574477465,"usernameHash":"e9ab99ee1daa1aa2b5cac38d446ac31f555b0a0bd0cd7a7335a3a2a065635e64","content":"N+WKrs6wtW6lti5/1TFJ14Mlda4swfNXAjPIq5+XANrnyphl0GavhnPI0olqmcW3+gskYA3DEHhPhju9kOX4Bgryky0DRIk4nfWSNhvc64vf78oWtutrrvIPp8xTAs3u+QNiok5upPPoPOqoUZ+FMYRSbpGYZMsGTo9Z/QSmPD8fIQnXpfy7Gnk0S1N+/yS/saaYozzsZeNj+RAjolS+XR6sFSzaI6JgsA+dTi095Iw+FzNRbfOl9Xdllh+P3fB9q0vZT0rq1Ky9uZYsQMriLVD8dLra/M5D9xjNGUqjtNE5wM4D+24sWkwF3sXakfPJXneaBOeXStUJ80ay/o++L4M="}},{"_index":".kibana_security_session_1","_id":"b8DspPMDuxmrqDB+yWpkFlsvu6pnA05b28jdK22dL7k=","_score":1,"_source":{"provider":{"type":"saml","name":"saml1"},"idleTimeoutExpiration":1703578077632,"lifespanExpiration":1706166477632,"createdAt":1703574477632,"usernameHash":"b634643fc458a0aeacc625c15bc11e79dcad5ac011b2a1ba4f905e555acefb23","content":"M8WgaUJpvc7fGzXo/v7MfwdG2foK3BBLd/9RMkBcLlAcEAXkLPMg73/LkYdFLjMng8TfsmGOC8BAd8tlRT9/QJfWscE6JpG3Bl/3WUrIsPA5TwrkJVxqvCSOt5WWJL+QBHZX9oaR8rjGbiV8O8rWE9R3cMqNoBGYOFGC++vYlh+Muzfdh4dj/wVE+VBQNyJ5q2ZbccdrC4BD00fan+rZ665aU4DD85CDw1quzLZCMWnJqoYSNL6bJfLEaiI1MqV6ouCAhuCGMpSc3tUqFXLP4BMcBibaqoEih8RajGFZqrIhIBhF0GbOwhCpsEF9Vdg5XO0IzvkuBLNIp506pnsmu7bQFszxEArsqZyPNdodFlMsfMNhVwNeAcBxWJX76wlllOM5BfcVrVam5cuo8tbA8+yCZckU+PNJ5sXNZiqZQ9TVWYffJKiM9zu99TA3c4eSzVqpNOAqdSuXfwYFvI2p"}}]}.
debg --- retry.tryForTime error: expected 5 to equal 6

we can see that the oldest session (basic) is still missing meaning that our second attempt to write the session right after we created index didn't work for some reason:

{
  "total": { "value": 5 },
  "hits": [
    {
      "_index": ".kibana_security_session_1",
      "_source": {
        "provider": { "type": "saml", "name": "saml1" },
        "createdAt": 1703574476289,
      }
    },
    {
      "_index": ".kibana_security_session_1",
      "_source": {
        "provider": { "type": "basic", "name": "basic1" },
        "createdAt": 1703574476826,
      }
    },
    {
      "_index": ".kibana_security_session_1",
      "_source": {
        "provider": { "type": "saml", "name": "saml1" },
        "createdAt": 1703574476930,
      }
    },
    {
      "_index": ".kibana_security_session_1",
      "_source": {
        "provider": { "type": "basic", "name": "basic1" },
        "createdAt": 1703574477465,
      }
    },
    {
      "_index": ".kibana_security_session_1",
      "_source": {
        "provider": { "type": "saml", "name": "saml1" },
        "createdAt": 1703574477632,
      }
    }
  ]
}

This is the code that does this https://github.com/elastic/kibana/blob/392d53c9ba6b47026a1c66c941cd31ba9c6a7426/x-pack/plugins/security/server/session_management/session_index.ts#L301-L309. For the second attempt we use ignore404: false and the logs don't show that this request failed, and from the Kibana perspective everything seems to be functioning as expected.

I'm going to add more log lines around this code to make sure we get statusCode === 200 when we try to write the session document for the second time right after we created index.

azasypkin commented 10 months ago

So, I've run these tests (with additional logging) via the Flaky Test Runner 590 times without any failures (x100 + x490). It doesn't seem like this failure is widespread enough to warrant adding more logging to the codebase, but if it continues, we'll address it. Closing the issue for now as not actionable.

kibanamachine commented 10 months ago

New failure: CI Build - 8.12

kibanamachine commented 10 months ago

New failure: CI Build - main

kibanamachine commented 10 months ago

New failure: CI Build - main

kibanamachine commented 9 months ago

New failure: CI Build - main

kibanamachine commented 9 months ago

New failure: CI Build - main

kibanamachine commented 9 months ago

New failure: CI Build - main

kibanamachine commented 9 months ago

New failure: CI Build - main

jeramysoucy commented 9 months ago

Closing to see what difference https://github.com/elastic/kibana/pull/175352 makes.

kibanamachine commented 8 months ago

New failure: CI Build - 8.13

kibanamachine commented 7 months ago

New failure: CI Build - 8.12

kibanamachine commented 7 months ago

New failure: CI Build - main

kibanamachine commented 7 months ago

New failure: CI Build - 8.13

kibanamachine commented 7 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: kibana-on-merge - 8.13

kibanamachine commented 6 months ago

New failure: kibana-on-merge - 8.13

kibanamachine commented 6 months ago

New failure: kibana-elasticsearch-snapshot-verify - 8.13

kibanamachine commented 5 months ago

New failure: kibana-on-merge - main

legrego commented 5 months ago

latest failure:

└-> should properly clean up sessions that exceeded concurrent session limit even for multiple providers
--
  | └-> "before each" hook: global before each for "should properly clean up sessions that exceeded concurrent session limit even for multiple providers"
  | └-> "before each" hook for "should properly clean up sessions that exceeded concurrent session limit even for multiple providers"
  | └- ✖ fail: security APIs - Session Concurrent Limit Session Concurrent Limit cleanup should properly clean up sessions that exceeded concurrent session limit even for multiple providers
  | │      Error: retry.tryForTime reached timeout 20000 ms
  | │ Error: expected 5 to equal 6
  | │     at Assertion.assert (expect.js:100:11)
  | │     at Assertion.apply (expect.js:227:8)
  | │     at Assertion.be (expect.js:69:22)
  | │     at cleanup.ts:235:56
  | │     at processTicksAndRejections (node:internal/process/task_queues:95:5)
  | │     at runAttempt (retry_for_success.ts:29:15)
  | │     at retryForSuccess (retry_for_success.ts:98:21)
  | │     at RetryService.tryForTime (retry.ts:37:12)
  | │     at Context.<anonymous> (cleanup.ts:234:7)
  | │     at Object.apply (wrap_function.js:73:16)
  | │       at onFailure (retry_for_success.ts:17:9)
  | │       at retryForSuccess (retry_for_success.ts:84:7)
  | │       at RetryService.tryForTime (retry.ts:37:12)
  | │       at Context.<anonymous> (cleanup.ts:234:7)
  | │       at Object.apply (wrap_function.js:73:16)
kibanamachine commented 5 months ago

New failure: kibana-on-merge - main

kibanamachine commented 4 months ago

New failure: kibana-on-merge - main

elena-shostak commented 3 months ago

Came up on interesting thing, the cleanupInterval is set to 5h in FTR config:

--xpack.security.session.cleanupInterval=5h

The log shows us the we invoke the cleanup task in the very first test and get 500.

[00:00:03]         │ debg Existing sessions: {"total":{"value":3,"relation":"eq"},"max_score":1,"hits":[{"_index":".kibana_security_session_1","_id":"gyCzdu4JDUUi03Cd2G+daZuJQKTEjvCidLFgsm6sgA4=","_score":1,"_source":{"provider":{"type":"basic","name":"basic1"},"idleTimeoutExpiration":1720528171722,"lifespanExpiration":1723116571722,"createdAt":1720524571722,"usernameHash":"e9ab99ee1daa1aa2b5cac38d446ac31f555b0a0bd0cd7a7335a3a2a065635e64","content":"M1k1w9v840N6u1EZkfwVVjzwLk9udSESi9jwRJ1znBPz0EO4RAliKE53gVyL99Zr1HpUAIIstiRtTbPIF0cVjoHmp/Z+kHeGxNAB2FbFecBjo7zPpxL6VnOr/7E3qNevj/8/sBObC4rW7l3kkDFV7AgSqWY9S64JazbqmpnH8mG18wE36L/hH4P0gTBmYCOTK35u83ajRoGFrr1Mbtkm8GEowxCKuRgkOAie2CO6btZ0KUGVHv6CiF4P9WKZnGIf2fyOS7YG7rFx98oWAs4OZhRm+/1CDU+4Q5x4L8e4Z3/fvSiah+15+1rHUzG3PNE9RQBtaGfFpxr7mo8KNCDTzvg="}},{"_index":".kibana_security_session_1","_id":"ED/Teu0G+MXg5zFFw63f4IGhJYNDQiLTku9RFKU73ZQ=","_score":1,"_source":{"provider":{"type":"basic","name":"basic1"},"idleTimeoutExpiration":1720528172475,"lifespanExpiration":1723116572475,"createdAt":1720524572475,"usernameHash":"e9ab99ee1daa1aa2b5cac38d446ac31f555b0a0bd0cd7a7335a3a2a065635e64","content":"SVDq4yk0ZZV76/00timLNRKDOnVQrZLYlsPxBX+gbHR8F/36smopXI+tS4MXgqQL4fAmVmrScmA69r/O4y5B3bHNPY1yqTYx9Zomel6bfZ3dsjQScHCrwKOLVUob4+Hn92FvT289OUMncks1OwQg5S5bET06Nd4jYhd7C5kKUzsH+vHDJV7mgMmBqRuf8m1rNUUge3XO4ra19Uil8Ou6qDGJXJRPbYOZWHoq2ey+Nr6h5j8zUPfr/wo5PiTSWdNw1PJG/MXOQiBHRi4hbVTw2ZIFIlxi1ARh+/D2aV5KbziAmxeCHu9o/1g6YJrUAxGNI9Qtd9Po5SY53sGmZ4mBF6U="}},{"_index":".kibana_security_session_1","_id":"/zR3wzCyms2bRib56OUNOn6WSHMnKbM2VjNtyBZrpJw=","_score":1,"_source":{"provider":{"type":"basic","name":"basic1"},"idleTimeoutExpiration":1720528173016,"lifespanExpiration":1723116573016,"createdAt":1720524573016,"usernameHash":"e9ab99ee1daa1aa2b5cac38d446ac31f555b0a0bd0cd7a7335a3a2a065635e64","content":"txDKnicnMjQu/ZP42i+W9BT5sM81jTL8VeHXYLTYfyWiQV7fPB98IJKQeWHPfeEjweGwjvHOfU/zooZANLYv81IeY5X4JlOL3O51aPTdf6a5iaSwJTi2DN3ASiOhi7OcLqabMs8csbAbzP6w9aN6O1PmMekPmdiLt/aQlTNlLWJPMuJ4drdTU6FJ+MFc4aOa4X1MfFX/brBv+AQ/fcr5Fq4oB/7EJPSoj3b8hjUeYaZ1DTu6sB381HfpXkg3OfihRdioUnTGjwaSD20HwT9c+roI9GZhmqewijXDoW+xwLyfv5JYtkxzP+zlktdPbGBZI2NUmxChUUsrMNDAbyEQ7aM="}}]}.
[00:00:03]         │ proc [kibana] [2024-07-09T11:29:33.098+00:00][ERROR][http] 500 Server Error {"http":{"response":{"status_code":500},"request":{"method":"post","path":"/session/_run_cleanup"}},"error":{"message":"Failed to run task \"session_cleanup\" as it is currently running"},"service":{"node":{"roles":["background_tasks","ui"]}}}
[00:00:03]         │ debg --- retry.tryForTime error: expected 200 "OK", got 500 "Internal Server Error"

That means the cleanup job itself was already running, which shouldn't be the case, because we set interval to 5h before and it is the very first test in out test suite and the first time we invoke session/_run_cleanup.

So if the cleanup job is running on some different interval it might corrupt sessions from the following tests even before we invoke session/_run_cleanup which leads to flaky behaviour sometimes. (2 test suites running on the same node and overriding interval?)

cc @azasypkin Probably you can shed some light on it, I don't have that much context around FTR setup in general.

kibanamachine commented 2 months ago

New failure: kibana-elasticsearch-snapshot-verify - 8.15

azasypkin commented 2 months ago

New failure: kibana-elasticsearch-snapshot-verify - 8.15

Same reason as described in https://github.com/elastic/kibana/issues/149091#issuecomment-2283473524