elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.48k stars 8.04k forks source link

Failing test: X-Pack Security API Integration Tests (Audit Log).x-pack/test/security_api_integration/tests/audit/audit_log·ts - security APIs - Audit Log Audit Log logs audit events when reading and writing saved objects #119267

Open kibanamachine opened 2 years ago

kibanamachine commented 2 years ago

A test failed on a tracked branch

SyntaxError: Unexpected end of JSON input
    at JSON.parse (<anonymous>)
    at map (test/security_api_integration/tests/audit/audit_log.ts:25:36)
    at Array.map (<anonymous>)
    at FileWrapper.readJSON (test/security_api_integration/tests/audit/audit_log.ts:25:20)
    at Context.<anonymous> (test/security_api_integration/tests/audit/audit_log.ts:52:23)
    at Object.apply (/opt/local-ssd/buildkite/builds/kb-n2-4-d324bf799483ae76/elastic/kibana-hourly/kibana/node_modules/@kbn/test/target_node/functional_test_runner/lib/mocha/wrap_function.js:87:16)

First failure: CI Build - 8.0

elasticmachine commented 2 years ago

Pinging @elastic/kibana-security (Team:Security)

jportner commented 2 years ago

These tests have been around for over a year without any issues, this looks like a transient failure. I'll close it for now, if it reopens we can investigate it further.

kibanamachine commented 2 years ago

New failure: CI Build - 8.0

jportner commented 2 years ago

This is starting to look like a pattern but I'm not sure what's causing it. I opened #120354 to give us more details in these error messages

kibanamachine commented 2 years ago

New failure: CI Build - main

jportner commented 2 years ago

New more detailed error message:

└-> logs audit events when reading and writing saved objects
--
  | └-> "before each" hook: global before each for "logs audit events when reading and writing saved objects"
  | └-> "before each" hook for "logs audit events when reading and writing saved objects"
  | └- ✖ fail: security APIs - Audit Log Audit Log logs audit events when reading and writing saved objects
  | │      Error: Failed to parse audit log JSON, error: "Unexpected end of JSON input", audit.log contents:
  | │ {"event":{"action":"http_request","category":["web"],"outcome":"unknown"},"http":{"request":{"method":"get"}},"url":{"domain":"localhost","path":"/audit_log","port":6101,"query":"query=param","scheme":"http"},"user":{"name":"elastic","roles":["superuser"]},"kibana":{"space_id":"default"},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:37.896+00:00","message":"User is requesting [/audit_log] endpoint","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_create","category":["database"],"type":["creation"],"outcome":"unknown"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"c6fc21b0-5752-11ec-b57d-2d2a69d33f32"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:38.013+00:00","message":"User is creating dashboard [id=c6fc21b0-5752-11ec-b57d-2d2a69d33f32]","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"elastic_agent-f47f18cc-9c7d-4278-b2ea-a6dee816d395"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.163+00:00","message":"User has accessed dashboard [id=elastic_agent-f47f18cc-9c7d-4278-b2ea-a6dee816d395]","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"system-d401ef40-a7d5-11e9-a422-d144027429da"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.164+00:00","message":"User has accessed dashboard [id=system-d401ef40-a7d5-11e9-a422-d144027429da]","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"system-f49f3170-9ffc-11ea-87e4-49f31ec44891"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.164+00:00","message":"User has accessed dashboard [id=system-f49f3170-9ffc-11ea-87e4-49f31ec44891]","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"system-01c54730-fee6-11e9-8405-516218e3d268"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.164+00:00","message":"User has accessed dashboard [id=system-01c54730-fee6-11e9-8405-516218e3d268]","log":{"level":"INFO","logger":"plugins.security.audit.ecs"},"process":{"pid":12399},"transaction":{"id":"9965993cd18e923e"}}
  | │ {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"system-035846a0-a249-11e9-a422-d144027429da"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.165+00:00","message":"User has accessed dashboard [id=system-035846a0-a2
  | │       at FileWrapper.readJSON (test/security_api_integration/tests/audit/audit_log.ts:29:13)

The last event cuts off in the middle of the message field: {"event":{"action":"saved_object_find","category":["database"],"type":["access"],"outcome":"success"},"kibana":{"space_id":"default","saved_object":{"type":"dashboard","id":"system-035846a0-a249-11e9-a422-d144027429da"}},"user":{"name":"elastic","roles":["superuser"]},"trace":{"id":"b8cbaf03-5c1d-48b6-abd8-a0df0a5b3a85"},"ecs":{"version":"8.0.0"},"@timestamp":"2021-12-07T11:42:39.165+00:00","message":"User has accessed dashboard [id=system-035846a0-a2

jportner commented 2 years ago

From a discussion offline with some other Kibana folks:

Sounds kinda like timing issues. Buildkite introduced a bunch of new timing behaviors, perhaps some other rando commits have pushed the timing on this one over the edge. We have similar sort of tests with the eventLog for alerting - FTS is writing log entries (to an index), FTR is reading to validate the right docs got written. We have that in a retry loop, waiting a second in between, to handle the timing issues. Has worked well for us.

We already have retry logic (that waits for the file to be non-empty), I think adding a short delay between that and the read/parse step would probably alleviate our issues, even though it's not an ideal fix.

kibanamachine commented 2 years ago

New failure: CI Build - main

kibanamachine commented 2 years ago

New failure: CI Build - 8.0

mistic commented 2 years ago

This test keeps failing and looks flaky. Skipping for now.

main/8.1: 38aa34f 8.0: 97bda08

kibanamachine commented 1 year ago

New failure: CI Build - 8.6

jeramysoucy commented 1 year ago

Based on where this is failing in 8.6 (at checking the second SO operation, which is different from the first failure that triggered this issue), it almost seems like a race condition between when we detect the log file is not empty and when the file is complete. We retrieve the contents of the file as soon as we detect that the file is not empty and we even appear to make it through some of the assertions successfully.

The most ideal solution would be to determine when the file is complete, but this may not be programmatically achievable. An acceptable solution could be to add a static delay after detecting the file is not empty before reading the contents.

jeramysoucy commented 1 year ago

Closing for now, as I cannot reproduce the issue. If the same error occurs again I will add a static delay (described above), pending a better suggestion.

kibanamachine commented 1 year ago

New failure: CI Build - main

kibanamachine commented 1 year ago

New failure: CI Build - main

mistic commented 1 year ago

Skipped.

main: 22015b4

kibanamachine commented 1 year ago

New failure: CI Build - main

kibanamachine commented 1 year ago

New failure: CI Build - main

jeramysoucy commented 1 year ago

Ran another flaky test runner just to be sure, but this looks tied to a series of CI failures on friday.

kibanamachine commented 10 months ago

New failure: CI Build - main

jeramysoucy commented 8 months ago

Note: New failure is a different failure mode than the previous failures in May. May failures were part of a broader CI failure. New failure occurred after a series of index management operations:

[00:00:00] │ proc [kibana] [2023-08-22T12:01:06.131+00:00][INFO ][plugins.securitySolution.endpoint:metadata-check-transforms-task:0.0.1] no endpoint installation found [00:00:00] │ info [o.e.c.m.MetadataCreateIndexService] [ftr] [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] creating index, cause [initialize_data_stream], templates [.deprecation-indexing-template], shards [1]/[1] [00:00:00] │ info [o.e.c.m.MetadataCreateDataStreamService] [ftr] adding data stream [.logs-deprecation.elasticsearch-default] with write index [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001], backing indices [], and aliases [] [00:00:00] │ info [o.e.c.r.a.AllocationService] [ftr] updating number_of_replicas to [0] for indices [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] [00:00:00] │ info [o.e.i.m.MapperService] [ftr] [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] reloading search analyzers [00:00:00] │ info [o.e.x.i.IndexLifecycleTransition] [ftr] moving index [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] from [null] to [{"phase":"new","action":"complete","name":"complete"}] in policy [.deprecation-indexing-ilm-policy] [00:00:00] │ info [o.e.c.r.a.AllocationService] [ftr] current.health="GREEN" message="Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001][0]]])." previous.health="YELLOW" reason="shards started [[.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001][0]]" [00:00:00] │ info [o.e.x.i.IndexLifecycleTransition] [ftr] moving index [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] from [{"phase":"new","action":"complete","name":"complete"}] to [{"phase":"hot","action":"unfollow","name":"branch-check-unfollow-prerequisites"}] in policy [.deprecation-indexing-ilm-policy] [00:00:00] │ info [o.e.c.m.MetadataMappingService] [ftr] [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001/dhL5SvKRQIGvjlHjDP_IZQ] update_mapping [_doc] [00:00:00] │ info [o.e.x.i.IndexLifecycleTransition] [ftr] moving index [.ds-.logs-deprecation.elasticsearch-default-2023.08.22-000001] from [{"phase":"hot","action":"unfollow","name":"branch-check-unfollow-prerequisites"}] to [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}] in policy [.deprecation-indexing-ilm-policy]

I could not reproduce this failure locally or with a flakey test runner. Closing as a transient failure and will watch for reoccurrence.

kibanamachine commented 7 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: CI Build - 8.12

kibanamachine commented 6 months ago

New failure: CI Build - main

kibanamachine commented 6 months ago

New failure: CI Build - 7.17

kibanamachine commented 6 months ago

New failure: CI Build - 7.17

kibanamachine commented 6 months ago

New failure: CI Build - main

kibanamachine commented 5 months ago

New failure: CI Build - 8.12

kibanamachine commented 5 months ago

New failure: CI Build - 7.17

kibanamachine commented 5 months ago

New failure: CI Build - main

kibanamachine commented 4 months ago

New failure: CI Build - main

kibanamachine commented 4 months ago

New failure: CI Build - main

kibanamachine commented 4 months ago

New failure: CI Build - main

kibanamachine commented 4 months ago

New failure: CI Build - main

kibanamachine commented 4 months ago

New failure: CI Build - main

mistic commented 4 months ago

Skipped.

main: b2ac0ed

kibanamachine commented 3 months ago

New failure: CI Build - 8.13