
[Meta] Audit Logging #52125

Closed: jportner closed this issue 2 years ago

jportner commented 4 years ago

Overview

The current state of audit logging in Kibana is not sufficient for many users' needs. Kibana outputs only a few types of events, without much detail, in the same transport as regular log messages. This can be improved in many ways.

Enhancements in scope:

Current state vs. desired state

### Current state

Audit records in Kibana are displayed in plaintext like so:

```
log   [23:26:50.059] [info][audit][saved_objects_authorization_success][security] jdoe authorized to get config
log   [23:26:50.067] [info][audit][saved_objects_authorization_success][security] jdoe authorized to find index-pattern
```

If JSON output is enabled:

```json
{
  "type": "log",
  "@timestamp": "2020-02-18T14:58:44-05:00",
  "tags": ["info", "audit", "security", "saved_objects_authorization_success"],
  "pid": 38933,
  "username": "jojo",
  "action": "get",
  "types": ["config"],
  "args": { "type": "config", "id": "8.0.0", "options": {} },
  "eventType": "saved_objects_authorization_success",
  "message": "jojo authorized to get config"
}
{
  "type": "log",
  "@timestamp": "2020-02-18T14:58:44-05:00",
  "tags": ["info", "audit", "security", "saved_objects_authorization_success"],
  "pid": 38933,
  "username": "jojo",
  "action": "find",
  "types": ["index-pattern"],
  "args": {
    "options": {
      "perPage": 1,
      "page": 1,
      "type": ["index-pattern"],
      "search": "*",
      "defaultSearchOperator": "OR",
      "searchFields": ["title"],
      "fields": ["title"]
    }
  },
  "eventType": "saved_objects_authorization_success",
  "message": "jojo authorized to find index-pattern"
}
```

### Future state

Audit records should be written in a standard format ([ECS](https://www.elastic.co/guide/en/ecs/current/index.html)), should contain more information about the event that occurred and who originated the action, and fields should be configurable to include more or less information. Such an audit record would look something like this:

```json
{
  "@timestamp": "2019-12-05T00:00:02.000Z",
  "event": {
    "action": "get config",
    "category": "saved_objects_authorization",
    "duration": 453,
    "end": "2019-12-05T00:00:02.453Z",
    "module": "security",
    "outcome": "success",
    "start": "2019-12-05T00:00:02.000Z"
  },
  "host": {
    "id": "5b2de169-2785-441b-ae8c-186a1936b17d",
    "ip": "34.56.78.90",
    "hostname": "hostname"
  },
  "http": {
    "request": {
      "body": { "bytes": 887, "content": "Hello world" },
      "bytes": 1437,
      "method": "get",
      "referrer": "https://blog.example.com/"
    }
  },
  "labels": { "spaceId": "default" },
  "source": { "address": "12.34.56.78", "ip": "12.34.56.78" },
  "url": {
    "domain": "www.elastic.co",
    "full": "https://www.elastic.co:443/search?q=elasticsearch",
    "path": "/search",
    "port": "443",
    "query": "q=elasticsearch",
    "scheme": "https"
  },
  "user": {
    "email": "john.doe@company.com",
    "full_name": "John Doe",
    "hash": "D30A5F57532A603697CCBB51558FA02CCADD74A0C499FCF9D45B...",
    "sid": "2FBAF28F6427B1832F2924E4C22C66E85FE96AFBDC3541C659B67...",
    "name": "jdoe",
    "roles": ["kibana_user"]
  },
  "trace": { "id": "8a4f500d" }
}
```

Note: in the example above, the `user.hash` (a hash of the `user.name` field) would not be included by default; it would be an optional field that could be included if the `user.name` needed to be excluded for privacy reasons.

First Phase

Prerequisites (in progress):

Phase 1 implementation: #54836

Future Phase

elasticmachine commented 4 years ago

Pinging @elastic/kibana-security (Team:Security)

jportner commented 4 years ago

@arisonl FYI

joshdover commented 4 years ago

I see the output format is going to be in ECS which is great. Will we support ingesting this data into Elasticsearch and using it in the product for inspection by admins? We should be able to leverage Core's logging appenders to accomplish the ingestion piece.

jportner commented 4 years ago

> I see the output format is going to be in ECS which is great. Will we support ingesting this data into Elasticsearch and using it in the product for inspection by admins? We should be able to leverage Core's logging appenders to accomplish the ingestion piece.

My take on it is that the ingestion itself is out of scope for this feature. As long as we can output JSON to the file system (which we were intending to use Core's logging appenders to do), Filebeat can be used for ingestion. Is that what you meant? Or are the logging appenders going to support ingestion directly?

joshdover commented 4 years ago

Filebeat would definitely work. It'd be interesting if we could actually ship Filebeat with Kibana, configured to do this automatically. Of course, there's some complexity with that as well (process monitoring, licensing, etc.).

My broader question is about whether or not there are plans to use this data in the product. For example, it'd be great if there was a menu item on a visualization that opened a UI with a history of edits to that visualization.

jportner commented 4 years ago

> My broader question is about whether or not there are plans to use this data in the product. For example, it'd be great if there was a menu item on a visualization that opened a UI with a history of edits to that visualization.

In short: no. There is overlap between the information we need / the conclusions we can draw for audit logging and for what we're calling "usage data". However, there is a strong separation of concerns there. We ultimately decided to keep this at a smaller scope, just for the auditing use case.

I do think that once we have all of the new audit logging in place, we'll have all of the hooks/plumbing necessary to track and provide robust usage data. But we don't want to conflate audit records and usage data.

kobelb commented 4 years ago

During a Zoom meeting today, there was some discussion about which events and attributes should be in the "normal logs" vs what should be in the "audit logs". @jportner and I discussed this further and I've summarized the consensus that we reached.

The normal logs should not include user-specific information. User information is particularly sensitive, and augmenting normal log events with this information is potentially problematic. However, it's perfectly fine for these to include opaque identifiers for the session and the HTTP request. The normal logs should include all events which are logged using the standard logging infrastructure, and they can be filtered however the user chooses.

The audit logs should include user-specific information, and controls will be put in place to log entries only for specific users or only specific user information. The audit logs will include only audit-specific events. There is potentially some overlap between the events which appear in the normal logs and in the audit logs, but they're generally completely separate. The audit logs will include all authorization- and authentication-based events, in addition to events for specific operations of interest, including but not limited to saved-object CRUD and Elasticsearch queries. The mechanism for creating the audit events for operations which aren't auth-related needs to be explored further. The sketch below illustrates the intended split.
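For illustration, a hypothetical pair of records for the same operation might look like this (record shapes and field names are assumptions, not an agreed schema):

```ts
// Normal log record: opaque identifiers only, no user-specific information.
const normalLogRecord = {
  message: 'updated saved object [dashboard:123]',
  request: { id: 'opaque-request-id' },
  session: { id: 'opaque-session-id' },
};

// Audit log record: user-specific information included, subject to controls
// that restrict logging to specific users or specific user fields.
const auditLogRecord = {
  event: { action: 'saved_object_update', outcome: 'success' },
  user: { name: 'jdoe', roles: ['kibana_user'] },
  trace: { id: 'opaque-request-id' },
};
```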

joshdover commented 4 years ago

Components needed:

Open questions:

mshustov commented 4 years ago

I see the Audit service as a separate top-level service (the outer circle in the onion architecture)

No plugins depend on the AuditTrail. The AuditTrail service may depend on any plugin. The platform and plugins emit auditable events; the AuditTrail service listens to them and calls plugin APIs to collect the necessary data.

```ts
security.on('authenticationSuccess', (message: string, request: KibanaRequest) => {
  const auditData = {
    message,
    action: 'authenticationSuccess',
    user: security.getUser(request),
    spaces: spaces.getSpace(request),
    server: core.http.getServerInfo(),
    // ...
  };
  // has a well-known prefix
  log.logger(auditData);
});
```

As an alternative, Platform provides Auditable hook and AuditTrail service registers itself via this hook.

```ts
registerAuditable(handler: (event: { action: string; message: string; request: KibanaRequest }) => void): void;
```
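A minimal sketch of how the platform side of such a hook could work (the dispatch function and its name are assumptions for illustration, not an existing Kibana API):

```ts
type KibanaRequest = unknown; // placeholder for Kibana's request type

interface AuditableEvent {
  action: string;
  message: string;
  request: KibanaRequest;
}

type AuditableHandler = (event: AuditableEvent) => void;

const auditables: AuditableHandler[] = [];

// Called by the AuditTrail service during setup to register itself.
function registerAuditable(handler: AuditableHandler): void {
  auditables.push(handler);
}

// Called by the platform wherever an auditable event occurs; registered
// handlers decide how to enrich and persist it.
function notifyAuditables(event: AuditableEvent): void {
  for (const handler of auditables) {
    handler(event);
  }
}
```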

To define the logging layout, we can use the same approach as Elasticsearch does for its security audit logs: add an explicit config in x-pack that enhances the OSS kibana.yml config. https://github.com/elastic/elasticsearch/blob/fb86e8d6d67d95a8f2e99a175e3a6d7bbb4b196e/distribution/docker/src/docker/config/log4j2.properties#L47-L82 That would allow users to configure the layout and destination as required.

The open question for me: what unique data does each auditable event carry? I suspect the datasets for an Elasticsearch query event and an authentication-denied event can be different. If the dataset for every auditable event is the same, we can use a common interface for the Audit service. Otherwise, we might want to separate common fields from event-specific fields. AuditTrail implementation in Elasticsearch: https://github.com/elastic/elasticsearch/blob/5775ca83dbee90d3988faa611024bfaf42b13073/x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/audit/logfile/LoggingAuditTrail.java
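If the datasets do differ per event type, one way to separate common fields from event-specific fields is a discriminated union; a sketch (the event names and payloads are illustrative assumptions):

```ts
// Fields shared by every auditable event.
interface BaseAuditEvent {
  message: string;
  outcome: 'success' | 'failure';
  trace: { id: string };
}

// Event-specific payloads layered on top of the common fields.
interface EsQueryAuditEvent extends BaseAuditEvent {
  action: 'elasticsearch_query';
  query: { index: string; body: unknown };
}

interface AuthenticationDeniedAuditEvent extends BaseAuditEvent {
  action: 'authentication_denied';
  realm: string;
}

// The Audit service can accept the union and narrow on `action`.
type AuditEvent = EsQueryAuditEvent | AuthenticationDeniedAuditEvent;
```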

> ECS (OSS, default JSON layout?)

Elasticsearch doesn't use an ECS type. Instead, its JSON layout follows the ECS format by default.

> Does not include data about current user - we don't want this in OSS, security should add it itself (maybe we add an addScopeProvider API to the logger API?)

We already have RequestHandlerContext. It might expose addMetaData() to extend a request with additional data. If we consider some data sensitive, we shouldn't provide read access to it. The main problem with this approach is that the AuditTrail plugin has no control over the shape of the data, yet it needs to filter and format it into the necessary layout (which differentiates it from the telemetry plugin's approach).

joshdover commented 4 years ago

I think we are largely on the same page here. I'd like to lay out this plan with a distinction between some of the concerns. Namely, I'd like to separate what is necessary to support general observability and tracing within Kibana logs (OSS and otherwise) from what is necessary to support audit logs (X-Pack).

General observability requirements:

Audit logging requirements:


For the general observability case, we need a couple new components:

  1. Contextual data on log records that includes information about the request that initiated the log
  2. An ECS-compatible JSON log layout

I think we're both in agreement on how to accomplish these two requirements.

(1) can be solved by introducing a formal `LogContext` struct that is used by both the Logger and the Elasticsearch and SavedObjects clients. This struct would be created by Core's request context provider and injected into the ES and SO clients exposed by RequestHandlerContext. This enables every log message in those clients to include data about the current request (it would not include user data).

(2) is solved by changing our JSON log layout to be ECS-compatible.
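A rough sketch of what (1) could look like; the `LogContext` name comes from the comment above, while the specific fields are assumptions:

```ts
// Opaque request identifiers only; deliberately no user data (OSS-safe).
interface LogContext {
  requestId: string;   // unique identifier created for each HTTP request
  requestPath: string;
  sessionId?: string;  // opaque session identifier, if one exists
}

// Hypothetical factory run by Core's request context provider; the resulting
// struct would be injected into the ES and SO clients exposed by
// RequestHandlerContext so every log message can reference the request.
function createLogContext(request: { id: string; url: { pathname: string } }): LogContext {
  return {
    requestId: request.id,
    requestPath: request.url.pathname,
  };
}
```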


For the audit logging case, we need:

  1. A way to produce domain action events, a few options:
    • Specific emit points in the OSS code;
    • Leveraging the existing logs and translating them to domain events; or
    • A registerAuditable interface as above
  2. A way in which domain actions can be mapped back to additional context information not included in OSS
    • For most cases this is being able to call security.authc.getCurrentUser with the KibanaRequest object.

(1) is where I think we need some discussion.

My only concern about adding domain-specific events is that they may be abused by other plugins for different purposes. For example, we've gotten requests to add hooks like onDelete to the SavedObjectsClient. Having generic hooks like this can lead to a complex web of business logic that relies on these hooks executing in order to keep the system in a valid state.

I think we just need to take care in how we implement such events so that the timing of when they are executed is not depended on by business logic. In other words, I want to avoid a situation where an app is dependent on these hooks in order to function correctly (other than audit logging itself).

This makes me lean slightly towards the registerAuditable interface or something similar. I think it's much more likely that these types of events are consumed responsibly if they are exposed this way, rather than each sub-system emitting these events.

legrego commented 4 years ago

Sorry for being dense; I'm a bit confused about the proposed use of registerAuditable. @restrry's example makes me think that it would be used by the service responsible for decorating and writing the already-generated audit log events to disk, but @joshdover's comment leads me to believe that it would be used to produce the domain action events that eventually get decorated and logged downstream.


Can we outline what a couple of domain action events might look like? Let's say that both Security and Spaces are enabled:

View diagram markup:

```
title Create Dashboard

User->Kibana: Create Dashboard Request
Kibana->Kibana: Unique Request identifier created
Kibana->Saved Objects Service: Create Dashboard Request
Saved Objects Service->Security SOC Wrapper: AuthN Check
Security SOC Wrapper->ES: _has_privileges request
ES->Security SOC Wrapper: _has_privileges response
Security SOC Wrapper->Saved Objects Service: OK
Saved Objects Service->ES: index { bigBlobOfJSON }
ES->Saved Objects Service: { bigBlobOfJSON }
Saved Objects Service->Kibana: { bigBlobOfJSON }
Kibana->User: Create Dashboard Response
```

In this example, we have two requests made to ES: one for the privileges check, and another to actually index the saved object. I'd expect a single "Create dashboard" audit record here, as the privileges check is a simple implementation detail, which would still be captured by the ES audit logs.

What about a more complex example though? Consider the "Copy to space" feature. This works by first performing a server-side export, followed by a server-side import:

View diagram markup:

```
title Copy to Space

User->Kibana: Copy to space Request
Kibana->Kibana: Unique Request identifier created
Kibana->Saved Objects Service: bulk_get objects to be copied
Saved Objects Service->Security SOC Wrapper: AuthN Check
Security SOC Wrapper->ES: _has_privileges request
ES->Security SOC Wrapper: _has_privileges response
Security SOC Wrapper->Saved Objects Service: OK
Saved Objects Service->ES: bulk_get [{type: 'search', id: 'foo'}, ...]
ES->Saved Objects Service: bulk_get response [{ bigBlobOfJSON }]
Saved Objects Service->Kibana: [{ bigBlobOfJSON }]
Kibana->Saved Objects Service: bulk_create objects to be copied
Saved Objects Service->Security SOC Wrapper: AuthN Check
Security SOC Wrapper->ES: _has_privileges request
ES->Security SOC Wrapper: _has_privileges response
Security SOC Wrapper->Saved Objects Service: OK
Saved Objects Service->ES: bulk_create [{type: 'search', id: 'foo'}, ...]
ES->Saved Objects Service: bulk_create response [{ bigBlobOfJSON }]
Saved Objects Service->Kibana: [{ bigBlobOfJSON }]
Kibana->User: Copy to space Response
```

How many audit records would we expect to see here? Somewhere between 1 and 3?

1. "Copy to space" record
2. "Export / bulk_get saved objects" record
3. "Import / bulk_create saved objects" record

My initial reaction is that 2 and 3 are implementation details of 1, and therefore might not make sense in the audit log. They should show up in the general log, however. Someone trying to understand the audit logs might be confused to see bulk_get and bulk_create requests when they in fact "only" performed a Copy to space action.

To make a comparison to the ES audit logs, I don't think they record shard read/writes that occur as part of a user's request. They log that the request happened, and the "implementation details" are kept out of the audit logs.

I only bring this up because it's not immediately clear to me where we'll choose to generate/emit these audit events. Doing so at the saved objects client would cause these "implementation details" to be logged for various domain action events. Emitting from the HTTP routes (the public API) would probably get us most of the way there, but that doesn't handle actions like background jobs.

mshustov commented 4 years ago

> In this example, we have two requests made to ES: one for the privileges check, and another to actually index the saved object. I'd expect a single "Create dashboard" audit record here, as the privileges check is a simple implementation detail, which would still be captured by the ES audit logs.

I'd expect to see `Dashboard created` and `Dashboard creation failed` audit records in this example. Both should provide additional info: who performed the action, in which space, etc.

> How many audit records would we expect to see here? Somewhere between 1 and 3?

The same logic applies here. I expect only one event: `Copied to space`. Users do not think in terms of "Export / bulk_get saved objects" or "Import / bulk_create saved objects"; as you said, those are implementation details. However, users can find correlated low-level events in the Kibana logs via a request identifier / background task identifier.

> I only bring this up because it's not immediately clear to me where we'll choose to generate/emit these audit events.

The infrastructure level (ES / SO clients) cannot emit domain events; plugin code emits them. Depending on the plugin workflow, this can be done:

I proposed using an AuditTrail service that receives those domain events and calculates the data needed to build an audit logging record:

```ts
// in plugin code
auditTrail.add({ event, message, request });
// in an http request handler, context can be bound to a request
auditTrail.add({ event, message });
// in a background task we haven't got a context pattern and might have to introduce one
auditTrail.add({ event, message });
```

```ts
// in audit trail plugin code
class AuditTrail {
  on(event, message, request) {
    const auditData = {
      message,
      action: 'authenticationSuccess',
      user: security.getUser(request),
      spaces: spaces.getSpace(request),
      server: core.http.getServerInfo(),
      // ...
    };
    // has a well-known prefix
    log.logger(auditData);
  }
}
```

Audit Logger doesn't deal with any observability concerns (ES query performance, for example).

Let me know if it makes sense to you or if I missed something.

legrego commented 4 years ago

That all makes sense, thanks. My primary question was how we would allow plugin code to emit events. Something like auditTrail.add({event, message, request}) makes perfect sense to me.

My initial confusion was around registerAuditable, and then I got distracted with those two examples I put up. So registerAuditable would be a hook provided by core, which the security plugin (for example) could call in order to be notified about all emitted audit events? Similar to how core provides a hook for security to register the auth provider?

mshustov commented 4 years ago

> So registerAuditable would be a hook provided by core, which the security plugin (for example) could call in order to be notified about all emitted audit events?

I'd expect it to be used by the AuditTrail plugin to extend the platform. There are several benefits of using it in this manner:

The AuditTrail plugin can depend on any plugin and use plugins' public APIs to calculate audit data:

```ts
// package.json
requiredPlugins: ['security', 'spaces'],

// plugin.ts
class AuditTrail {
  on(event, message, request) {
    const auditData = {
      message,
      action: 'authenticationSuccess',
      user: security.getUser(request),
      spaces: spaces.getSpace(request),
      server: core.http.getServerInfo(),
      // ...
    };
    // has a well-known prefix
    log.logger(auditData);
  }
}

platform.registerAuditable(auditTrail.on);
```

Probably `registerAuditable` is not the best name. Is `registerAuditor` clearer?

Also, I'd like to hear from Josh. He might have a different vision.

jportner commented 4 years ago

Good idea making a diagram @legrego --

OK, so Approach #1 as described above is to generate a single audit event for each user request.

> How many audit records would we expect to see here? Somewhere between 1 and 3?
>
> 1. "Copy to space" record
> 2. "Export / bulk_get saved objects" record
> 3. "Import / bulk_create saved objects" record

In Approach #2 that I've been thinking of, we would see five audit records:


In my mind it would look something like this.

Click to see JSON:

```json
{
  "event": {
    "action": "read sourcespace dashboard",
    "category": "saved_objects_authorization",
    "module": "plugin:security",
    "outcome": "success"
  },
  "trace": { "id": "some-uuid" }
}
{
  "event": {
    "action": "bulk_get [sourcespace:dashboard:foo]",
    "category": "saved_objects_client",
    "module": "core",
    "outcome": "success"
  },
  "trace": { "id": "some-uuid" }
}
{
  "event": {
    "action": "write destspace dashboard",
    "category": "saved_objects_authorization",
    "module": "plugin:security",
    "outcome": "success"
  },
  "trace": { "id": "some-uuid" }
}
{
  "event": {
    "action": "bulk_create [destspace:dashboard:foo]",
    "category": "saved_objects_client",
    "module": "core",
    "outcome": "success"
  },
  "trace": { "id": "some-uuid" }
}
{
  "event": {
    "action": "POST",
    "category": "http",
    "module": "core",
    "outcome": "success"
  },
  "http": {
    "request": {
      "body": {
        "content": "{\"objects\":[{\"type\":\"dashboard\",\"id\":\"foo\"}],\"spaces\":[\"destspace\"],\"includeReferences\":true,\"overwrite\":true}"
      },
      "method": "POST"
    }
  },
  "source": { "address": "12.34.56.78", "ip": "12.34.56.78" },
  "url": {
    "domain": "www.somekibanahost.com",
    "full": "https://www.somekibanahost.com/api/spaces/_copy_saved_objects",
    "path": "/api/spaces/_copy_saved_objects",
    "port": "443",
    "query": "",
    "scheme": "https"
  },
  "user": {
    "email": "john.doe@company.com",
    "full_name": "John Doe",
    "hash": "D30A5F57532A603697CCBB51558FA02CCADD74A0C499FCF9D45B...",
    "sid": "2FBAF28F6427B1832F2924E4C22C66E85FE96AFBDC3541C659B67...",
    "name": "jdoe",
    "roles": ["kibana_user"]
  },
  "trace": { "id": "some-uuid" }
}
```

_Note 1: I omitted some attributes in the interest of brevity._
_Note 2: the records can be correlated with each other by `trace.id` (which should also be sent to Elasticsearch as `X-Opaque-Id`)._
_Note 3: the four records in the audit trail with the `saved_objects_client` and `saved_objects_authorization` categories wouldn't need to contain **all** of the attributes (`http`, `source`, `url`, `user`); however, the "events" that these records are generated from would still need to have this info. This is because we want to be able to add a filter to avoid writing records based on certain attributes, such as user or IP address._

So, this approach would generically audit all API routes and SOC calls, showing what's happening "under the hood" for the SOC and its wrappers. Of course, this is more verbose than the alternative of writing a single audit event for each request.
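To make Approach #2 concrete, here is a minimal sketch of a saved-objects client wrapper that emits one audit event per call (the wrapper shape and the emitter are illustrative assumptions, not the actual SOC interfaces):

```ts
interface AuditEvent {
  action: string;
  category: string;
  module: string;
  outcome: 'success' | 'failure';
  trace: { id: string };
}

interface SavedObjectsClientLike {
  get(type: string, id: string): Promise<unknown>;
}

// Each SOC call yields one audit event, correlated with the originating
// HTTP request via trace.id (also sent to Elasticsearch as X-Opaque-Id).
function auditSavedObjectsClient(
  client: SavedObjectsClientLike,
  emit: (event: AuditEvent) => void,
  traceId: string
): SavedObjectsClientLike {
  return {
    async get(type, id) {
      const base = {
        action: `get [${type}:${id}]`,
        category: 'saved_objects_client',
        module: 'core',
        trace: { id: traceId },
      };
      try {
        const result = await client.get(type, id);
        emit({ ...base, outcome: 'success' });
        return result;
      } catch (e) {
        emit({ ...base, outcome: 'failure' });
        throw e;
      }
    },
  };
}
```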

Potential advantages of Approach #2:

Disadvantages:

Thoughts?

joshdover commented 4 years ago

> Probably `registerAuditable` is not the best name. Is `registerAuditor` clearer?
>
> Also, I'd like to hear from Josh. He might have a different vision.

I think we're on the same page here. The only part I'm confused about in your example is the `auditTrail.add` API. This is meant to be a Core API, right? Not an API on the audit trail plugin.

If we're on the same page there, then the final result is that Platform would need to expose two APIs (see the sketch below):

- `registerAuditor` for receiving audit events. This is the API that the audit log plugin would use to get all events, enrich them with additional data, and forward them to a logger.
- `auditTrail.add` / `addAuditEvent` / someOtherName for adding audit events. This is the API that Core, OSS plugins, and commercial plugins would use to add domain events for user actions (e.g. Copy to Space). These events are forwarded to any auditors registered with `registerAuditor`.

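A rough sketch of those two APIs as TypeScript interfaces (a sketch of the proposal above, not a finalized contract):

```ts
type KibanaRequest = unknown; // placeholder for Kibana's request type

interface AuditEvent {
  event: string;
  message: string;
  request?: KibanaRequest; // bound automatically in request handler contexts
}

type Auditor = (event: AuditEvent) => void;

interface AuditTrailService {
  // Used by the audit log plugin to receive all events, enrich them with
  // additional data, and forward them to a logger.
  registerAuditor(auditor: Auditor): void;
  // Used by Core, OSS plugins, and commercial plugins to record domain
  // events for user actions; forwarded to every registered auditor.
  add(event: AuditEvent): void;
}
```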
In terms of what produces the audit events themselves (@jportner's discussion above), I think I do favor Approach #2 for its completeness. It seems less likely that we would miss a critical event that should be included in the audit log if we log the lower-level details. That said, I'm not very familiar with how audit logs are used by customers. If the low-level logs are too opaque to understand, that could make these logs much less useful.

So really it seems the question is: do we favor completeness or clearer semantics?

Could we do both? Could the semantic, high-level action be provided as a "scope" for the lower-level audit events?

For example, what if we had an API that allows an HTTP endpoint to start an auditable event scope, so that all audit events produced while that scope is open are associated with the high-level semantic action:

```ts
router.post(
  { path: '/api/do_action' },
  async (context, req, res) => {
    const auditScope = context.audit.openScope('copy_to_space');
    try {
      // Any audit events produced by the SO client while the scope is open
      // would be associated with the `copy_to_space` scope.
      const body = await copyToSpace(context.savedObjects.client);
      return res.ok({ body });
    } finally {
      auditScope.close();
    }
  }
);
```

Or we could change the API a bit to:

```ts
router.post(
  { path: '/api/do_action' },
  async (context, req, res) => context.audit.openScope(
    'copy_to_space',
    async () => {
      // Any audit events produced by the SO client while the scope is open
      // would be associated with the `copy_to_space` scope.
      const body = await copyToSpace(context.savedObjects.client);
      return res.ok({ body });
    }
  )
);
```

The tricky part about this in Node.js is that these async actions are running in the same memory space, which makes associating the scope with any asynchronous code difficult. A couple of options for solving this:

mshustov commented 4 years ago

> If we're on the same page there, then the final result is Platform would need to expose two APIs:
>
> - `registerAuditor` for receiving audit events. This is the API that audit log plugin would use to get all events, enrich with additional data, and forward to a logger.
> - `auditTrail.add` / `addAuditEvent` / someOtherName for adding audit events. This is the API that Core, OSS plugins, and commercial plugins would use to add domain-events for user actions (eg. Copy to Space). These events are forwarded to any auditors registered with `registerAuditor`.

Correct 👍

The tricky part about this in Node.js is that these async actions are running in the same memory space, which makes associating the scope with any asynchronous code difficult.

AFAIK Node.js provides built-in primitives that we can try to use for this case: https://nodejs.org/api/async_hooks.html. It's time to finally watch https://www.youtube.com/watch?v=omOtwqffhck from @watson 😄

joshdover commented 4 years ago

> AFAIK Node.js provides built-in primitives that we can try to use for this case: nodejs.org/api/async_hooks.html

I agree async_hooks could be a solution. My concern is just that it's still experimental, even in the latest Node version. It does look like the working group is discussing stabilization. If it does go stable in v14 LTS, it could be a viable option for us.
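For illustration, a minimal sketch of scope propagation using `AsyncLocalStorage` from the async_hooks module (assuming a Node version where it is available; the helper names are hypothetical):

```ts
import { AsyncLocalStorage } from 'async_hooks';

interface AuditScope {
  action: string;
}

const auditScopeStorage = new AsyncLocalStorage<AuditScope>();

// Runs a handler with an open audit scope; any async work started inside
// `fn` observes the same scope without explicit plumbing.
function withAuditScope<T>(action: string, fn: () => Promise<T>): Promise<T> {
  return auditScopeStorage.run({ action }, fn);
}

// Audit events produced while a scope is open are associated with the
// high-level semantic action automatically.
function addAuditEvent(event: { category: string; message: string }): void {
  const scope = auditScopeStorage.getStore();
  console.log({ ...event, action: scope?.action });
}

// Usage sketch:
withAuditScope('copy_to_space', async () => {
  addAuditEvent({ category: 'saved_objects_client', message: 'bulk_get [...]' });
});
```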

thomheymann commented 4 years ago

Hi team, I'm new to the project and am starting to get up to speed with the audit log feature.

From speaking to different people there still seem to be a few outstanding questions and different ideas as to what the audit log should provide, to what level of detail and how it differs from existing logging.

In order to help us define a clear approach I wanted to define some guiding principles that we can agree on and then refer back to when making a decision about whether something should be included in the audit log or not and what the implementation should look like.

I have written these as statements but they are all open questions / up for debate.

I might have gotten this completely wrong so would be great to get your thoughts!

Guiding Principles

What’s the difference between our audit log and system log?

What events need to be captured?

When are events logged?

Can an action trigger multiple events (log lines)?

How does Kibana audit logging tie into Elasticsearch audit logging?

Examples

thomheymann commented 4 years ago

ECS Audit Log Proposal

Field Reference: https://www.elastic.co/guide/en/ecs/current/ecs-field-reference.html

Approach

Authorisation / privilege checks are logged as an outcome of an action rather than as a separate log line, since they are implementation details. This is the same approach as error/success results in the ECS standard.

Bulk operations are logged as separate events. It would be less verbose to combine a bulk operation into a single log line, but that would mean we can't record successes/failures individually using the ECS standard. Saved object details are extracted into a non-standard `document` field for each audit event.

`category`, `type` and `outcome` fields are categorisation fields in ECS with specific allowed keywords. I tried to map these as well as I could, but some of them do sound slightly clunky for our use case.

Events

User Authentication

```json
{
  "message": "User 'jdoe' logged in successfully using realm 'native'|Failed login attempt using realm 'native'|User re-authentication failed",
  "event": {
    "action": "user_login|user_logout|user_reauth",
    "category": ["authentication"],
    "type": ["user"],
    "outcome": "success|failure",
    "module": "kibana",
    "dataset": "kibana.audit"
  },
  "error": {
    "code": "spaces_authorization_failure",
    "message": "jdoe unauthorized to getAll spaces"
  },
  "trace": {
    "id": "opaque-id"
  }
}
```

Saved Object CRUD

```json
{
  "message": "User 'jdoe' created dashboard 'new-saved-object' in space 'default'",
  "event": {
    "action": "saved_object_create",
    "category": ["database"],
    "type": ["creation|access|change|deletion", "allowed|denied"],
    "outcome": "success|failure"
  },
  "document": {
    "space": "default",
    "type": "dashboard",
    "id": "new-saved-object"
  },
  "error": {
    "code": "spaces_authorization_failure",
    "message": "jdoe unauthorized to getAll spaces"
  },
  "trace": {
    "id": "opaque-id"
  }
}
```

HTTP Response

```json
{
  "message": "HTTP request 'login' by user 'jdoe' succeeded",
  "event": {
    "action": "http_request",
    "category": ["web"],
    "outcome": "success|failure"
  },
  "http": {
    "request": {
      "method": "POST",
      "body": {
        "content": "{\"objects\":[{\"type\":\"dashboard\",\"id\":\"foo\"}],\"spaces\":[\"destspace\"],\"includeReferences\":true,\"overwrite\":true}"
      }
    },
    "response": {
      "status_code": 200
    }
  },
  "source": {
    "address": "12.34.56.78",
    "ip": "12.34.56.78"
  },
  "url": {
    "domain": "kibana",
    "full": "https://kibana/api/spaces/_copy_saved_objects",
    "path": "/api/spaces/_copy_saved_objects",
    "port": "443",
    "query": "",
    "scheme": "https"
  },
  "user": {
    "email": "john.doe@company.com",
    "full_name": "John Doe",
    "hash": "D30A5F57532A603697CCBB51558FA02CCADD74A0C499FCF9D45B...",
    "sid": "2FBAF28F6427B1832F2924E4C22C66E85FE96AFBDC3541C659B67...",
    "name": "jdoe",
    "roles": ["kibana_user"]
  },
  "trace": {
    "id": "opaque-id"
  }
}
```

Scenarios

Copy to space

```json
{
  "message": "User 'jdoe' accessed dashboard 'first-object' in space 'default'",
  "event": { "action": "saved_object_read", "category": ["database"], "type": ["access"], "outcome": "success" },
  "document": { "id": "first-object", "type": "dashboard", "space": "default" }
}
{
  "message": "User 'jdoe' accessed dashboard 'second-object' in space 'default'",
  "event": { "action": "saved_object_read", "category": ["database"], "type": ["access"], "outcome": "success" },
  "document": { "id": "second-object", "type": "dashboard", "space": "default" }
}
{
  "message": "User 'jdoe' created dashboard 'first-object' in space 'copy'",
  "event": { "action": "saved_object_create", "category": ["database"], "type": ["creation"], "outcome": "success" },
  "document": { "id": "first-object", "type": "dashboard", "space": "copy" }
}
{
  "message": "User 'jdoe' created dashboard 'second-object' in space 'copy'",
  "event": { "action": "saved_object_create", "category": ["database"], "type": ["creation"], "outcome": "success" },
  "document": { "id": "second-object", "type": "dashboard", "space": "copy" }
}
{
  "message": "HTTP request 'copy-to-space' by user 'jdoe' succeeded",
  "event": { "action": "http_request", "category": ["web"], "outcome": "success" }
}
```

Error: User not authorised to access dashboard (Kibana authZ):

```json
{
  "message": "User 'jdoe' not authorised to access dashboard 'first-object' in space 'default'",
  "event": { "action": "saved_object_read", "category": ["database"], "type": ["access"], "outcome": "failure" },
  "error": { "code": "spaces_authorization_failure", "message": "jdoe unauthorized to getAll spaces" },
  "document": { "id": "first-object", "type": "dashboard", "space": "default" }
}
{
  "message": "HTTP request 'copy-to-space' by user 'jdoe' failed",
  "event": { "action": "http_request", "category": ["web"], "outcome": "failure" },
  "error": { "code": "spaces_authorization_failure", "message": "jdoe unauthorized to getAll spaces" }
}
```

Error: Session expired (Kibana authN):

```json
{
  "message": "Unknown user not authenticated to request 'copy-to-space'",
  "event": { "action": "http_request", "category": ["web", "authentication"], "type": ["denied"], "outcome": "failure" }
}
```

Error: User not authorised to access data index (Elasticsearch authZ):

```json
{
  "message": "User 'jdoe' not authorised to access index 'products'"
}
{
  "message": "HTTP request 'copy-to-space' by user 'jdoe' failed",
  "event": { "action": "http_request", "category": ["web", "authentication"], "type": ["allowed"], "outcome": "failure" }
}
```

User login

```json
{
  "message": "User 'jdoe' logged in successfully using realm 'native'",
  "event": { "action": "user_login", "category": ["authentication"], "type": ["user"], "outcome": "success" }
}
{
  "message": "HTTP request 'login' by user 'jdoe' succeeded",
  "event": { "action": "http_request", "category": ["web"], "outcome": "success" }
}
```

Open question

mshustov commented 4 years ago

@thomheymann thank you for the logging format proposal. I have a couple of questions about the Events section.

thomheymann commented 4 years ago

Thanks for the feedback, Mikhail!

> Is it the complete list of events for the first stage of audit logging? Or is it just the list for the first phase / an example subset of events?

These are only example events; there are a lot more events we would audit, but I wanted to establish some kind of pattern first, since most of the other events would follow a similar approach. I've added a list of the possible other events below. (Again, not complete/reviewed.)

> What support for HTTP events is required from the platform team side? I suspect we don't have to track all responses, only those for selected routes using their context-specific information.

The way I understood HTTP-based audit logging is that it's a way of very quickly and easily ticking off most of our auditing requirements without forcing plugin authors to manually create audit-specific events. It feeds into one of my open questions, though, around the overlap between these (i.e. do we need an http_request event for the login route in our audit log if we already log user logins as a separate event?)

> Is audit for SO actions performed by the Security plugin? The SO client from core doesn't know about authz/authc restrictions.

I have no view on this at this point; I'm purely looking at it from a requirements perspective. It would be great to get a steer in terms of what is actually feasible based on the implementation.

legrego commented 4 years ago

Thanks for the writeup @thomheymann! A quick note on your guiding principles:

> Do log what indices / records were accessed

When discussing how this ties into ES audit logs, you mention:

> Maybe record-level audit logging could be left to Elasticsearch?

I agree with this. I wouldn't expect Kibana to log responses returned by ES that result from queries against users' data indices.

The full list of events might be easier to curate and discuss in a Google doc. Entries under user and role management should be left to the ES audit logs, as they are the authoritative source of this information. I expect Logstash pipelines fall into this category as well.


> What support for HTTP events is required from the platform team side? I suspect we don't have to track all responses, only those for selected routes using their context-specific information.
>
> The way I understood HTTP-based audit logging is that it's a way of very quickly and easily ticking off most of our auditing requirements without forcing plugin authors to manually create audit-specific events. It feeds into one of my open questions, though, around the overlap between these (i.e. do we need an http_request event for the login route in our audit log if we already log user logins as a separate event?)

At the most verbose level, we may want to include everything, or almost everything, here. The ability to filter this out will be critical, though, and it'll probably make sense to come up with a sensible configuration so that we don't log everything by default, but instead allow administrators to opt in to more granularity.

Perhaps the platform could add a route option to the interface to allow a route to exclude itself from auditing, if we find that we need this flexibility.
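If we went this route, registering an excluded route might look like the sketch below (the `excludeFromAudit` option is a hypothetical name, not an existing route option):

```ts
router.get(
  {
    path: '/api/internal/healthcheck',
    validate: false,
    // Hypothetical flag read by the audit interceptor: requests to this
    // route would not produce http_request audit events.
    options: { excludeFromAudit: true },
  },
  async (context, req, res) => res.ok({ body: { status: 'ok' } })
);
```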


> Is audit for SO actions performed by the Security plugin? The SO client from core doesn't know about authz/authc restrictions.
>
> I have no view on this at this point; I'm purely looking at it from a requirements perspective. It would be great to get a steer in terms of what is actually feasible based on the implementation.

I'm leaning towards having the security plugin log these events (it's what we do today). It's technically possible to create a SOC without the security wrapper applied, but in those cases, we'd expect consumers to audit their own SO events. Alerting is one such example: https://github.com/gmmorris/kibana/blob/alerting/consumer-based-rbac/x-pack/plugins/alerts/server/authorization/alerts_authorization.ts#L158

legrego commented 4 years ago

> Bulk operations are logged as separate events. It would be less verbose to combine a bulk operation into a single log line, but that would mean we can't record successes/failures individually using the ECS standard. Saved object details are extracted into a non-standard `document` field for each audit event.

There might be an exception to this that I'm overlooking, but I believe all bulk operations are all-or-nothing today, so we don't have a need to log successes/failures individually. Our current approach (which isn't necessarily the right one) is to log a bulk operation as a single entry, but that entry identifies the objects in question, as sketched below. Verbosity aside, I worry about the performance of logging bulk operations as separate events: an export of 10,000 saved objects would require approximately 10,000 audit log entries, which could take a non-trivial amount of time.
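For illustration, a single-entry shape for a bulk operation might look like this (pluralizing the proposal's `document` field into a `documents` array is an assumption):

```ts
// Hypothetical single audit entry for a bulk_create of two objects.
const bulkCreateAuditEntry = {
  message: "User 'jdoe' created 2 saved objects in space 'default'",
  event: {
    action: 'saved_object_bulk_create',
    category: ['database'],
    type: ['creation'],
    outcome: 'success',
  },
  documents: [
    { id: 'first-object', type: 'dashboard', space: 'default' },
    { id: 'second-object', type: 'dashboard', space: 'default' },
  ],
};
```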

> How does generic API request logging (http_request) tie into the other audit events? For bulk operations these make sense, as it groups the other events together. For single-operation requests it feels like unnecessary duplication. (See user_login example.)

It might be unnecessary duplication, but I think it's hard to definitively say that a certain API endpoint will only ever do a single operation. We could attempt to tag routes as such, but that requires manual effort on the engineering side which could be easily overlooked during a seemingly unrelated refactor. At the moment, I'm thinking we'll accept the duplication since we'll have the ability to filter events, but we can always revisit this if we find a clear pattern to these events.


I'm interested in hearing other thoughts though! My opinions here are just that.

legrego commented 2 years ago

Closing this meta issue, as we have sub-issues open to track the remaining individual tasks that we care about at this time.

mbudge commented 8 months ago

Please can you add the saved object name/description so we can provide reports to IT controls?

Reports with only the saved object ID aren't user-friendly.

legrego commented 8 months ago

@mbudge your request is being tracked here: https://github.com/elastic/kibana/issues/100523. Edit: I see you discovered this already.