elastic / integrations

Elastic Integrations
https://www.elastic.co/integrations

[okta.system] Utilize 'subobjects: false' for debugContext.debugData #9863

Closed andrewkroh closed 1 week ago

andrewkroh commented 3 months ago

Background

The Okta system logs contain a debugContext.debugData field with an undefined schema. To quote the Okta docs:

Important: The information contained in debugContext.debugData is intended to add context when troubleshooting customer platform issues. Both key names and values may change from release to release and aren't guaranteed to be stable. Therefore, they shouldn't be viewed as a data contract but as a debugging aid instead.

The data is indexed into okta.debug_context.debug_data.flattened using Elasticsearch's flattened field type. Per Okta's docs, the field names and data types are not known in advance, so the integration needs to protect against type conflicts and mapping explosions.

Problem

In Kibana there are some limitations for flattened field types that affect the user experience (see https://github.com/elastic/kibana/issues/25820). If we can avoid flattened, we avoid these limitations.

Suggestion

Since the integration was created, Elasticsearch has gained a mapping option called subobjects that addresses some of the reasons the integration uses the flattened field type, without bringing the Kibana usability problems. The only protection we would lose is against a mapping explosion, but I don't think that will be a problem: I suspect Okta has a schema and just doesn't want to make it a public contract. Plus, integrations now have a guardrail in place to limit field count (see https://github.com/elastic/kibana/pull/178398).
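
To make the suggestion concrete, here is a minimal sketch of what an Elasticsearch mapping with 'subobjects: false' on debug_data might look like, written as a Python dict. The field path follows this issue. The sample keys, the helper name leaf_field_names, and the overall layout are assumptions for illustration only, not the integration's actual configuration.

```python
import json

# Hypothetical mapping body; only the field path comes from the issue text.
MAPPING = {
    "properties": {
        "okta": {
            "properties": {
                "debug_context": {
                    "properties": {
                        "debug_data": {
                            "type": "object",
                            # Children are mapped as leaf fields with dotted
                            # names rather than as nested objects.
                            "subobjects": False,
                        }
                    }
                }
            }
        }
    }
}


def leaf_field_names(value, prefix=""):
    """Return the dotted leaf names a (possibly nested) dict flattens to."""
    if not isinstance(value, dict):
        return [prefix]
    names = []
    for key, child in value.items():
        names.extend(leaf_field_names(child, f"{prefix}.{key}" if prefix else key))
    return names


# Made-up debugData keys, for illustration only (not taken from Okta's docs).
sample = {"requestUri": "/api/v1/authn", "risk": "low", "risk.level": "LOW"}

print(json.dumps(MAPPING, indent=2))
print(sorted(leaf_field_names(sample)))
```

With the default object mapping, a document holding both "risk" and "risk.level" would be rejected, because "risk" would have to be a leaf value and an object at once. With subobjects disabled, both are plain leaf fields under debug_data.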

The changes would be to:

The only remaining question is what to do with the existing okta.debug_context.debug_data.flattened field. For existing users it must be retained to avoid a breaking change. We don't have a mechanism to control the default behavior for new installs versus existing installs, so I think the only safe option is to provide a toggle that keeps the flattened field by default and lets users opt out. (Please leave a comment if you have other ideas.)

elasticmachine commented 3 months ago

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

efd6 commented 3 months ago

This seems like a reasonable approach.

efd6 commented 3 months ago

Although, the package spec rejects it.

Error: building package failed: invalid content found in built zip package: found 1 validation error:
   1. file "…/github.com/elastic/integrations/build/packages/okta-2.9.0.zip/data_stream/system/fields/fields.yml" is invalid: field 11.fields.0: Additional property subobjects is not allowed

andrewkroh commented 3 months ago

Yes, I noticed that, and I have a query out to those involved in adding support for subobjects to package-spec about whether type: group can be permitted to use subobjects.

The working config I observed was to elevate the current fields below debug_data to siblings of debug_data and change their names to include the debug_data. prefix.