elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Content Management] Standardize on subset of saved object metadata attributes #138343

Open alexfrancoeur opened 2 years ago

alexfrancoeur commented 2 years ago

We've recently kicked off the first phase of content management with the work outlined in this meta issue: https://github.com/elastic/kibana/issues/128006

When complete, we will have an updated reusable table for any saved object to leverage. We'd like to use this component and others to power a variety of SO-type agnostic experiences. To create a table that can be used by all SO types, we'll need to either standardize on or create a mapping for a common set of saved object attributes. These attributes include, but aren't limited to: create date, created by, last updated date, last updated by, views, last viewed date, title, version and description.

A screenshot of a limited analysis in 8.2 highlights the differences between how different saved objects persist information.

[Screenshot: table comparing how different saved object types persist these attributes in 8.2]

While we'll eventually provide some more detailed requirements for this work, let's use this issue as a place to discuss the broader concept and suggestions for standardization.

Related issues:
Add date created, date modified, other metadata to Saved Objects · Issue #9202
[Visualize] Ability to filter visualisations by metadata (author, created date, modified date etc) · Issue #84227
[Saved Objects] Report usage and last used date · Issue #125795
Kibana usage information · Issue #81130
Google Analytics on Kibana · Issue #5580

elasticmachine commented 2 years ago

Pinging @elastic/kibana-core (Team:Core)

pgayvallet commented 2 years ago

Just to start the discussions:

Looking at the table, I see that some attributes may seem to be already standard (e.g. 'attributes.description') when in fact, technically, they're not, at least not in the data storage: the 'attributes' map is stored under a different field per type in ES (e.g. for a dashboard, 'attributes.description' is stored in the 'dashboard.description' field, and for a visualization in the 'visualization.description' field).
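To make the storage point concrete, here is a minimal sketch of how the same logical attribute ends up at a different path per type. The `SavedObjectView` shape and the `toRawDoc` / `rawFieldPath` helpers are hypothetical names used for illustration only, and the raw document layout is a simplified assumption, not the actual serializer.

```typescript
// Hypothetical, simplified view of a saved object as a consumer sees it.
interface SavedObjectView {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
}

// Assumption: in the raw ES document, a type's attributes live under a
// top-level key named after the type (e.g. `dashboard`, `visualization`).
function toRawDoc(so: SavedObjectView): Record<string, unknown> {
  const raw: Record<string, unknown> = { type: so.type };
  raw[so.type] = so.attributes;
  return raw;
}

// Path an ES query would have to target for a given type's attribute.
function rawFieldPath(type: string, attribute: string): string {
  return `${type}.${attribute}`;
}

const dashboardDoc = toRawDoc({
  id: "1",
  type: "dashboard",
  attributes: { description: "Sales overview" },
});

console.log(rawFieldPath("dashboard", "description")); // dashboard.description
console.log(rawFieldPath("visualization", "description")); // visualization.description
console.log(Object.keys(dashboardDoc));
```

The two paths differ even though both types expose `attributes.description`, which is why a single raw query cannot target "description" across types.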

Now, I would like to be also very careful in what we want to standardize. I'll take the rule.name example. The title field was always meant to store the title/name of the object, and yet I see we diverged for some types, without (apparent) reason. Same goes for rules.attributes.updatedAt versus the root-level updated_at field.

Now, if the intent is to be able to perform ES queries against these standardized fields, it technically means that all those fields will have to be 'root-level' (not under attributes / {type}) so that they have the same path regardless of the type storing them.

If that's a direction we'd like to follow, we should probably introduce a new root-level field (meta or something similar) containing all those standard fields, to avoid polluting the top-level SO schema?
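As a rough sketch of that idea, assuming a root-level `meta` field existed: the field list in `StandardMeta`, the `RawDocWithMeta` shape and both helper functions below are hypothetical, since the actual set of standard fields is still open.

```typescript
// Hypothetical set of standardized fields; the real list is still under discussion.
interface StandardMeta {
  title?: string;
  description?: string;
  createdAt?: string; // ISO 8601 timestamp
  updatedAt?: string; // ISO 8601 timestamp
}

// Assumption: a raw document with a root-level `meta` next to the type-specific key.
interface RawDocWithMeta {
  type: string;
  meta?: StandardMeta;
  [typeSpecificKey: string]: unknown;
}

// The path is identical for every type, which is what allows cross-type queries.
function metaFieldPath(field: keyof StandardMeta): string {
  return `meta.${field}`;
}

// Sort documents of mixed types by a shared field, newest first.
// Documents without the field sort last.
function sortByUpdatedAtDesc(docs: RawDocWithMeta[]): RawDocWithMeta[] {
  return [...docs].sort((a, b) => {
    const ua = a.meta?.updatedAt ?? "";
    const ub = b.meta?.updatedAt ?? "";
    return ua < ub ? 1 : ua > ub ? -1 : 0;
  });
}

const mixedDocs: RawDocWithMeta[] = [
  { type: "dashboard", meta: { title: "Sales", updatedAt: "2022-07-01T10:00:00.000Z" }, dashboard: {} },
  { type: "visualization", meta: { title: "Pie", updatedAt: "2022-08-01T10:00:00.000Z" }, visualization: {} },
  { type: "search", search: {} },
];

console.log(metaFieldPath("updatedAt")); // meta.updatedAt
console.log(sortByUpdatedAtDesc(mixedDocs).map((d) => d.type));
```

Keeping everything under one `meta` key means only a single new root-level field is added, whatever the final list of standard fields turns out to be.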

So, my first remark / question would be: at which level will these 'SO-type agnostic experiences' be implemented? As already asked, is the intent to be able to perform queries against them directly on the SO indices?

lukeelmers commented 2 years ago

> I would like to be also very careful in what we want to standardize

100%, this is my chief concern too. I think we need to have a high bar for what justifies adding a new root-level field to SOs. Some of these proposed fields certainly could meet that bar, but we should take care to avoid adding root-level fields for concepts that can't be considered truly global to all SOs.

> is the intent to be able to perform queries against them directly on the SO indices?

I don't want to speak for @alexfrancoeur, but my understanding was that the intent would be to use the SO client to query across SOs of different types, but not necessarily to query the raw underlying ES indices. Plus, the way we store SOs has always been an implementation detail of the SO client, and I wouldn't want to encourage anyone to interact with that data directly.

Making the assumption that querying via the SO client is acceptable, that means that we could theoretically have standardized field names for SO attributes across types without needing to worry about the fact that they are stored in a way that's different (unless I'm overlooking something, @pgayvallet and @Rudolf keep me honest here).

So I guess I see a few paths forward:

  1. A root-level meta or similar, as Pierre proposes.
  2. A set of helpers for plugins to use to add fields to their existing SO attributes in a standardized format.
  3. Build our components so a mapping of SO attributes to standardized field names must be provided.

(1) Makes a lot of sense to me, as long as we have a way to enforce that folks aren't just putting anything in there... only recognized fields.

(2) Would be more conservative, if we really want to start with something small that we could build up from later. Would need to check to make sure there aren't any issues with sorting/paging, but AFAICT this would work with the way we currently build queries in the SO client.

(3) Would be great in terms of avoiding complexity at the SO level, but would be more of a pain for consumers. Plus it'd be harder to search across types as you'd need to provide mappings per-type.
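For illustration, option (3) could look roughly like the sketch below, where each type supplies a mapping from its own attributes to standardized field names. All names here (`StandardFields`, `FieldMapping`, `mappings`, `toStandardFields`) are hypothetical, and the `rule` entry only mirrors the rule.name / updatedAt divergence mentioned earlier in this thread rather than the real rule schema.

```typescript
// Hypothetical standardized row consumed by a shared content table.
interface StandardFields {
  title: string;
  description?: string;
  updatedAt?: string;
}

// Simplified saved object shape, assumed for this sketch.
interface SavedObjectLike {
  type: string;
  updated_at?: string;
  attributes: Record<string, unknown>;
}

// Each type provides its own mapping to the standardized field names.
type FieldMapping = (so: SavedObjectLike) => StandardFields;

const asString = (value: unknown): string | undefined =>
  typeof value === "string" ? value : undefined;

const mappings: Record<string, FieldMapping> = {
  dashboard: (so) => ({
    title: asString(so.attributes.title) ?? "",
    description: asString(so.attributes.description),
    updatedAt: so.updated_at,
  }),
  // Assumption: rules keep their name in `name` and track `updatedAt` in attributes.
  rule: (so) => ({
    title: asString(so.attributes.name) ?? "",
    updatedAt: asString(so.attributes.updatedAt) ?? so.updated_at,
  }),
};

function toStandardFields(so: SavedObjectLike): StandardFields {
  const mapping = mappings[so.type];
  if (!mapping) {
    throw new Error(`No field mapping registered for type "${so.type}"`);
  }
  return mapping(so);
}

const ruleRow = toStandardFields({
  type: "rule",
  updated_at: "2022-08-02T00:00:00.000Z",
  attributes: { name: "CPU high", updatedAt: "2022-08-01T00:00:00.000Z" },
});
console.log(ruleRow.title, ruleRow.updatedAt);
```

The sketch also shows the downside noted above: a type without a registered mapping cannot appear in the table at all, and any cross-type search or sort has to go through every per-type mapping.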

rudolf commented 2 years ago

> We'd like to use this component and others to power a variety of SO-type agnostic experiences.

The vision we put forward in the BWCA RFC (purpose built APIs + service layer) basically precludes us from building something like this directly on saved objects. @alexfrancoeur and I had a brief discussion about this after he created this issue. There is a real tension here. Anytime you have a single place where you could implement something that improves the whole product that's a huge benefit. But building a powerful generic tool is incredibly hard and right now saved objects without an intermediate API layer just can't power complex domains like our solutions.

So any standardisation needs to happen on the API layer to be useful for the purposes of a shared content table. We've had other requests to document API conventions and I think the best way to support content management would be to document such fields in our API convention.

I was curious how we ended up in a situation where one type has two updated_at / updatedAt fields, e.g. for rules. There's a good discussion in https://github.com/elastic/kibana/issues/82701. In the end it came down to the fact that there's no way to not update the updated_at field, and that either the system or the user makes updates to rules but we only want the user's updates to change the field.

So we would need

Taking one step back, I'm really torn about how we go forward here. In terms of the developer experience, compared with having API options to switch and specify for each auto-field whether it's on/off/manual, and understanding all the behaviours, isn't it easier to just say mySO.created_at: new Date().toISOString() or mySO.created_by: currentUserId before updating/creating? Even opinionated ORMs like Active Record don't provide more than created_at and updated_at out of the box. Even if I'm wrong and it's actually a better developer experience, I'd still say the gains are marginal.
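A minimal sketch of what "just set the fields yourself" might look like in a service layer, including the user-versus-system distinction from the rules discussion. The `AuditFields` shape, the `Actor` type and both `stamp*` helpers are hypothetical and not part of any existing saved objects API.

```typescript
// Hypothetical audit fields a plugin manages itself inside its attributes.
interface AuditFields {
  created_at: string;
  created_by: string;
  updated_at: string;
  updated_by: string;
}

type Audited<T> = T & AuditFields;

// Who is performing the change; only user changes should move `updated_at`.
type Actor = { kind: "user"; userId: string } | { kind: "system" };

// Stamp all audit fields when the object is first created.
function stampOnCreate<T extends object>(
  attrs: T,
  userId: string,
  now: Date = new Date()
): Audited<T> {
  const ts = now.toISOString();
  return { ...attrs, created_at: ts, created_by: userId, updated_at: ts, updated_by: userId };
}

// Refresh `updated_*` for user changes; leave them untouched for system changes.
function stampOnUpdate<T extends object>(
  current: Audited<T>,
  actor: Actor,
  now: Date = new Date()
): Audited<T> {
  if (actor.kind === "system") {
    return current;
  }
  return { ...current, updated_at: now.toISOString(), updated_by: actor.userId };
}

const created = stampOnCreate({ name: "CPU high" }, "alice", new Date("2022-08-01T00:00:00.000Z"));
const afterSystem = stampOnUpdate(created, { kind: "system" }, new Date("2022-08-02T00:00:00.000Z"));
const afterUser = stampOnUpdate(created, { kind: "user", userId: "bob" }, new Date("2022-08-03T00:00:00.000Z"));

console.log(afterSystem.updated_at); // 2022-08-01T00:00:00.000Z
console.log(afterUser.updated_at, afterUser.updated_by); // 2022-08-03T00:00:00.000Z bob
```

Because the plugin decides when to call `stampOnUpdate` with a user actor, no on/off/manual switch is needed on the saved objects side.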

Then there's the benefit of reducing the field count which is useful, but I'd rather spend time on saved object index per type as a more scalable solution to the same problem.

So if we're focussing on providing a consistent content management experience I wonder if the most value we could add in the shortest amount of time isn't working on API conventions instead?

lukeelmers commented 1 year ago

> Then there's the benefit of reducing the field count which is useful, but I'd rather spend time on saved object index per type as a more scalable solution to the same problem.

++

> I wonder if the most value we could add in the shortest amount of time isn't working on API conventions instead?

Lately I've been thinking of this idea as "ECS for Saved Object attributes." We give folks a detailed schema for special meta attributes, and, similar to ECS, if they simply follow the schema they can benefit from the super-cool-future-content-management-integrations we will be building. This could potentially still be paired with option (2) from my comment above, where we provide helpers to make the implementation more bulletproof for devs. And importantly, it would not preclude us from eventually "graduating" fields that feel truly global to be root-level fields (though I still suspect there would be relatively few of these).
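To sketch what such a schema plus helper might look like: the field names and types in `META_SCHEMA` below are purely hypothetical placeholders drawn from the attribute list in the issue description, and `checkConformance` is an invented helper, not an existing API.

```typescript
// Hypothetical schema: standardized field name -> expected primitive type.
const META_SCHEMA = {
  title: "string",
  description: "string",
  created_at: "string",
  updated_at: "string",
  views: "number",
} as const;

type MetaFieldName = keyof typeof META_SCHEMA;

interface ConformanceReport {
  conforming: MetaFieldName[]; // present with the expected type
  invalid: MetaFieldName[]; // present with the wrong type
  missing: MetaFieldName[]; // not provided by this object
}

// Report which standardized fields a type's attributes actually follow.
function checkConformance(attributes: Record<string, unknown>): ConformanceReport {
  const report: ConformanceReport = { conforming: [], invalid: [], missing: [] };
  for (const field of Object.keys(META_SCHEMA) as MetaFieldName[]) {
    const value = attributes[field];
    if (value === undefined) {
      report.missing.push(field);
    } else if (typeof value === META_SCHEMA[field]) {
      report.conforming.push(field);
    } else {
      report.invalid.push(field);
    }
  }
  return report;
}

const report = checkConformance({ title: "Sales", views: "12", panelsJSON: "[]" });
console.log(report.conforming); // title
console.log(report.invalid); // views (a string where a number is expected)
console.log(report.missing); // description, created_at, updated_at
```

A content management integration could then light up only the columns listed in `conforming`, so following the schema stays opt-in per field, much like ECS.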