elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[DISCUSS] Rethinking index patterns; improvements #35481

Closed mattkime closed 2 years ago

mattkime commented 5 years ago

Note: There are a lot of overlapping issues. Let's aim to identify and address portions of the problem either in this issue or in new issues.

Pain points

Features of index patterns that have mixed support and don't map directly to ES indexes:

Index-related issues due to the many kinds of indexes:

Potential solutions

Smarter UI for using / selecting index patterns

Some consequences - Need shared UI; code consuming index patterns shouldn't assume index patterns are complete. Is it possible to select a time field without having a field mapping?

Smarter relationship with field mapping

Some consequences - Need to make sure loading field mapping is async

Question - How expensive is generating the field mapping?

Index naming concerns

I'm not sure how to address this one.

Major related issues & enhancement requests

elasticmachine commented 5 years ago

Pinging @elastic/kibana-app-arch

rayafratkina commented 5 years ago

Thanks @mattkime, very helpful writeup! I think another pain point is the lack of visibility when something has changed in the indexes and index patterns need to be rebuilt.

Also, not sure how to factor in thinking about scripted fields and, in general, ways to customize data.

For reference, here are related issues

nreese commented 5 years ago

Another thing to consider is Field Level Security (FLS) https://github.com/elastic/kibana/issues/8192. Index Pattern saved objects expose field names to all users.

https://github.com/elastic/kibana/issues/34334 was opened against the Maps application but is an Index Pattern issue.

kobelb commented 5 years ago

User needs suffix / prefix for index names to use index pattern, might not have it

Would you mind elaborating on this?

mattkime commented 5 years ago

Would you mind elaborating on this?

If there's no pattern to your index names then you'll have trouble constructing an index pattern.

kobelb commented 5 years ago

For what it's worth, it's not intuitive at all but you can create an index pattern for foo,bar and it'll let you query both foo and bar at the same time.
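To make the comma-separated behavior concrete, here is a minimal sketch of how a pattern such as foo,bar could be matched against index names. The helper names (escapeRegExp, matchesIndexPattern) are hypothetical and this is not Kibana's or Elasticsearch's actual resolution logic; it ignores exclusions and date math and only handles commas and `*` wildcards.

```typescript
// Hypothetical helper, not Kibana's implementation: escapes characters that
// have a special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns true if indexName is matched by any comma-separated part of the
// pattern. Each part may contain `*` wildcards.
function matchesIndexPattern(pattern: string, indexName: string): boolean {
  return pattern
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .some((part) => {
      // Turn each `*` into `.*` and escape everything else literally.
      const source = part.split("*").map(escapeRegExp).join(".*");
      return new RegExp(`^${source}$`).test(indexName);
    });
}

// Made-up index names for illustration.
const indices = ["foo", "bar", "baz", "foo-2019"];
const matched = indices.filter((name) => matchesIndexPattern("foo,bar", name));
```

With these made-up names, "foo,bar" selects only foo and bar, while "foo*,bar" would also pick up foo-2019.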

ruflin commented 5 years ago

Can't create index pattern unless index and document exist

You can, and Beats and APM do this today. The problem this brings is that Kibana shows an error if no data is there yet.

When we rethink index patterns, one additional thing I would like to see considered is that index patterns will likely be managed more and more from the outside, meaning created by module definitions. For this it would be nice to have something like index pattern inheritance, meaning I can have 5 small index patterns which all apply to the same indices (similar to inheritance in index templates).

mattkime commented 5 years ago

For what it's worth, it's not intuitive at all but you can create an index pattern for foo,bar and it'll let you query both foo and bar at the same time.

I don't think it needs to be intuitive, but it should be explained in the user interface

ruflin commented 5 years ago

Adding two more things to the wishlist here:

monfera commented 4 years ago

While I agree with the above pain points, it's hard to address them incrementally, one by one, with the expectation that if all are solved, all will be good, or that fixing them one step at a time is optimal. In the same way, it's hard to incrementally modify a slow helicopter into a fast airplane, though both are well understood things.

In this case, it doesn't even feel like a well-defined construct; more like an evolved hub that links certain things to other things, doing so informally. So it'd be great to tackle all these in one fell swoop, in the form of a conceptual design. See also my comment on renaming.

mattkime commented 4 years ago

https://github.com/elastic/kibana/issues/40071

mattkime commented 4 years ago

We had a video chat regarding index patterns. While the intention was to talk about naming, particularly the index pattern string (apm-*) vs the index pattern object, the conversation strayed through all index pattern concerns.

Todo - why do these problems exist? They're likely solving a problem. Does the problem being solved still exist?

@jasonrhodes pointed out that index patterns aren't referenced in the ES documentation. It's just a reference to indices with a wildcard. Interesting distinction.

@sqren says ES metadata may be able to replace much of the field caching.

@jasonrhodes has some scars from working around the existing index patterns API. Some development teams hack around it. We should talk to him and other people to understand how the API isn't working well for them.

monfera commented 4 years ago

I mentioned that

  1. We're a search company, indexes themselves should be easy to search/select, we shouldn't have artificial obstacles for the users, ie. tools such as Lens, Graph etc. should be able to work from ES data and metadata directly (several folks said that it looks like most if not all field metadata are now in ES too)
  2. By extension, index patterns could just go away as a requirement
  3. Instead of the index patterns, we could have views (maybe named as views, as common in DBMS, maybe something else) for the purpose of controlling access (eg. group of indices and allowed document fields) for users or user roles
  4. It would also be natural to link computed fields to said views (as computed fields may also be subject to access control), and eg. as SAP and other large entity management systems do, introduce some concept of data domains and data elements, which could include choices like default field formatting

So, in actionable terms,

  1. Where currently an index pattern is needed, just let folks use an index, or an arbitrary set of index selections using regex matching (it might be a hidden object creation if need be, but then the problem is statefulness, resource allocation and possible accumulation, so ideally there's no object, or just something transient)
  2. The index pattern (objects) could be renamed as Views or similar; where users can specify an index (or group of indices) directly, they can also specify a "view index"
  3. These views would have proper, user-given names, rather than using the index group as the name, for numerous reasons
  4. Users or user roles could be assigned with indices (all docs and all fields authorized) or Views (with fine-grained access control, more restricted than full indices)

ruflin commented 4 years ago

@monfera When you mention "views" in the above I always replace it with "alias" in my head.

justinwalz commented 4 years ago

Hi, I was directed here from the discuss forums https://discuss.elastic.co/t/update-index-pattern-to-only-add-fields/200489.

Feature request would be to add an option to the index pattern refresh operation to only add new fields, not remove older ones. Our use case is that we retire old logs somewhat aggressively, so if we refresh during a period where there are no logs with a given field, it will get removed from the pattern. This ends up breaking visualizations.

Thanks, Justin
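A rough sketch of what such an "add only" refresh could look like, assuming the field list is just an array of name/type pairs. FieldSpec and mergeFieldsAdditively are hypothetical names for illustration, not part of any Kibana API.

```typescript
// Hypothetical, simplified field description.
interface FieldSpec {
  name: string;
  type: string;
}

// Additive refresh: keeps every previously known field and adds or updates
// the fields reported by the latest mapping lookup. Nothing is removed.
function mergeFieldsAdditively(existing: FieldSpec[], fresh: FieldSpec[]): FieldSpec[] {
  const byName = new Map<string, FieldSpec>();
  for (const field of existing) byName.set(field.name, field);
  for (const field of fresh) byName.set(field.name, field); // fresh definition wins
  return Array.from(byName.values());
}

// Made-up example: "error.code" is absent from the current logs but survives.
const merged = mergeFieldsAdditively(
  [
    { name: "@timestamp", type: "date" },
    { name: "error.code", type: "keyword" },
  ],
  [
    { name: "@timestamp", type: "date" },
    { name: "user.id", type: "keyword" },
  ]
);
const mergedNames = merged.map((field) => field.name);
```

The trade-off is that stale fields accumulate until someone removes them explicitly, which may or may not be acceptable depending on the deployment.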

monfera commented 4 years ago

@ruflin Good call, index aliases seem to be closely related.

tl;dr: What an index alias can do is covered by what I meant by the broader view (which would also cover field selections, scripted fields, format info etc. to preserve current index pattern functionality and be in sync with the general meaning of view, which is an accepted term for RDBMS views as well as noSQL eg. MongoDB views - great list of motives under both links!)


If I understand, index aliases can:

  1. Be a stand in for one index, or multiple indices (in that, they seem to fully overlap with the Kibana regex pattern matching that we call "index patterns") - where one could use an index (or more), one can also use an alias. So this is already a major redundancy between ES and Kibana. Can someone tell why Kibana is not just piggybacking on aliases? I can think of reasons but don't want to second-guess.
  2. Filter results (I assume it can be an arbitrary ES / Lucene query, but I suppose, no SQL, KQL, timelion etc.) - while index patterns don't do that, there's some redundancy as Kibana uses filters here and there, maybe it even has integration with aliases. So it's not relevant for index patterns right now but chances are, we would've added filtering functionality to index patterns eventually 😄 Because if index patterns can select "columns", why not "rows"?
  3. Routing and possibly other neat stuff - irrelevant for index patterns now
  4. Restricting document fields, or other field-specific config seems not to be part of index aliases.

In short, aliases seem like a good foundation for the index pattern part of index patterns.

Now it turns out there's another type of alias, the alias datatype, which is neat in that - if I understand - one can specify a top-level alias field to avoid spelling out a nested document field path each time. So it's somewhat related to index patterns. They're also related to field capabilities which can supply Kibana with info such as "is aggregation for this field in this index supported?". We're on the hunt for metadata, and tools like Lens may benefit from such info if not already using these (any comment @chrisdavies @wylieconlon @flash1293?)
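As an illustration of the kind of metadata field capabilities can provide, the sketch below checks whether a field is aggregatable across all matched indices. The types follow the general shape of an Elasticsearch `_field_caps` response as I understand it; the helper name isAggregatableEverywhere and the sample data are made up, and no request is actually sent.

```typescript
// Assumed (simplified) shape of one entry in a _field_caps response.
interface FieldCapability {
  type: string;
  searchable: boolean;
  aggregatable: boolean;
  indices?: string[];
  non_aggregatable_indices?: string[];
}

interface FieldCapsResponse {
  indices: string[];
  // field name -> mapping type -> capabilities
  fields: Record<string, Record<string, FieldCapability>>;
}

// Hypothetical helper: true only if every mapping type reported for the field
// is aggregatable and no index is listed as non-aggregatable.
function isAggregatableEverywhere(response: FieldCapsResponse, fieldName: string): boolean {
  const capsByType = response.fields[fieldName];
  if (!capsByType) return false; // unknown field
  return Object.values(capsByType).every(
    (caps) => caps.aggregatable && (caps.non_aggregatable_indices || []).length === 0
  );
}

// Hand-written sample data, not output from a real cluster.
const sample: FieldCapsResponse = {
  indices: ["logs-a", "logs-b"],
  fields: {
    bytes: { long: { type: "long", searchable: true, aggregatable: true } },
    message: { text: { type: "text", searchable: true, aggregatable: false } },
    status: {
      keyword: { type: "keyword", searchable: true, aggregatable: true, indices: ["logs-a"] },
      text: { type: "text", searchable: true, aggregatable: false, indices: ["logs-b"] },
    },
  },
};
```

In this sample, bytes would be offered for aggregations, while message and the conflicting status field would not.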

The term alias is an ES-specific term, and implies substitution of an original item (or more) with a shorthand. I think the filtering capability already stretches the meaning of alias.

The term view is broader (DBMS - irrespective of relational, schema-free etc. - or even comp sci) and it could cover

View is then a general capability to

A nontrivial but essential part is figuring out where to draw the line between ES and Kibana, and how to still benefit from ES abstractions where Kibana also has overlapping abstractions.

weltenwort commented 4 years ago

Be a stand in for one index, or multiple indices [...] - where one could use an index (or more), one can also use an alias. So this is already a major redundancy between ES and Kibana. Can someone tell why Kibana is not just piggybacking on aliases?

There is a major limitation of Elasticsearch index aliases compared to the index glob expressions used on the front-end: Index aliases can't span clusters using CCS (elastic/elasticsearch#43312).

With CCS being a more recent concept, this was certainly not part of the initial consideration of Kibana index patterns, but it has become an important use-case.
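For context, cross-cluster index expressions prefix the index part with a remote cluster name and a colon (for example cluster_one:logs-*). A minimal sketch of splitting such an expression follows; IndexTarget and parseIndexExpression are hypothetical names, the cluster names are invented, and the parser deliberately ignores date math expressions, which can also contain colons.

```typescript
// Hypothetical representation of one part of an index expression.
interface IndexTarget {
  cluster: string | null; // null means the local cluster
  index: string;
}

// Splits a comma-separated expression into targets, separating an optional
// "cluster:" prefix from the index name or wildcard.
function parseIndexExpression(expression: string): IndexTarget[] {
  return expression
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const separator = part.indexOf(":");
      return separator === -1
        ? { cluster: null, index: part }
        : { cluster: part.slice(0, separator), index: part.slice(separator + 1) };
    });
}

const targets = parseIndexExpression("logs-*,cluster_one:logs-*,cluster_two:metrics-*");
```

An alias lives inside a single cluster, so it has no equivalent of the mixed local and remote targets shown here.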

cjcenizal commented 4 years ago

pointed out that index patterns aren't referenced in es documentation. its just a reference to indices with a wildcard. interesting distinction.

The term index_patterns is part of the index templates API. Possibly it originated there?

mattkime commented 4 years ago

Metric Explorer is doing custom, index-pattern-like work. It needs closer inspection to see how it relates to this conversation; I just wanted to reference it before I forget. https://github.com/elastic/kibana/pull/43322

and Lens - https://github.com/elastic/kibana/tree/master/x-pack/legacy/plugins/lens/server/routes and https://github.com/elastic/kibana/blob/master/x-pack/legacy/plugins/lens/public/indexpattern_plugin/indexpattern.tsx#L36

AndrewMcQuerry commented 4 years ago

pointed out that index patterns aren't referenced in es documentation. its just a reference to indices with a wildcard. interesting distinction.

The term index_patterns is part of the index templates API. Possibly it originated there?

The usage of index_patterns in the index template API has only existed since 6.x. In 5.x and prior, the field was called template. Thus, the Kibana Index Pattern terminology has been around much longer than the Index Template API.

Each of those two uses of the term "index pattern" can exist without the other and they have no connection other than coincidental naming overlap.

AndrewMcQuerry commented 4 years ago

Historically, we have created various Kibana Index Patterns, with varying use of wildcards since that was the easiest thing to do. However, in practice, that has also caused an extremely confusing list of multiple Index Patterns each with a different purpose.

For example: logs (@timestamp) logs-elasticsearch (@timestamp) logs-kibana (@timestamp) logs- (ingest_timestamp)

In addition, creating these Index Patterns in Kibana results in a random "Index Pattern ID" being chosen during creation. This can be problematic if someone accidentally deletes the Index Pattern: even after recreation, the Index Pattern ID would be a new ID and all existing saved objects would no longer reference a valid Index Pattern ID.

We are looking to address this by:

  1. No longer create Index patterns using wildcard.
  2. No longer allow Kibana to set a randomly generated Index Pattern ID.
  3. Anything that is to be an "Index Pattern" must first be created as an Alias.
  4. Only Aliases are then allowed to be created as Index Patterns in Kibana.
  5. The "Index Pattern ID" is manually set during creation to be equal to the Index Pattern / Alias.
  6. Curator or ILM is used to manage which indexes are associated with each Alias.
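A sketch of steps 2 and 5, assuming the index pattern is created through Kibana's saved objects HTTP API with the ID placed in the URL. The function only builds a description of the request and does not send it; buildIndexPatternRequest and IndexPatternRequest are hypothetical names, and the path and the kbn-xsrf header reflect my understanding of that API rather than anything confirmed in this thread.

```typescript
// Hypothetical description of an HTTP request to Kibana.
interface IndexPatternRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Builds a request that creates an index pattern whose ID and title are both
// equal to the alias name, so recreating it yields the same ID.
function buildIndexPatternRequest(alias: string, timeFieldName: string): IndexPatternRequest {
  return {
    method: "POST",
    // Assumption: the saved object ID is taken from the last path segment.
    path: `/api/saved_objects/index-pattern/${encodeURIComponent(alias)}`,
    headers: { "Content-Type": "application/json", "kbn-xsrf": "true" },
    body: JSON.stringify({ attributes: { title: alias, timeFieldName } }),
  };
}

const request = buildIndexPatternRequest("logs-kibana", "@timestamp");
```

Because the ID is derived from the alias, saved objects that reference it should keep resolving after an accidental delete and recreate.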

craigboman commented 4 years ago

I'm not sure if the use of the term "alias" makes a lot of sense from an end user perspective. Alias in my mind complicates the discussion even further. Despite this already being an ES label we'd like to borrow into Kibana, an alias doesn't mean anything to Kibana users, the end user like myself.

As we all may know, an index "is a list of words or phrases and associated pointers to where useful material relating to that heading can be found in a document or collection of documents." (Wikipedia). To a librarian like myself, what ES devs are referring to as an alias is just an index. An index/alias, in this case, is a list of pointers to where more information can be found. If I index all of my daily audit logs into ES/Kibana under the index label/pattern "audit-*", this would allow me to collect all of the pointers to data for those dates matching the same index pattern ("audit-2019-11-30", or "audit-2019-11-31", etc). From my perspective "audit-*" is the audit index with pointers to "audit-2019-11-30". If all of my data in audit-* has the exact same mapping, it's just one index, not multiple indices; if I'm streaming audit data into the audit index using the audit index pattern, it's still just the audit index. I would also go as far as to separate out the variable "*" from the index pattern label. For instance, instead of "audit-*" it should just be the "audit" index. Yes, this can be done through the custom index label name, but the custom label should just be the default rather than custom.

This also makes sense from the perspective of a front-end Kibana user, in the sense that when I want to change my index pattern, I'm really just changing the index against which I'm trying to visualize. I'm not changing my "alias" when I visualize; that sounds weird. From the front-end, I would highly recommend just dropping the use of index pattern when referring to visualizations. So when I'm visualizing, I would only need to change my index; the "pattern" is extraneous.

Thanks in advance for this consideration. As always: love Kibana. Keep up the great work. :)

petrklapka commented 2 years ago

Thank you for contributing to this issue, however, we are closing this issue due to inactivity as part of a backlog grooming effort. If you believe this feature/bug should still be considered, please reopen with a comment.