ECS Standard index names

yoda-sec commented 5 years ago

I looked through the ECS repo and other open issues and wasn't able to find anything related to index names. Does the ECS standard have any plans to define index naming conventions to make it easier to correlate similar types of data from different data sources? For example, if I am researching user authentication events for "jsmith", I may want to review audit logs from Windows, Linux, VPN, MFA, O365, etc. and would typically want to start with one Kibana query or one dashboard that gives me information from all those data sources.

Is there any plan to "map" these types of events to a standard "audit" index, or at least to a standard device type index, to make it easier to share alerting and visualization resources across the Elastic user base?

MikePaquette commented 5 years ago

@yoda-sec Great topic! Currently, indexing strategies, including naming, are beyond the scope of the ECS specification. However, I think this is a good discussion since, as you say, re-use and sharing of analysis content will be affected by index pattern selection. Let's use this issue for ideas and discussions.

webmat commented 5 years ago

I would reserve index name tweaks for more straightforward situations. For example, Beats does use the beat name and version in index names, so people can work around breaking upgrades when they happen, or grab everything, when all versions in use are aligned.

For ECS, however, the potential number of indices that follow ECS is too high. Your Beats indices will (soon) be ECS, your Logstash pipelines for various things may follow ECS as well, and finally some partner / third party event streams may also follow it eventually. Getting everyone to align correctly on index naming would not really be possible.

So depending on the environment complexity, I see a few ways we can grab ECS data broadly:

- Query all your indices with _exists_:ecs.version
- Query all your indices with ecs.version:x.* when you need, for example, something you know is only available in ECS version X.Y and later.
- If you have too many indices for these to work, you may want to maintain index aliases that get you only the relevant indices for your needs.
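
The first two options above can be sketched as search request bodies. This is only an illustration, assuming the standard `query_string` query and a search sent across all indices; the helper names are hypothetical and no client library is involved.

```python
import json


def ecs_any_version_query():
    """Search body matching any document that has an ecs.version field."""
    return {"query": {"query_string": {"query": "_exists_:ecs.version"}}}


def ecs_major_version_query(major):
    """Search body matching documents whose ecs.version starts with '<major>.'."""
    return {"query": {"query_string": {"query": f"ecs.version:{major}.*"}}}


if __name__ == "__main__":
    # These bodies would be sent with a search request against all indices.
    print(json.dumps(ecs_any_version_query()))
    print(json.dumps(ecs_major_version_query(1)))
```

Either body could be narrowed further by combining it with the actual search terms, for example a user name.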

yoda-sec commented 5 years ago

What does "Your Beats indices will (soon) be ECS, your Logstash pipelines for various things may follow ECS as well" refer to? Is that referring to field names or something in the index name tied to ECS?

What about if some type of standard was built around index aliases to provide flexibility and options for how folks like to manage their indices? Looking at process events (since it's fresh in my mind from the other issue :) ), what if ECS said the standard index for sharing content related to endpoint processes was called "sampleindex". All dashboards, watchers, and hopefully other open source tools could then create content around querying this "sampleindex" only (which pushes reusability and sharing in the community).

Anyone who wishes to use shared content would be expected to create an index alias called "sampleindex" that matches whatever their custom Winlogbeat/Sysmon/Carbon Black/etc index naming pattern is for their process data (you could potentially even suggest adding a filtered alias that requires ecs versions to exist or process.* to exist). Thoughts?
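
As a rough sketch of the filtered alias idea, the request body for the Elasticsearch `_aliases` API could be built like this. The alias name "sampleindex" comes from the comment above; the index patterns and the helper name are made up for illustration, and the filter only checks that ecs.version exists.

```python
def filtered_alias_actions(alias, index_patterns, required_field="ecs.version"):
    """Build an _aliases request body that adds a filtered alias.

    Only documents where `required_field` exists are visible through the alias.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": pattern,
                    "alias": alias,
                    "filter": {"exists": {"field": required_field}},
                }
            }
            for pattern in index_patterns
        ]
    }


if __name__ == "__main__":
    # Hypothetical index patterns; replace with your own naming scheme.
    body = filtered_alias_actions("sampleindex", ["winlogbeat-*", "sysmon-*"])
    print(body)
```

Shared dashboards would then query "sampleindex" only. Note that an alias added this way covers the indices that exist when the request is made, so newly created indices would probably need the alias declared in an index template as well.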

webmat commented 5 years ago

Hey @yoda-sec, sorry for the delay here. Let me address a few of your points. Please hit me back if you have more questions.

In the Elastic Stack v7, all Beats and their modules will follow ECS as much as possible, out of the box. Also note:
- The overhauled Kibana migration assistant (6.6, but more so in the upcoming 6.7) will also help you prepare for this migration, help you reindex if you want, etc.
- Following the introductory webinar and blog post, the next blog post will be on how to migrate to ECS. Stay tuned :-)
On index naming / aliases: if a convention emerges from the community, we'll definitely take that into account. Some of what we're working on may actually use index aliases to gather all of the relevant data, too. So I'm sure we'll get there :-)
- Related, if you want to experiment with this, index aliases can have a filter, so only some of the events will come through the alias

webmat commented 4 years ago

Closing as stale. There's no plan to define guidance on index naming.

ypid-geberit commented 4 years ago

I spent some time coming up with an index naming schema that we have been using for some time now. The details are specified in https://github.com/geberit/elastic-helpers/blob/master/Naming%20conventions.md#version-2

Any input is welcome.

ypid-geberit commented 3 years ago

"There's no plan to define guidance on index naming."

Seems it is still happening.

People subscribed to this closed issue might be interested in #980 and https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme.

ebeahan commented 3 years ago

Thanks for the follow-up, @ypid-geberit.

I wanted to clarify that ECS will note these naming guidelines and restrictions for the data_stream.* fields to align with the new indexing strategy. Still, ECS does not provide any index naming guidance of its own. Indexing strategies, including naming, remain out of scope for ECS.

The data_stream.* fields and their naming scheme work in tandem with data streams as part of the new indexing strategy for time series data. Sources adopting this new strategy (such as the Elastic Agent) need to follow the data stream's naming guidelines and restrictions.
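
For readers unfamiliar with the scheme, here is a minimal sketch of how a data stream name is composed from the data_stream.* fields, assuming the `{type}-{dataset}-{namespace}` pattern described in the linked blog post. The function name is hypothetical, and the only restriction checked here is that dataset and namespace must not contain a hyphen; the full set of restrictions is in the official documentation.

```python
def data_stream_name(ds_type, dataset, namespace):
    """Compose a data stream name as <type>-<dataset>-<namespace>.

    Raises ValueError if dataset or namespace contains '-', since the
    hyphen is the separator between the three parts.
    """
    for label, value in (("dataset", dataset), ("namespace", namespace)):
        if not value:
            raise ValueError(f"data_stream.{label} must not be empty")
        if "-" in value:
            raise ValueError(f"data_stream.{label} must not contain '-'")
    return f"{ds_type}-{dataset}-{namespace}"


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(data_stream_name("logs", "nginx.access", "default"))
```

Because the separator is fixed, an index pattern such as `logs-*-*` can select all log data regardless of source, which is close to what the original question asked for.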

elastic / ecs

ECS Standard index names #313