elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
68.63k stars 24.37k forks source link

Deprecate aggregation ValueScripts in favor of Runtime Fields #69291

Open not-napoleon opened 3 years ago

not-napoleon commented 3 years ago

Value scripts provide a tool for modifying field values at query time. They can be used for things like applying a correction or normalization to values, or changing the types of a field (e.g. parsing a number out of a string field). They also add a lot of complexity to the type system aggregations use (ValuesSource), and their behavior with missing values is not obvious (e.g. if you specify a value script and a missing value, is the script applied to the missing value?). There are also some tricky issues around value scripts (e.g. #53135)

Enter Runtime Fields. By and large, runtime fields solve the same problem - dynamically modifying a field value at query time. Runtime fields are well integrated into the mapping framework and have much better test coverage than value scripts. They're more centrally documented and present a clear, global abstraction as opposed to an implementation detail of an aggregation.

This issue is to discuss if, how, and when we should drop support for value scripts in favor of runtime fields.

Note: this is only referring to value scripts (e.g. https://www.elastic.co/guide/en/elasticsearch/reference/7.11/search-aggregations-metrics-min-aggregation.html#_value_script_6) not "regular" scripts. Value scripts are intended to operate on a single field, while the more general case operates on a whole document.

elasticmachine commented 3 years ago

Pinging @elastic/es-analytics-geo (Team:Analytics)

javanna commented 3 years ago

Thanks @not-napoleon I can think of a couple more places where scripts could be replaced in the search API in favour of runtime fields. Maybe we should discuss these altogether and come up with a deprecation plan for them all.

elasticmachine commented 3 years ago

Pinging @elastic/es-search (Team:Search)

nik9000 commented 3 years ago

Here are a few places where we document script parameters that aren't runtime fields. grep ain't great for this, but its what I have at hand.

$ find docs/reference/ -type f | grep -v ingest | grep -v reindex | grep -v /update | grep -v /bulk | grep -v predicate-tokenfilter | grep -v /runtime | xargs grep -n \"script\" | sort | sed 's/^/* [ ] /'

Anyway, I'm going to rewrite a few of these using runtime fields.

javanna commented 3 years ago

Thanks Nik! I removed the team-discuss label, as we discussed this with the search team, and decided that we should make this issue broader and list all of the different script contexts that we support in the search API and decide which ones can be replaced by runtime fields. I will work on this.

nik9000 commented 3 years ago

Sounds good! Mind if I do the aggs ones? I'm sort of in search of something like this.

javanna commented 3 years ago

please go ahead ;)

nik9000 commented 3 years ago

I'm going to go through my list and convert the scripts to runtime fields where I can. That won't really close this issue because we want to settle on what to deprecate and what not to deprecate. But it'll be nice to update the docs!

javanna commented 3 years ago

The deprecation of the different flavor of scripts in the search API in favor of runtime fields is under discussion, but it's important to note that Kibana scripted fields rely on them in different places, and are not going to be removed in 8.0, hence we are not ready to deprecate in Elasticsearch. We are working on mentioning runtime fields in as many places as possible in the docs (thanks @nik9000 ) to make sure that users start using runtime fields which are more powerful. On the long run, we do expect runtime fields to replace a lot of script usages in search, but we are not quite ready to do that just now.

a03nikki commented 3 years ago

I don't know if this page needs to be updated as well or not, but figured I would mention it as the other doc updates relating to using runtime fields over the (old) scripted fields were done under this ticket.

https://www.elastic.co/guide/en/elasticsearch/reference/current/transform-painless-examples.html

(1) Painless examples for transforms ... (1.2) Getting time features by using aggregations ... (1.4) Counting HTTP responses by using scripted metric aggregation ...

(I added the numbering to the section headers to make it a little easier to follow compared to the 7.13 deployed version of the page.)

The example from (1.2) "Getting time features by using aggregations" uses a script in a average aggregation. This lead me to https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html#_script_2 which uses runtime fields instead of the (older) "script" option on the aggregation. It doesn't mention at all the older (likely deprecated) aggs.{name}.{agg}.script field. The one for the terms agg is also gone as well.

There is also no mention on the runtime field limitations page that they cannot be used with data transforms. Only there is a performance hit for runtime fields for the user-defined time field.

I did not test it, but I would think the two sections (1.2 and 1.4) could use runtime fields instead of an aggregation script.