Open ruflin opened 2 years ago
Pinging @elastic/es-search (Team:Search)
You should be able to use a field alias to "float" it out. But I've marked this as discuss so folks can talk and see what they'd like to do.
heya @ruflin could you expand on why having a common prefix is a limitation and why you'd need to send these fields to the top-level? Do they replace some existing fields?
Wouldn't the field alias/additional runtime field solution be equivalent to a built-in solution when it comes to preventing a field with the same name from holding data? This may stem from the fact that field aliases are defined under properties, so maybe I would try out a runtime field, which does not prevent you from indexing data into a field with the same name, although that indexed field would be shadowed and not accessible at search time.
When we ingest data, we try to map it to ECS. The above example is a simplified example of nginx logs. Currently we do it all via ingest pipelines, but in many cases we don't have to index all the data and would like to do it with runtime fields instead. The expected outcome is that if we extract some fields, these should still be in ECS, for example source.ip and http.request.method.
I couldn't fully follow your second comment. But my ideal outcome would be that I could have documents that have source.ip as an indexed field inside and other documents on the same data stream where it is a runtime field. But that is an additional goal after I can do the correct mapping to ECS fields. Even better would be if I could convert my composite runtime field to an indexed field like I can do for other runtime fields.
But my ideal outcome would be, that I could have documents that have source.ip as an indexed field inside and other documents on the same data stream where it is a runtime field.
I see, but then each index would either have the field as a runtime field or as an indexed field? This reminds me of the discussion happening in #86536. The way we have envisioned these changes so far is that they happen at the next rollover, hence you would not have an index with a mixed approach. In that case a field alias should work? What I was hinting at with the second part of my comment is that field aliases could be re-implemented as runtime fields. Effectively you can already implement a field alias through a runtime field, but you need to define a script for it, which is not fantastic for the user experience. If a field alias is defined under runtime, an indexed field with the same name can still be mapped under properties, although shadowed. Though I was questioning whether this is a concern at all, assuming that each index should have only one variant of the field in question.
Converting a composite runtime field to indexed is on the roadmap, see #77625.
Great to see composite runtime fields on the roadmap and https://github.com/elastic/elasticsearch/issues/86536 is interesting indeed.
Taking all of the above, going back to the initial question and putting aside the discussion around whether the runtime or the mapped field is the default at query time, I would still like to be able to set source.ip directly in the composite runtime field. Does my explanation around ECS help explain why this is needed?
I had a chat with @ruflin and I now have a better understanding of the problem. Field aliases can only point to indexed fields, mapped under properties, and not to runtime fields. The current workaround is to create a runtime field with a script that emits the value of example.source.ip. One follow-up could be that field aliases should really be implemented as runtime fields (see #87969). Even better, one may wonder why there is a need to declare a second field to expose the grok sub-field to the top-level. We discussed this last point quite a bit when we were designing the composite runtime field, but it does not hurt to look back and discuss it again.
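The workaround mentioned above (a runtime field whose script re-emits the value of example.source.ip) could be sketched roughly as follows. This is only an illustration: the helper name forwarding_runtime_field is hypothetical, the Painless script is an assumption about how such a forwarding script might be written, and the code only builds the request body for a mappings update rather than talking to a cluster.

```python
import json


def forwarding_runtime_field(target: str, source: str, field_type: str) -> dict:
    """Build a mappings body with a runtime field `target` that re-emits `source`.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    # Assumed Painless script: only emit when the source field has a value,
    # so documents without it do not cause a script error.
    script = f"if (doc['{source}'].size() != 0) emit(doc['{source}'].value)"
    return {
        "runtime": {
            target: {
                "type": field_type,
                "script": {"source": script},
            }
        }
    }


# Expose the composite sub-field example.source.ip as a top-level source.ip.
workaround = forwarding_runtime_field("source.ip", "example.source.ip", "ip")
print(json.dumps(workaround, indent=2))
```

One such field has to be declared for every sub-field that should appear at the top level, which is the user-experience cost discussed above.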
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Composite runtime fields are especially useful in the context of grok / dissect to extract multiple fields at once. AFAIK there is currently a limitation that all of these runtime fields need to share the same prefix, which means they cannot be mapped to ECS properly. Below is a simplified example to demonstrate the problem.
The data should be in the source.ip field, but because of the limitation it is in example.source.ip. I tried to have an alias from source.ip to example.source.ip to at least get the query to work, but I would also argue this is not a great solution, as it would prevent having documents with actual data in the source.ip field itself.
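For illustration, a mapping of the kind described might look like the sketch below. It is not the original example from this issue: the grok pattern, the message field, and the helper names are assumptions, and the code only builds the mappings body and lists the field names that would end up searchable. It assumes a composite runtime field named example whose sub-fields are declared under fields.

```python
import json

# Hypothetical grok pattern; the named captures are assumed to become
# the keys of the map that the script emits.
GROK_PATTERN = "%{IP:source.ip} %{WORD:http.request.method}"


def composite_runtime_mapping(parent, pattern, fields):
    """Build a mappings body with one composite runtime field (illustrative helper)."""
    # Assumed script shape: grok the message field and emit all captures at once.
    script = f'emit(grok("{pattern}").extract(doc["message"].value))'
    return {
        "runtime": {
            parent: {
                "type": "composite",
                "script": script,
                "fields": {name: {"type": t} for name, t in fields.items()},
            }
        }
    }


def exposed_field_names(mapping):
    """Return the names under which the sub-fields are exposed: always parent.sub."""
    names = []
    for parent, definition in mapping["runtime"].items():
        for sub in definition.get("fields", {}):
            names.append(f"{parent}.{sub}")
    return sorted(names)


mapping = composite_runtime_mapping(
    "example",
    GROK_PATTERN,
    {"source.ip": "ip", "http.request.method": "keyword"},
)
print(json.dumps(mapping, indent=2))
print(exposed_field_names(mapping))
```

Every sub-field carries the name of the composite field as a prefix, so the ECS names source.ip and http.request.method are not reachable directly.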