Open jpountz opened 4 years ago
Pinging @elastic/es-search (:Search/Search)
@jpountz When I was thinking about the changes API, one use-case that I thought for our own products was exactly the Logs application and tailing logs there. I'm curious if you've thought about this in that context as well?
@jasontedor I have thought about it indeed. I don't think that it will be solved entirely by the Changes API because I feel like global ordering by @timestamp is important for the user experience, and I'm not seeing global ordering as a feature of the Changes API. But building on top of the Changes API might be convenient. Please let me know if you had different expectations.
We don't need the entire feature set of the Changes API, e.g. I don't think we would need to be informed about deletions, so another option might be to use _search and search_after on the _seq_no and/or @timestamp fields at the shard level (both have different pros/cons).
Either way we'd need something on top in order to provide global ordering by @timestamp as much as possible. E.g. I believe that we'll want to ignore events that are too recent, because there might be older events that are not visible yet because they are still indexing or not refreshed yet; these documents would only be returned on a following page.
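To illustrate the idea of ignoring events that are too recent, here is a minimal sketch in plain Python. It is not an Elasticsearch API: the `emit_in_order` helper, the `(timestamp, message)` event shape, and the `safety_lag_ms` parameter are all hypothetical names chosen for the example. It merges per-shard events by timestamp and holds back anything newer than a watermark so that it can be returned on a later page.

```python
import heapq
from typing import Iterable, List, Tuple

Event = Tuple[int, str]  # (timestamp in ms, message); hypothetical shape


def emit_in_order(
    shard_events: List[Iterable[Event]], now_ms: int, safety_lag_ms: int
) -> Tuple[List[Event], List[Event]]:
    """Merge per-shard events by timestamp and hold back the too-recent ones."""
    # Events newer than the watermark may still be overtaken by older events
    # that are not visible yet, so they are deferred to a following page.
    watermark = now_ms - safety_lag_ms
    merged = heapq.merge(*[sorted(events) for events in shard_events])
    ready: List[Event] = []
    held_back: List[Event] = []
    for event in merged:
        (ready if event[0] <= watermark else held_back).append(event)
    return ready, held_back


shards = [[(100, "a"), (400, "d")], [(200, "b"), (950, "e")]]
ready, held_back = emit_in_order(shards, now_ms=1000, safety_lag_ms=100)
```

With a lag of 100 ms the event at timestamp 950 is held back, while the older events are emitted in global timestamp order. Choosing the lag is a trade-off between latency and the risk of missing late arrivals.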
We discussed it today as a group. This generally felt useful, and while both _search and the Changes API could be building blocks for this functionality, the Changes API is a more natural fit:
_search
_search requires polling. So building on the Changes API will help expose this API as a stream that clients can register to as well.
This raises interesting questions that we'll need to think about:
Depends on #1242
Thanks for considering this :tada:
While it makes total sense not to duplicate the effort for both APIs, I would consider one property pretty important: it should be possible to achieve a consistent ordering in both the Changes API and _search. Is that realistic?
The reason is that the latter is probably still going to be used when fetching log entries for past time intervals.
@weltenwort The idea would be that whatever we end up exposing would take care of fetching log entries for past intervals too. The problem with _search is that it can't guarantee ordering across pages (it only guarantees it within a single page), so either a later page would include events that are older than some events from previous pages, or it would mistakenly ignore some logs if search_after is used.
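The second failure mode can be reproduced with a toy model. The sketch below is not a call to Elasticsearch: `fetch_page` is a hypothetical stand-in for a paginated search sorted by ascending timestamp, and the `visible` list simulates documents becoming searchable after a refresh.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, str]  # (timestamp, message); hypothetical shape


def fetch_page(
    visible: List[Event], search_after: Optional[int], size: int
) -> List[Event]:
    """Toy paginated search: ascending by timestamp, strictly after the cursor."""
    hits = sorted(
        e for e in visible if search_after is None or e[0] > search_after
    )
    return hits[:size]


# First page: only the events at timestamps 1 and 3 are visible so far.
visible = [(1, "a"), (3, "c")]
page1 = fetch_page(visible, None, 10)
cursor = page1[-1][0]  # cursor is now 3

# An older event becomes visible late, alongside a genuinely new one.
visible.append((2, "b"))
visible.append((4, "d"))
page2 = fetch_page(visible, cursor, 10)
```

Here the late event at timestamp 2 never appears on any page, because the cursor has already moved past it. Dropping the cursor avoids the loss but brings back the out-of-order delivery instead.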
That sounds like it would solve the search_after tiebreaker problem for us :heart_eyes: Let me know if you want to validate any API design choice in regard to the Logs UI use case early in the process.
We'll certainly reach out when we start tackling this issue!
Pinging @elastic/es-search (Team:Search)
Pinging @elastic/es-search-foundations (Team:Search Foundations)
Elasticsearch is often used to index logs, and live-tailing the logs that match a given filter is a common use-case, but I think we could greatly improve the user experience here. The current approach is to periodically run a query that sorts hits by descending @timestamp and use a couple tricks to make these requests run efficiently.
But this approach generally delivers messages out-of-order: it's likely that a request returns for the first time an event that is older than the most recent event returned by the previous request. This is mostly due to how we partition data into shards:
Would it be possible to build an API that, assuming that events get pushed to Elasticsearch in order, would be able to live-stream events in order as well?
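For reference, the polling approach described above might build a request body along these lines. This is a sketch under assumptions: `build_tail_request` is a hypothetical helper, and the optional range filter on @timestamp is only one plausible way to narrow each poll, not necessarily one of the tricks mentioned above. The body itself uses standard query DSL (`bool` filter, `range`, `sort`).

```python
from typing import Any, Dict, List, Optional


def build_tail_request(
    filter_query: Dict[str, Any], size: int = 100, since: Optional[str] = None
) -> Dict[str, Any]:
    """Build a _search body returning the newest matching events first."""
    filters: List[Dict[str, Any]] = [filter_query]
    if since is not None:
        # Assumption: restrict each poll to events newer than the last one seen.
        filters.append({"range": {"@timestamp": {"gt": since}}})
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {"bool": {"filter": filters}},
    }


body = build_tail_request(
    {"term": {"service.name": "checkout"}}, since="2020-01-01T00:00:00Z"
)
```

Note that the `gt` bound is exactly what makes this approach lossy: an older event that becomes visible after the poll falls below the bound and is never returned.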