elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

[ES|QL] - Using a wildcard "*" on a non-existent index together with an existing index does not return an error from ES #104987

Open ninoslavmiskovic opened 7 months ago

ninoslavmiskovic commented 7 months ago

Elasticsearch Version

8.13

Installed Plugins

No response

Java Version

bundled

OS Version

MacOS - latest

Problem Description

When I use FROM with an index that exists together with a wildcard ("*") pattern that matches no existing index, Elasticsearch does not return an error, but Kibana does.

Example:

from kibana_sample_data_logs, noneexistingindex* | limit 10

I am attaching a recording to showcase the error.

Steps to Reproduce

See the recording to reproduce.

Basically, run this query:

from kibana_sample_data_logs, noneexistingindex* | limit 10

https://github.com/elastic/elasticsearch/assets/108192783/d818d2f3-0c9c-4da8-931a-0268f3f053a2
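For anyone who wants to check the Elasticsearch side without Kibana, here is a minimal sketch that sends the same query to the ES|QL `_query` endpoint. The cluster URL (`http://localhost:9200`, no authentication) and the helper names `build_esql_request` and `run_query` are assumptions for illustration, not part of the original report.

```python
import json
import urllib.error
import urllib.request

# Assumption: a local cluster without security; adjust the URL and add auth as needed.
ES_URL = "http://localhost:9200"

# The query from the report, unchanged.
QUERY = "from kibana_sample_data_logs, noneexistingindex* | limit 10"


def build_esql_request(query: str, base_url: str = ES_URL) -> urllib.request.Request:
    """Build a POST request for the ES|QL `_query` endpoint (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, base_url: str = ES_URL) -> tuple[int, str]:
    """Send the query and return (HTTP status, response body), including error responses."""
    request = build_esql_request(query, base_url)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still carry a JSON body describing the failure.
        return error.code, error.read().decode("utf-8")
```

Calling `run_query(QUERY)` against a cluster that has the sample data loaded shows whether Elasticsearch itself answers with rows or with an error for this combination.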

Logs (if relevant)

No response

elasticsearchmachine commented 7 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)

not-napoleon commented 7 months ago

The ask here, as I understand it, is for ES|QL to fail the entire query if an index wildcard matches no indices. That has the potential to break existing queries, so I've labeled this as a breaking change issue.

costin commented 7 months ago

The behavior can be customized by allowing the user to specify the preferred indices options. (Thanks to @astefan for pointing this out.) We should look into exposing this on the request side and/or the command itself.

dej611 commented 7 months ago

Documenting here all the current behaviours:

I do not mind whether it fails or fails silently as long as one index is available; the problem I see here is a lack of consistency. One way would be to expose the indices options (a setting mode within FROM?) or to align the behaviour, or maybe there's another idea here. As long as we find a single rule for them, it would be great.

timfrietas commented 7 months ago

I agree, consistency makes sense here. I would think that defaulting to an error for a non-existent index makes sense in any combination, regardless of expansion, but if that default is both expensive and inconsistent with defaults outside of ES|QL, I'm interested in the cost of the alternative(s).

astefan commented 6 months ago

Linking here the work in progress: https://github.com/elastic/elasticsearch/pull/106636

For the record, with this change a query that names a non-existent index (without a wildcard) would no longer generate an error. For example: FROM employees, nonexistent OPTIONS "ignore_unavailable" = "true" | limit 3 will succeed and return rows from employees.

For a query like FROM nonexistent1, nonexistent2 OPTIONS "ignore_unavailable" = "true" | limit 3 we will still return "unknown index error message". I've created https://github.com/elastic/elasticsearch/issues/106805 to investigate the option of returning an empty response in this case.
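To make the cases in this thread easier to compare, here is a toy model of the resolution rules as described in the comments above. It is a sketch, not Elasticsearch code: the names `resolve_from_targets` and `UnknownIndexError` and the error text are hypothetical, and the default of failing on a missing non-wildcard name is inferred from the comment rather than stated in it.

```python
from fnmatch import fnmatchcase


class UnknownIndexError(Exception):
    """Hypothetical stand-in for the "unknown index" error mentioned in the thread."""


def resolve_from_targets(targets, existing, ignore_unavailable=False):
    """Toy model: expand FROM targets against a set of existing index names."""
    resolved = []
    for target in targets:
        if "*" in target:
            # A wildcard that matches nothing contributes nothing and raises no error.
            resolved.extend(n for n in sorted(existing) if fnmatchcase(n, target))
        elif target in existing:
            resolved.append(target)
        elif not ignore_unavailable:
            # Assumed default: a missing concrete name fails the query.
            raise UnknownIndexError(f"unknown index [{target}]")
    if not resolved:
        # Even with ignore_unavailable, nothing left to query is still an error.
        raise UnknownIndexError(f"unknown index [{','.join(targets)}]")
    return resolved


existing = {"employees", "kibana_sample_data_logs"}

# The case from the report: the unmatched wildcard is silently dropped.
print(resolve_from_targets(["kibana_sample_data_logs", "noneexistingindex*"], existing))

# The case from this comment: the missing concrete name is ignored when asked to.
print(resolve_from_targets(["employees", "nonexistent"], existing, ignore_unavailable=True))
```

In this model, `FROM nonexistent1, nonexistent2` raises with or without `ignore_unavailable`, which is the case the linked follow-up issue proposes to revisit.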

astefan commented 5 months ago

Closing as fixed with https://github.com/elastic/elasticsearch/pull/106636. This adds options to a FROM query so that users can choose the desired behavior. Also, I don't think this is breaking anymore, since it doesn't change any defaults.

bpintea commented 4 months ago

Reopening after the feature was reverted in https://github.com/elastic/elasticsearch/pull/108692.

ioanatia commented 2 months ago

Have we thought about making ignore_unavailable and allow_no_indices parameters of the API rather than a language feature? For example a query parameter POST _query?allow_no_indices=true or part of the request body?

This has several advantages:

@astefan @bpintea wdyt? since we removed the OPTIONS feature, how do you think we should approach this?
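To illustrate the two shapes being proposed, here is a small sketch that builds the request either with URL query parameters or with fields in the request body. This only illustrates the proposal: neither form is confirmed here as something the ES|QL API accepts, and the helper names `as_query_params` and `as_body_fields` are hypothetical.

```python
import json
from urllib.parse import urlencode


def as_query_params(query, ignore_unavailable, allow_no_indices):
    """Proposed variant 1 (hypothetical): toggles as URL query parameters."""
    params = urlencode({
        "ignore_unavailable": str(ignore_unavailable).lower(),
        "allow_no_indices": str(allow_no_indices).lower(),
    })
    return f"/_query?{params}", json.dumps({"query": query})


def as_body_fields(query, ignore_unavailable, allow_no_indices):
    """Proposed variant 2 (hypothetical): toggles as request body fields."""
    body = {
        "query": query,
        "ignore_unavailable": ignore_unavailable,
        "allow_no_indices": allow_no_indices,
    }
    return "/_query", json.dumps(body)


path, body = as_query_params("FROM employees, nonexistent | limit 3", True, True)
print("POST", path)
print(body)
```

In both variants the ES|QL query text stays free of index-resolution settings, which is the separation the comment is asking about.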

bpintea commented 2 months ago

@bpintea wdyt? since we removed the OPTIONS feature, how do you think we should approach this?

@ioanatia, OPTIONS was removed not b/c of language considerations, but because of the underlying functionality it used. So it's not about how we expose these ignore_unavailable and allow_no_indices toggles (URL param or language feature), but what to use instead, which isn't yet defined. Tracked here.