Open wangchao732 opened 1 month ago
Pinging @elastic/es-search (Team:Search)
Could you provide text about what's happening, what you expect to happen, and how to reproduce the problem?
For example, when I search today's error logs, I can find both of the entries below in Kibana, but my query only matches one of them.
"2024-07-17 10:48:00.908 ERROR 1 --- [io-8080-exec-71] u.t.b.d.config.GlobalExceptionHandler : JSON parse error: Cannot deserialize value of type java.lang.Double
from String \"未上报\": not a valid Double
value; nested exception is com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type java.lang.Double
from String \"未上报\": not a valid Double
value"
"2024-07-17 08:19:37.157 ERROR 1 --- [TB-Scheduling-1] o.t.server.dao.service.DataValidator : Asset object is invalid: [Asset is referencing to non-existent tenant!]"
Why does excluding "object is invalid" cause both of these entries to drop out of the results, including the one that does not contain that text?
@wangchao732 You can use the explain API to get a detailed explanation of why a particular document does or does not match. This should help you debug.
Additionally, to see the fully rewritten Lucene query that is being executed, you can pass your query to the validate API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-validate.html
With the parameter rewrite: true, it will return the fully rewritten query.
These two APIs should get you much more information about why a query matches a particular document.
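As a rough sketch of the validate call suggested above, the snippet below only builds the request URL and JSON body; it does not contact a cluster. The host `https://localhost:9200` and the helper name `build_validate_request` are placeholders made up for illustration. The `_validate/query` endpoint and its `rewrite` parameter are the ones described on the linked documentation page.

```python
import json
from urllib.parse import quote, urlencode


def build_validate_request(base_url, index, query):
    """Return (url, body) for a validate request with rewrite enabled.

    Hypothetical helper: it only assembles strings and sends nothing.
    """
    path = f"{base_url.rstrip('/')}/{quote(index)}/_validate/query"
    url = f"{path}?{urlencode({'rewrite': 'true'})}"
    body = json.dumps({"query": query})
    return url, body


# The clause under suspicion, taken from the query in this thread.
query = {"match": {"message": "object is invalid"}}

# Placeholder host; substitute your own cluster address and credentials.
url, body = build_validate_request(
    "https://localhost:9200", "k8slog-2024.07.17", query
)
print(url)
print(body)
```

Sending that body to that URL (for example with curl and the same `-u` and `--insecure` options used elsewhere in this thread) should return the rewritten query in the response's explanations.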
Asset object
curl -XGET -H 'Content-Type: application/json' -u xxx "https://xxx:9200/k8slog-2024.07.17/_explain/AF8hvpABVEXVlDRloT9m?pretty" --insecure -d '{"query": { "bool" : { "must": [{"match": {"fields.namespace": "dapr-application" }},{"match": { "message": "ERROR" }}], "must_not": [ {"match": {"message": "WARN"}},{"match": {"message": "DEBUG"}},{"match": {"message": "INFO"}},{"match": {"message": "object is invalid"}},{"match": {"message": "adThread"}}]} }}'
{ "_index" : "k8slog-2024.07.17", "_id" : "AF8hvpABVEXVlDRloT9m", "matched" : false, "explanation" : { "value" : 0.0, "description" : "Failure to meet condition(s) of required/prohibited clause(s)", "details" : [ { "value" : 0.25241855, "description" : "sum of:", "details" : [ { "value" : 0.1262083, "description" : "weight(fields.namespace:dapr in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.1262083, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.12774244, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 11779941, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 13385079, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.44908655, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 2.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 1.9422878, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] }, { "value" : 0.12621024, "description" : "weight(fields.namespace:application in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.12621024, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.12774439, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 11779918, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 
13385079, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.44908655, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 2.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 1.9422878, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] } ] }, { "value" : 7.267964, "description" : "weight(message:error in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 7.267964, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 5.1235533, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 78753, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 13225155, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.64479077, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 60.0, "description" : "dl, length of field (approximate)", "details" : [ ] }, { "value" : 215.23318, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] }, { "value" : 0.0, "description" : "match on prohibited clause (message:object message:is message:invalid)", "details" : [ { "value" : 1.0, 
"description" : "message:object message:is message:invalid", "details" : [ ] } ] } ] } }
curl -XGET -H 'Content-Type: application/json' -u xxx "https://xxx/k8slog-2024.07.17/_explain/AF8hvpABVEXVlDRloT9m?pretty" --insecure -d '{"query": { "bool" : { "must": [{"match": {"fields.namespace": "dapr-application" }},{"match": { "message": "ERROR" }}], "must_not": [ {"match": {"message": "WARN"}},{"match": {"message": "DEBUG"}},{"match": {"message": "INFO"}},{"match": {"message": "Asset object"}},{"match": {"message": "adThread"}}]} }}'
{ "_index" : "k8slog-2024.07.17", "_id" : "AF8hvpABVEXVlDRloT9m", "matched" : true, "explanation" : { "value" : 7.5192084, "description" : "sum of:", "details" : [ { "value" : 0.25261974, "description" : "sum of:", "details" : [ { "value" : 0.12630892, "description" : "weight(fields.namespace:dapr in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.12630892, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.12784426, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 11791710, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 13399816, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.44908655, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 2.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 1.9422879, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] }, { "value" : 0.12631084, "description" : "weight(fields.namespace:application in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 0.12631084, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 0.12784621, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 11791687, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 13399816, "description" : "N, total number of 
documents with field", "details" : [ ] } ] }, { "value" : 0.44908655, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 2.0, "description" : "dl, length of field", "details" : [ ] }, { "value" : 1.9422879, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] } ] }, { "value" : 7.2665887, "description" : "weight(message:error in 3921793) [PerFieldSimilarity], result of:", "details" : [ { "value" : 7.2665887, "description" : "score(freq=1.0), computed as boost idf tf from:", "details" : [ { "value" : 2.2, "description" : "boost", "details" : [ ] }, { "value" : 5.122982, "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:", "details" : [ { "value" : 78885, "description" : "n, number of documents containing term", "details" : [ ] }, { "value" : 13239758, "description" : "N, total number of documents with field", "details" : [ ] } ] }, { "value" : 0.6447407, "description" : "tf, computed as freq / (freq + k1 (1 - b + b dl / avgdl)) from:", "details" : [ { "value" : 1.0, "description" : "freq, occurrences of term within document", "details" : [ ] }, { "value" : 1.2, "description" : "k1, term saturation parameter", "details" : [ ] }, { "value" : 0.75, "description" : "b, length normalization parameter", "details" : [ ] }, { "value" : 60.0, "description" : "dl, length of field (approximate)", "details" : [ ] }, { "value" : 215.12988, "description" : "avgdl, average length of field", "details" : [ ] } ] } ] } ] } ] } }
Sorry, I can't make sense of the explain output.
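The explain response is a tree of `value` / `description` / `details` nodes, so one way to make it readable is to print only the top few levels. The sketch below is a hypothetical helper, not part of Elasticsearch; the sample response is abbreviated by hand from the first output in this thread, with the scoring details trimmed.

```python
def summarize(node, depth=0, max_depth=2):
    """Return indented 'value: description' lines for an explanation tree."""
    lines = ["  " * depth + f"{node['value']}: {node['description']}"]
    if depth < max_depth:
        for child in node.get("details", []):
            lines.extend(summarize(child, depth + 1, max_depth))
    return lines


# Abbreviated from the first explain response above (details trimmed).
response = {
    "matched": False,
    "explanation": {
        "value": 0.0,
        "description": "Failure to meet condition(s) of required/prohibited clause(s)",
        "details": [
            {
                "value": 7.267964,
                "description": "weight(message:error in 3921793) [PerFieldSimilarity], result of:",
                "details": [],
            },
            {
                "value": 0.0,
                "description": "match on prohibited clause (message:object message:is message:invalid)",
                "details": [
                    {
                        "value": 1.0,
                        "description": "message:object message:is message:invalid",
                        "details": [],
                    }
                ],
            },
        ],
    },
}

lines = summarize(response["explanation"])
print("\n".join(lines))
```

Read this way, the key line is "match on prohibited clause": the document was dropped because the must_not clause built from "object is invalid" matched it.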
@wangchao732 you changed which doc you were matching between values. I cannot easily follow your concerns. Mixing images & poorly formatted text makes all this unnecessarily difficult. I am guessing you want to know why a single doc doesn't match a single query?
What is the body of that doc that doesn't match the single query?
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "fields.namespace": "dapr-application"
          }
        },
        {
          "match": {
            "message": "ERROR"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "message": "WARN"
          }
        },
        {
          "match": {
            "message": "DEBUG"
          }
        },
        {
          "match": {
            "message": "INFO"
          }
        },
        {
          "match": {
            "message": "Asset object"
          }
        },
        {
          "match": {
            "message": "adThread"
          }
        }
      ]
    }
  }
}
You can run the validate API with rewrite on your query to see the fully rewritten Lucene query that would be run (with text analysis and everything). This will show you what is executed and hopefully show you what's happening.
Please do that. "match": {"message": "object is invalid"} is likely being rewritten to three term queries, separated by an OR. But it depends on the analyzer, etc. being used.
As for the explain output, it shows that the first doc didn't match because it was excluded. The second doc did.
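To illustrate the OR behavior described above, here is a toy model in plain Python. It is not Elasticsearch: `analyze` is a simplified stand-in for an analyzer (lowercase, split on non-word characters), and it assumes the `message` field is analyzed roughly that way, which the lowercased terms in the explain output suggest but do not prove. The first log line is abbreviated from the one quoted in the report, and whether it is the document with the ID used in the explain calls is also an assumption.

```python
import re


def analyze(text):
    """Simplified analyzer stand-in: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def match_or(query, doc):
    """Model of a match query with OR semantics: any query term in the doc."""
    doc_terms = set(analyze(doc))
    return any(term in doc_terms for term in analyze(query))


def match_phrase(query, doc):
    """Model of a phrase match: query terms appear consecutively, in order."""
    q, d = analyze(query), analyze(doc)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Abbreviated versions of the two log lines quoted in the report.
json_error_log = (
    "ERROR 1 --- [io-8080-exec-71] GlobalExceptionHandler : JSON parse error: "
    "Cannot deserialize value of type java.lang.Double: not a valid Double value; "
    "nested exception is com.fasterxml.jackson.databind.exc.InvalidFormatException"
)
asset_log = (
    "ERROR 1 --- [TB-Scheduling-1] DataValidator : Asset object is invalid: "
    "[Asset is referencing to non-existent tenant!]"
)

print(match_or("object is invalid", json_error_log))      # matches via the term "is"
print(match_or("Asset object", json_error_log))           # no shared terms
print(match_phrase("object is invalid", json_error_log))  # phrase is absent
print(match_phrase("object is invalid", asset_log))       # phrase is present
```

Under these assumptions, the JSON parse error line is caught by the "object is invalid" clause only because it contains the word "is" (in "nested exception is"), while "Asset object" shares no terms with it.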
What is the body of that doc that doesn't match the single query?
Yes. I'm using a bool query to match documents whose message contains ERROR but does not contain "object is invalid". The message contained ERROR and did not contain "object is invalid", yet it was excluded. When I changed the query to match messages containing ERROR but not "Asset object", I got the desired result.
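If the goal is to exclude only messages containing the exact phrase, one possible approach (not confirmed anywhere in this thread) is to use a `match_phrase` clause in `must_not` instead of `match`. The sketch below builds such a request body; the helper name `exclude_phrase_query` is made up for illustration, and the other `must_not` clauses from the original query are left out for brevity.

```python
import json


def exclude_phrase_query(namespace, phrase):
    """Build a bool query body that excludes messages containing the phrase.

    Hypothetical helper; it only constructs the JSON body.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"fields.namespace": namespace}},
                    {"match": {"message": "ERROR"}},
                ],
                "must_not": [
                    # match_phrase requires the terms to be adjacent and in
                    # order, unlike match, which defaults to OR between terms.
                    {"match_phrase": {"message": phrase}},
                ],
            }
        }
    }


body = exclude_phrase_query("dapr-application", "object is invalid")
print(json.dumps(body, indent=2))
```

An alternative with similar intent is to keep `match` and set its `operator` to `and`, which requires all terms to be present but not necessarily adjacent. Either way, running the result through the explain or validate API is a reasonable check before relying on it.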
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Elasticsearch Version
8.14
Installed Plugins
No response
Java Version
1.8.0_191
OS Version
Linux CentOS 7.9
Problem Description
Steps to Reproduce
null
Logs (if relevant)
No response