Open jmartinm opened 8 years ago
Regarding the extra item: if we add "explain": true to the query, we get an explanation of the results, including the words that matched.
E.g., for the query photomultiplier banana:
GET hep/record/_search
{
"query": {"multi_match": {"query": "photomultiplier banana", "fields": ["title^3", "title.raw^10", "abstract^2", "abstract.raw^4", "author^10", "author.raw^15", "reportnumber^10", "eprint^10", "doi^10"], "zero_terms_query": "all"}},
"explain": true
}
The explanation of the first record match is:
"_explanation": {
"value": 0.18039167,
"description": "sum of:",
"details": [
{
"value": 0.18039167,
"description": "max of:",
"details": [
{
"value": 0.026151827,
"description": "product of:",
"details": [
{
"value": 0.052303653,
"description": "sum of:",
"details": [
{
"value": 0.052303653,
"description": "weight(abstract:photomultiplier in 9) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.052303653,
"description": "score(doc=9,freq=1.0), product of:",
"details": [
{
"value": 0.08483324,
"description": "queryWeight, product of:",
"details": [
{
"value": 9.864747,
"description": "idf(docFreq=2, maxDocs=21234)",
"details": []
},
{
"value": 0.008599637,
"description": "queryNorm",
"details": []
}
]
},
{
"value": 0.6165467,
"description": "fieldWeight in 9, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0",
"details": []
}
]
},
{
"value": 9.864747,
"description": "idf(docFreq=2, maxDocs=21234)",
"details": []
},
{
"value": 0.0625,
"description": "fieldNorm(doc=9)",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 0.5,
"description": "coord(1/2)",
"details": []
}
]
},
{
"value": 0.18039167,
"description": "product of:",
"details": [
{
"value": 0.36078334,
"description": "sum of:",
"details": [
{
"value": 0.36078334,
"description": "weight(title:photomultiplier in 9) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.36078334,
"description": "score(doc=9,freq=2.0), product of:",
"details": [
{
"value": 0.13248014,
"description": "queryWeight, product of:",
"details": [
{
"value": 10.270212,
"description": "idf(docFreq=1, maxDocs=21234)",
"details": []
},
{
"value": 0.012899456,
"description": "queryNorm",
"details": []
}
]
},
{
"value": 2.7233012,
"description": "fieldWeight in 9, product of:",
"details": [
{
"value": 1.4142135,
"description": "tf(freq=2.0), with freq of:",
"details": [
{
"value": 2,
"description": "termFreq=2.0",
"details": []
}
]
},
{
"value": 10.270212,
"description": "idf(docFreq=1, maxDocs=21234)",
"details": []
},
{
"value": 0.1875,
"description": "fieldNorm(doc=9)",
"details": []
}
]
}
]
}
]
}
]
},
{
"value": 0.5,
"description": "coord(1/2)",
"details": []
}
]
}
]
},
{
"value": 0,
"description": "match on required clause, product of:",
"details": [
{
"value": 0,
"description": "# clause",
"details": []
},
{
"value": 0.0042998185,
"description": "_type:record, product of:",
"details": [
{
"value": 1,
"description": "boost",
"details": []
},
{
"value": 0.0042998185,
"description": "queryNorm",
"details": []
}
]
}
]
}
]
}
},
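As a sanity check on how the explanation tree combines, here is a minimal Python sketch that recomputes the two per-field scores from the leaf values above. The helper name `term_score` is made up for illustration; the formula (queryWeight times fieldWeight, with tf as the square root of the term frequency) is read off the "product of:" nodes in the output and is not taken from the project's code.

```python
import math


def term_score(idf, query_norm, tf, field_norm):
    """Recombine one weight(...) node: queryWeight * fieldWeight."""
    query_weight = idf * query_norm        # "queryWeight, product of:"
    field_weight = tf * idf * field_norm   # "fieldWeight in 9, product of:"
    return query_weight * field_weight


# Leaf values copied from the explanation of doc 9.
abstract = term_score(idf=9.864747, query_norm=0.008599637,
                      tf=1.0, field_norm=0.0625)
title = term_score(idf=10.270212, query_norm=0.012899456,
                   tf=math.sqrt(2.0), field_norm=0.1875)

# Only one of the two query words matched, hence coord(1/2).
coord = 0.5

# "max of:" keeps the best-scoring field; the required _type clause adds 0.
final = max(abstract * coord, title * coord) + 0.0

print(abstract, title, final)
```

Up to float rounding, the three printed numbers should agree with the 0.052303653, 0.36078334 and 0.18039167 reported in the explanation.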
Looking at the "description": "weight(...)" entries, we can see which word matched and in which field.
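To make that concrete, here is a rough Python sketch that walks an `_explanation` tree and collects every field/term pair found in a `weight(...)` description. The function name `matched_terms` and the regular expression are assumptions for illustration, based only on the description format visible in the output above; other query types may produce differently shaped descriptions.

```python
import re

# Matches descriptions such as
# "weight(title:photomultiplier in 9) [PerFieldSimilarity], result of:"
WEIGHT_RE = re.compile(r"^weight\((?P<field>[^:]+):(?P<term>.+?) in \d+\)")


def matched_terms(explanation):
    """Return (field, term, value) for every weight(...) node in the tree."""
    matches = []
    found = WEIGHT_RE.match(explanation.get("description", ""))
    if found:
        matches.append(
            (found.group("field"), found.group("term"), explanation["value"])
        )
    # Recurse into nested "details" lists.
    for child in explanation.get("details", []):
        matches.extend(matched_terms(child))
    return matches


# Trimmed-down version of the explanation shown above.
sample = {
    "value": 0.18039167,
    "description": "max of:",
    "details": [
        {
            "value": 0.052303653,
            "description": "weight(abstract:photomultiplier in 9) "
                           "[PerFieldSimilarity], result of:",
            "details": [],
        },
        {
            "value": 0.36078334,
            "description": "weight(title:photomultiplier in 9) "
                           "[PerFieldSimilarity], result of:",
            "details": [],
        },
    ],
}

for field, term, value in matched_terms(sample):
    print(field, term, value)
```

For the sample this lists "photomultiplier" once for abstract and once for title; "banana" never shows up because it matched nothing.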
Don't forget the transparency! Esp. if foo bar baz behaves differently from key:foo key1:foo1. Btw: I noticed only yesterday that on labs-holdingpen "Need action" type:arXiv uri:astro* uri:physics* is in fact "Need action" and (type:arXiv or uri:astro* or uri:physics*). It's what you intuitively want, but you might not expect this.
This is mostly done. @chris-asl can confirm.
After the agreement in https://github.com/inspirehep/inspire-next/issues/609 google-style syntax should be implemented.
foo bar baz should match all the documents that have at least one of the words (hence an Invenio foo or bar or baz), and rank the results by how much the document matches the query(!), hence documents having all 3 words will be ranked higher, and so on.
[...] invenio-query-parser and, if a given query resulted in something not involving any special keyword, to simply pass it over to Elasticsearch as a flat query against a combination of all the important fields ([...] "operator": "and"?)
Extra
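A minimal sketch of what such a flat query body could look like, written in Python. The field list and boosts are copied from the multi_match example earlier in this thread; the helper name `flat_query` and its parameters are hypothetical and not part of invenio-query-parser. The `operator` argument covers the open question of "or" (match any word, rank by how many match) versus "and".

```python
# Fields and boosts as used in the multi_match example in this thread.
FIELDS = [
    "title^3", "title.raw^10",
    "abstract^2", "abstract.raw^4",
    "author^10", "author.raw^15",
    "reportnumber^10", "eprint^10", "doi^10",
]


def flat_query(text, operator="or", explain=False):
    """Build a search body for a keyword-free query (hypothetical helper)."""
    body = {
        "query": {
            "multi_match": {
                "query": text,
                "fields": FIELDS,
                "operator": operator,        # "or" = at least one word matches
                "zero_terms_query": "all",
            }
        }
    }
    if explain:
        body["explain"] = True  # ask Elasticsearch for the scoring breakdown
    return body


body = flat_query("foo bar baz", explain=True)
print(body["query"]["multi_match"]["operator"])
```

With the default "or", documents containing all three words should still rank above partial matches, since each matching word adds to the score.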