Closed jmartinm closed 7 years ago
Actually just realised that the issue is already there - https://github.com/inspirehep/inspire-next/issues/1792
@chris-asl Is this something you can have a look at?
While it's true that OR queries are broken, the query that it's trying to do here is the product of insufficient API design: we should never need to reparse queries that we generate internally!
According to this:
The clause (query) should appear in the matching document. If the bool query is in a query context and has a `must` or `filter` clause then a document will match the bool query even if none of the `should` queries match. In this case these clauses are only used to influence the score. If the `bool` query is a filter context or has neither `must` or `filter` then at least one of the `should` queries must match a document for it to match the bool query. This behavior may be explicitly controlled by settings the minimum_should_match parameter.
So, we currently have a `bool` query in a `query` context which has a `filter` clause, and we're falling into the case where a document will match the bool query even if none of the `should` queries match.
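To make the rule quoted above concrete, here is a minimal sketch in plain Python that models how many `should` clauses have to match by default. It is not Elasticsearch code: the function name `effective_minimum_should_match` and the `in_filter_context` flag are made up for illustration, and it only handles integer values of `minimum_should_match`.

```python
def effective_minimum_should_match(bool_query, in_filter_context=False):
    """Model the default from the quoted documentation.

    `bool_query` is the dict found under the "bool" key of a query body.
    Hypothetical helper for illustration only, not an Elasticsearch API.
    """
    # An explicit setting always wins over the defaults.
    if "minimum_should_match" in bool_query:
        return bool_query["minimum_should_match"]

    # Without any should clauses there is nothing to require.
    if not bool_query.get("should"):
        return 0

    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))

    # Filter context, or neither must nor filter: at least one should must match.
    if in_filter_context or not has_must_or_filter:
        return 1

    # Query context with must/filter: should clauses only influence the score.
    return 0


# The shape discussed in this thread: should + filter in a query context.
# The filter clause below is a placeholder, not the real generated filter.
problem_query = {
    "should": [
        {"match": {"control_number": "1497201"}},
        {"match": {"control_number": "1498589"}},
    ],
    "filter": [{"term": {"example_field": "example_value"}}],
}

should_only = {"should": problem_query["should"]}
```

Under this model `problem_query` requires zero `should` matches, so every document that passes the filter is returned, while `should_only` requires one.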
This means that when an ElasticSearch query object is created with a filter, we should be adding `"minimum_should_match": 1`, as @jmartinm also suggested here.
I will create a PR on `invenio-search`.
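A rough sketch of the idea, operating on a plain query-body dict: the helper name `with_minimum_should_match` and the filter clause are assumptions made for illustration, and this is not the code of the actual `invenio-search` change.

```python
import copy


def with_minimum_should_match(body, minimum=1):
    """Return a copy of `body` whose top-level bool query requires
    at least `minimum` should clauses to match.

    Hypothetical helper: it only touches a bool query that combines
    `should` with `must` or `filter`, and it keeps any explicit setting.
    """
    patched = copy.deepcopy(body)
    bool_query = patched.get("query", {}).get("bool")
    if (
        bool_query
        and bool_query.get("should")
        and (bool_query.get("filter") or bool_query.get("must"))
    ):
        # setdefault leaves an already present value untouched.
        bool_query.setdefault("minimum_should_match", minimum)
    return patched


# Illustrative body: two OR-ed clauses plus a placeholder filter.
body = {
    "query": {
        "bool": {
            "should": [
                {"match": {"control_number": "1497201"}},
                {"match": {"control_number": "1498589"}},
            ],
            "filter": [{"term": {"example_field": "example_value"}}],
        }
    }
}

patched = with_minimum_should_match(body)
```

The helper works on a deep copy so that the original body is left unchanged; whether the real fix mutates the query in place is not something this thread states.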
The issue is resolved since https://github.com/inveniosoftware/invenio-search/pull/105 has been merged upstream. Currently `title holography or bosons` returns three results, as it should.
Performing the following query:
control_number:1497201 OR control_number:1498589
generates a query which returns all results (instead of the expected 2).
I think the problem is due to the use of `should` together with `filter`:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html#query-dsl-bool-query