twagoo closed 6 years ago
Solr "debugQuery" output:
German OR Dutch
"parsedquery":"+(DisjunctionMaxQuery(((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1)) DisjunctionMaxQuery(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1)))~2 (+DisjunctionMaxQuery(((name:\"german dutch\")^2.0 | description:\"german dutch\"))) (+SolrRangeQuery(name:{* TO *})^2.0) (+SolrRangeQuery(description:{* TO *})) (SolrRangeQuery(_hasPart:{* TO *}) SolrRangeQuery(_resourceRef:{* TO *})) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) FunctionQuery(double(_hasPartCountWeight))^0.2 FunctionQuery(rord(_daysSinceLastSeen))^0.05 FunctionQuery(rord(_hierarchyWeight))",
"parsedquery_toString":"+((((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1) ((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1))~2) (+((name:\"german dutch\")^2.0 | description:\"german dutch\")) (+(name:{* TO *})^2.0) (+description:{* TO *}) (_hasPart:{* TO *} _resourceRef:{* TO *}) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) (double(_hasPartCountWeight))^0.2 (rord(_daysSinceLastSeen))^0.05 rord(_hierarchyWeight)",
(German OR Dutch)
"parsedquery":"+(+(DisjunctionMaxQuery(((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1)) DisjunctionMaxQuery(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1)))) (+DisjunctionMaxQuery(((name:\"german dutch\")^2.0 | description:\"german dutch\"))) (+SolrRangeQuery(name:{* TO *})^2.0) (+SolrRangeQuery(description:{* TO *})) (SolrRangeQuery(_hasPart:{* TO *}) SolrRangeQuery(_resourceRef:{* TO *})) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) FunctionQuery(double(_hasPartCountWeight))^0.2 FunctionQuery(rord(_daysSinceLastSeen))^0.05 FunctionQuery(rord(_hierarchyWeight))",
"parsedquery_toString":"+(+(((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1) ((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1))) (+((name:\"german dutch\")^2.0 | description:\"german dutch\")) (+(name:{* TO *})^2.0) (+description:{* TO *}) (_hasPart:{* TO *} _resourceRef:{* TO *}) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) (double(_hasPartCountWeight))^0.2 (rord(_daysSinceLastSeen))^0.05 rord(_hierarchyWeight)",
The solution appears to be to remove the mm parameter in solrconfig.xml:
<str name="mm">100%</str>
Just having <str name="q.op">AND</str> induces the desired behaviour, while setting mm this way apparently forces a default "all must match" behaviour that can only be overridden by grouping the full clause.
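To illustrate why a minimum-should-match of 100% makes an explicit OR behave like AND, here is a toy model in Python. It is only a sketch of the semantics visible in the debug output above (two optional clauses with a ~2 suffix), not Solr's implementation; the function names and the sample documents are made up for illustration, and it does not model how edismax picks a default when mm is absent.

```python
def required_matches(optional_clauses: int, mm_percent: int) -> int:
    """Optional clauses that must match for a percentage mm (rounded down)."""
    return optional_clauses * mm_percent // 100


def matching_docs(docs, terms, min_should_match=0):
    """Documents that contain at least `min_should_match` of the terms.

    A query made only of optional clauses still needs one hit, so the
    effective minimum is never below 1.
    """
    needed = max(min_should_match, 1)
    return [doc for doc in docs if sum(t in doc for t in terms) >= needed]


# Hypothetical documents, each reduced to a set of lowercased terms.
docs = [{"dutch"}, {"german"}, {"dutch", "german"}, {"french"}]
terms = ["dutch", "german"]

# mm=100% over two optional clauses: both must match, i.e. effectively AND.
with_mm = matching_docs(docs, terms, required_matches(len(terms), 100))

# No minimum imposed: any one clause is enough, i.e. a real OR.
without_mm = matching_docs(docs, terms)

print(len(with_mm), len(without_mm))  # prints "1 3"
```

In this model the grouped query escapes the restriction because the whole parenthesised group counts as a single clause, so the minimum applies to the group and not to the two terms inside it.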
This article seems to have some helpful background information, mostly based on this SOLR issue thread.
After patching alpha as described above, the following yield the same result count:
Dutch OR German:
"parsedquery":"+(DisjunctionMaxQuery(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1)) DisjunctionMaxQuery(((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1))) (+DisjunctionMaxQuery(((name:\"dutch german\")^2.0 | description:\"dutch german\"))) (+SolrRangeQuery(name:{* TO *})^2.0) (+SolrRangeQuery(description:{* TO *})) (SolrRangeQuery(_hasPart:{* TO *}) SolrRangeQuery(_resourceRef:{* TO *})) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) FunctionQuery(double(_hasPartCountWeight))^0.2 FunctionQuery(rord(_daysSinceLastSeen))^0.05 FunctionQuery(rord(_hierarchyWeight))",
"parsedquery_toString":"+(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1) ((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1)) (+((name:\"dutch german\")^2.0 | description:\"dutch german\")) (+(name:{* TO *})^2.0) (+description:{* TO *}) (_hasPart:{* TO *} _resourceRef:{* TO *}) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) (double(_hasPartCountWeight))^0.2 (rord(_daysSinceLastSeen))^0.05 rord(_hierarchyWeight)",
(Dutch OR German):
"parsedquery":"+(+(DisjunctionMaxQuery(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1)) DisjunctionMaxQuery(((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1)))) (+DisjunctionMaxQuery(((name:\"dutch german\")^2.0 | description:\"dutch german\"))) (+SolrRangeQuery(name:{* TO *})^2.0) (+SolrRangeQuery(description:{* TO *})) (SolrRangeQuery(_hasPart:{* TO *}) SolrRangeQuery(_resourceRef:{* TO *})) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) FunctionQuery(double(_hasPartCountWeight))^0.2 FunctionQuery(rord(_daysSinceLastSeen))^0.05 FunctionQuery(rord(_hierarchyWeight))",
"parsedquery_toString":"+(+(((continent:Dutch)^0.5 | (country:Dutch)^2.0 | modality:Dutch | (keywords:Dutch)^2.0 | (subject:Dutch)^2.0 | (description:dutch)^4.0 | (organisation:Dutch)^2.0 | collection:Dutch | (_languageName:dutch)^2.0 | (name:dutch)^8.0 | genre:Dutch | text:dutch | (id:Dutch)^0.1) ((continent:German)^0.5 | (country:German)^2.0 | modality:German | (keywords:German)^2.0 | (subject:German)^2.0 | (description:german)^4.0 | (organisation:German)^2.0 | collection:German | (_languageName:german)^2.0 | (name:german)^8.0 | genre:German | text:german | (id:German)^0.1))) (+((name:\"dutch german\")^2.0 | description:\"dutch german\")) (+(name:{* TO *})^2.0) (+description:{* TO *}) (_hasPart:{* TO *} _resourceRef:{* TO *}) (+(+availability:PUB -availability:ACA -availability:RES -availability:UNSPECIFIED)^0.5) (+(+availability:ACA -availability:PUB -availability:RES -availability:UNSPECIFIED)^0.2) (double(_hasPartCountWeight))^0.2 (rord(_daysSinceLastSeen))^0.05 rord(_hierarchyWeight)",
Note: this is a good use case for #125
Tests on alpha and production confirm that the issue has been fixed. It will be included in VLO 4.3.3 (RC1 is already tagged and deployed to alpha).
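A check like the one described could be scripted roughly as follows. This is a minimal sketch, assuming the Solr core is reachable over HTTP and that the request handler defaults come from solrconfig.xml; the base URL in the usage note is a placeholder, and the helper names are made up. It reads numFound and parsedquery_toString from the JSON response and looks for the ")~N" suffix that appears in the pre-fix debug output.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Matches a minimum-should-match suffix such as ")~2" in the parsed query.
MSM_SUFFIX = re.compile(r"\)~(\d+)")


def select_url(base_url: str, query: str) -> str:
    """Build a select URL that returns no rows, only the count and debug info."""
    params = {"q": query, "rows": 0, "wt": "json", "debugQuery": "true"}
    return f"{base_url}/select?{urlencode(params)}"


def min_should_match(parsed_query: str):
    """Return N from the first ")~N" suffix, or None if there is none."""
    found = MSM_SUFFIX.search(parsed_query)
    return int(found.group(1)) if found else None


def probe(base_url: str, query: str):
    """Return (numFound, minimum-should-match or None) for one query."""
    with urlopen(select_url(base_url, query)) as response:
        body = json.load(response)
    parsed = body["debug"]["parsedquery_toString"]
    return body["response"]["numFound"], min_should_match(parsed)


def same_count(base_url: str, first: str, second: str) -> bool:
    """True if both queries report the same number of results."""
    return probe(base_url, first)[0] == probe(base_url, second)[0]
```

For example, same_count("http://localhost:8983/solr/<core>", "Dutch OR German", "(Dutch OR German)") would be expected to return True once the mm parameter has been removed, with <core> replaced by the actual core name.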
Compare:
(Dutch OR German): 1,002,722 results at time of writing
Dutch OR German: 802 results at time of writing
(Note: there is also Dutch or German, which takes 'or' as a phrase and currently results in 394 records)
The OR operator should always be supported and not just if the parentheses are present. Investigate why and fix in the Solr configuration if possible!