alexshyba / SitecoreSearchContrib

Extension to Sitecore.Search namespace. Includes AdvancedDatabaseCrawler and Searcher. Make sure to check out the website for the project.
http://sitecorian.github.io/SitecoreSearchContrib
25 stars 21 forks source link

MustNot search parameters not working correctly #14

Closed motoyugota closed 11 years ago

motoyugota commented 11 years ago

I have a query where we are trying to search for items that have a partial match in "skus" (a field) and DO NOT match on a specific "type" (another field). The field search parameters are getting created like this:

        var typeSearchParam = new FieldSearchParam
        {
            FieldName = "Type",
            FieldValue = "b2bbd49ff1a04139870eda96480964da",
            Partial = false,
            Condition = QueryOccurrance.MustNot
        };

        var skuSearchParam = new FieldSearchParam
        {
            FieldName = "Skus",
            FieldValue = "0222-1",
            Partial = true,
            Condition = QueryOccurrance.Must
        };

When I step through and look at the query string that is being generated, I get this:

+(+_content:0222-1 +_database:web +_language:"en") -(-_database:web -_language:"en" +type:b2bbd49ff1a04139870eda96480964da) +(+_database:web +_language:"en" +skus:0222-1)

This is incorrect. The correct string, based on these parameters, should be:

+(+_content:0222-1 +_database:web +_language:"en") -(+_database:web +_language:"en" +type:b2bbd49ff1a04139870eda96480964da) +(+_database:web +_language:"en" +skus:0222-1)

The change is that the inner clauses in the second clause should be a "+", not a "-".

I believe the error lies in SearchParams.cs - the ProcessQuery method:

    public virtual BooleanQuery ProcessQuery(QueryOccurance occurance, Index index)
    {
        var innerQuery = new CombinedQuery();
        ApplyFullTextClause(innerQuery, FullTextQuery, occurance);
        ApplyRelationFilter(innerQuery, RelatedIds, occurance);
        ApplyTemplateFilter(innerQuery, TemplateIds, occurance);
        ApplyLocationFilter(innerQuery, LocationIds, occurance);
        AddFieldValueClause(innerQuery, BuiltinFields.Database, Database, occurance);

        if (innerQuery.Clauses.Count < 1)
            return null;

        var translator = new QueryTranslator(index);
        var booleanQuery = translator.ConvertCombinedQuery(innerQuery);

        ApplyLanguageClause(booleanQuery, Language, translator.GetOccur(occurance));

        return booleanQuery;
    }

All of the places where it is using the "occurance" parameter, I believe it should be hard-coded to QueryOccurrance.Must, at least when doing a Must or MustNot (not really sure about "Should" - I don't really get what that is for anyways). By passing in the occurrence and using it here, the inner query strings are all getting that value, which causes the "-" characters in the internal string, which, according to boolean logic, is going to do the opposite of what is intended.

When the FieldSearchParameter appends its clause to the outer clause, it does hard-code QueryOccurrance.Must, so that's why in my example above, the "type" sub-clause has the "+" after it.

Does this make sense? Or am I missing something that would cause the "MustNot" queries work correctly?

motoyugota commented 11 years ago

Oh, and as an aside, how many different ways is "occurrence" misspelled in this project? It has caused confusion multiple times here.

alexshyba commented 11 years ago

Thanks for the detailed description, it does look like a bug. Wondering how come I did not catch this before.

Regarding the spelling, you are absolutely correct, this is unforgivable. In my defense, I must say that it was "inherited" from the "Sitecore.Search.QueryOccurance" enum ;-)

alexshyba commented 11 years ago

Ok, So I've fixed the typos whenever I could and more importantly, fixed the issue with queries not being parsed properly, which resulted in MustNot occurrence not working. This was bundled in a major update. sitecorian/SitecoreSearchContrib@402fe99f21ae2d2186bc17f0889a8ba371adb563

motoyugota commented 11 years ago

Your changes made do not fix this issue. For example, if I am trying to verify that the field "discontinued" does not equal "true" (to get all active items), and I create a FieldSearchParam with FieldName of "discontinued" and FieldValue of "true" and set the condition to "MustNot", it generates the following:

-(-_database:web -discontinued:true)

This does NOT give me the results it should provide. What I need to get is the following:

+(+_database:web -discontinued:true)

The way the code is written right now, this is not possible. SearchParam.ProcessQuery is using the passed in condition on all of the subquery segments that it generates (template ids, locations, database, etc.). I don't know about other derived query types, but for a field query, these should all always be "Must" rather than using the condition on the parameter. If there are no subquery segments, the current code works fine, but with them, it does not work properly. If you make the following changes, it appears to solve the issues, but I have not fully tested it yet, so that's why I'm just posting it here.

QueryRunner.cs

replace lines 206-214 with:

var condition = parameter.Condition;

if (nestedQuery is BooleanQuery)
{
    var booleanQuery = nestedQuery as BooleanQuery;
    if (booleanQuery.Clauses().Count == 0)
    {
        continue;
    }
    else if (booleanQuery.Clauses().Count > 1)
    {
        condition = QueryOccurance.Must;
    }
}

query.Add(nestedQuery, translator.GetOccur(condition));

FieldSearchParam.cs

replace line 46 with:

var baseQuery = base.ProcessQuery(QueryOccurance.Must, index);

The change to FieldSearchParam forces the base query params to be Must, and then the QueryRunner changes again forces the outer boolean query to be a Must if there are multiple clauses, but the inner clauses retain their Must/MustNot/Should values as they should.

Again, I have not tested this against all situations, but I have tested this against every Must and MustNot combination we are currently making use of, and it now works correctly in all of those situations.