polyfractal / sherlock

MIT License
119 stars 28 forks source link

Error caused by _cache property in OrFilter #69

Closed samsullivan closed 11 years ago

samsullivan commented 11 years ago

According to the docs (http://www.elasticsearch.org/guide/reference/query-dsl/or-filter/), an or filter must be wrapped in a filters array if you want to use the _cache property. Sherlock did not do this, throwing an exception.

Necessary format:

{
    "filtered" : {
        "query" : {
            "term" : { "name.first" : "shay" }
        },
        "filter" : {
            "or" : {
                "filters" : [
                    {
                        "term" : { "name.second" : "banon" }
                    },
                    {
                        "term" : { "name.nick" : "kimchy" }
                    }
                ],
                "_cache" : true
            }
        }
    }
}

Sherlock's format:

{
    "filtered" : {
        "query" : {
            "term" : { "name.first" : "shay" }
        },
        "filter" : {
            "or" : [
                {
                    "term" : { "name.second" : "banon" }
                },
                {
                    "term" : { "name.nick" : "kimchy" }
                }
            ],
            "_cache" : true
        }
    }
}
svscorp commented 11 years ago
"filter":{
    "or":{
             "filters":[
                           {"term":{"f1":"val","_cache":true}},
                           {"term":{"f2":"val","_cache":true}}
                        ]
            },
            ,"_cache":false
}

This is what I have. Everything is OK, except _cache in each term.

What tag are you using?

polyfractal commented 11 years ago

Hmm, how are you supplying the filters? Inline or as an array?

I'm guessing it is an edge case with the terrible "magic" argument guessing that I'm doing. If the first arg is a FilterInterface it will automatically reformat the query: https://github.com/polyfractal/sherlock/blob/master/src/Sherlock/components/filters/OrFilter.php#L51

However, that's a terrible way to do things, and ES will let you use the "extended" query structure all the time. I'm going to merge your change locally, then rip out the part that tries to guess your intentions.

0.2 can't come fast enough...issues like this should hopefully stop being a problem :(

samsullivan commented 11 years ago

@polyfractal, I am creating an array of QueryInterfaces with $array[] = Sherlock::queryBuilder()->Term(); and passing them to an or filter $filter = Sherlock::filterBuilder()->orFilter()->queries($array).

There's a chance I may have been passing an array of only 1 QueryInterface, though. I can check whether it makes a difference of 1 value vs. many later tonight.

polyfractal commented 11 years ago

Ok, cool. I'll look into why it was doing that (for future reference).

I just pushed your merge so everything should be working. Lemme know if it breaks :)

samsullivan commented 11 years ago

Updated to commit 27a533f, and it works without a flaw, thanks.

polyfractal commented 11 years ago

Awesome. Gonna close this :)

samsullivan commented 11 years ago

So I ran in to a very similar problem when using the Query filter. Not sure if it is something I am doing anything wrong. My code looks something like this:

$original_query = Sherlock::queryBuilder()->MatchAll();
$filters = array();

$words = explode(' ', $query);
foreach($words as $word) {
    $query = Sherlock::queryBuilder()->Wildcard()->field('name')->value($word . '*');
    $filters[] = array(
        'filter' => Sherlock::filterBuilder()->Query()->query($query),
        'boost' => 5,
    );
}

$score_query = Sherlock::queryBuilder()->CustomFiltersScore()->query($original_query)->filters($filters)->score_mode('multiply');

The (invalid) filters section of my CustomFilterScore looks like this:

"filters":[
    {
        "filter":{
            "query":{
                "wildcard":{
                    "name":{
                        "value":"john*",
                        "boost":1
                     }
                }
            },
            "_cache":false
        },
        "boost":5
    },
]

If I change it to this (inside the query object), it works:

"filters":[
    {
        "filter":{
            "query":{
                "wildcard":{
                    "name":{
                        "value":"john*",
                        "boost":1
                     }
                },
                "_cache":false
            }
        },
        "boost":5
    },
]

I'm not sure if I'm doing something wrong, or if it is implemented wrong, or what..but I updated filter/Query.php locally to make it work for me. If you want me to commit the change as a pull request I can. Here's the change:

// Old
public function toArray()
{
    $ret = array(
        'query'  => $this->params["query"]->toArray(),
        '_cache' => $this->params["_cache"],
    );

   return $ret;
}

// New
public function toArray()
{
    $ret = array(
        'query'  => $this->params["query"]->toArray(),
    );
    $ret['query']['cache'] = $this->params["_cache"];

   return $ret;
}
samsullivan commented 11 years ago

Actually that stopped working for me, I looked into it more and a Query filter needs a special format if using the cache parameter.

        "filter" : {
            "fquery" : {
                "query" : { 
                    "query_string" : { 
                        "query" : "this AND that OR thus"
                    }
                },
                "_cache" : true
            }
        }

The fquery needs to wrap the query and contain the _cache parameter. I will make a pull request with this shortly.

polyfractal commented 11 years ago

Sometimes the ES syntax makes me want to throw things.

Thanks for fixing this!

samsullivan commented 11 years ago

I agree, and their documentation is subpar. That's why I started using Sherlock. I was about to give up with building a complex query string by hand haha