solariumphp / solarium

PHP Solr client library

Queries without keywords #28

Closed ebeyrent closed 13 years ago

ebeyrent commented 13 years ago

With the 2.0 RC1 release, I am trying to perform a query based on a field, and I only get results if I have set keywords. This does not produce results:

<?php
$query = $solr->createSelect(); 
$filterQuery = $query->createFilterQuery();
$filterQuery->setKey('lesson_id')
   ->addTag('lesson_id')
   ->setQuery('is_lesson_id:'.$recipe_id);
$query->addFilterQuery($filterQuery);
$result = $solr->select($query);
?>

This does produce results:

<?php
$query = $solr->createSelect(); 
$query->setQuery('penne');
$filterQuery = $query->createFilterQuery();
$filterQuery->setKey('lesson_id')
  ->addTag('lesson_id')
  ->setQuery('is_lesson_id:'.$recipe_id);
$query->addFilterQuery($filterQuery);
$result = $solr->select($query);
?>

What am I doing wrong?

basdenooijer commented 13 years ago

I just tried to reproduce this, but for me a filterquery works just fine without a query string in the main query.

Are you getting an exception, or just no results? In the first case, what exception? In the second case, can you see what query is executed in the Solr log?

ebeyrent commented 13 years ago

No exceptions, no results. Works:

webapp=/solr path=/select params={fl=*,score&start=0&q=penne&wt=json&fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=10} hits=1 status=0 QTime=14

Doesn't work:

webapp=/solr path=/select params={fl=,score&start=0&q=:*&wt=json&fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=10} hits=0 status=0 QTime=1

basdenooijer commented 13 years ago

somehow the default values for 'fl' and 'query' are wrong. 'fl' is empty, while this normally should be ',fields' and 'q' has a value of ':' while this should be ':'

Do you use a special config or setup code for solarium?

basdenooijer commented 13 years ago

Hmm, the markup messed up my reply. Anyway, the first characters of the fl and q params seem to be missing. Both should be an asterisk.

ebeyrent commented 13 years ago

No -

<?php
$config = array(
  'adapteroptions' => array(
    'host' => LocalVars::get('SOLR_HOST'),
    'port' => LocalVars::get('SOLR_PORT'),
    'path' => '/solr/',
    'core' => 'my_core',
  )
);
$solr = new Solarium_Client($config);
?>

basdenooijer commented 13 years ago

All the code seems similar to example 2.3, and that works (at least for me).

What PHP and Solr versions are you using? Maybe that makes a difference.

ebeyrent commented 13 years ago

I am using PHP 5.2.17, and Solr 1.4.

basdenooijer commented 13 years ago

I've just been testing with PHP 5.3.5 and 5.2.11 on both solr 3.2 and 1.4, all results seem ok.

Are the Solr log entries really exactly as they read on this page, or did they get modified by the same github markup that messed up one of my previous posts?

ebeyrent commented 13 years ago

Perhaps. Here it is again:

Jun 16, 2011 9:39:29 AM org.apache.solr.core.SolrCore execute
INFO: [my_core] webapp=/solr path=/select params={mlt.mindf=1&fl=*,score&  
mlt.fl=title,ts_main_ingredient,sm_term_names,sm_themes&start=0&q=*:*&mlt.mintf=1&mlt=true&wt=json&
fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=4} hits=0 status=0 QTime=1

ebeyrent commented 13 years ago

And here it is with the keyword:

Jun 16, 2011 9:41:13 AM org.apache.solr.core.SolrCore execute
INFO: [my_core] webapp=/solr path=/select params={mlt.mindf=1&fl=*,score&
mlt.fl=title,ts_main_ingredient,sm_term_names,sm_themes&start=0&q=penne&mlt.mintf=1&mlt=true&wt=json&
fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=4} hits=1 status=0 QTime=15

basdenooijer commented 13 years ago

Thanks, now all query characters show up. If I compare the query string (the parts between curly braces) they are exactly the same, except for the q param. One has the keyword, and the other one a 'select all' query. Seems ok.

mlt.mindf=1&fl=*,score&mlt.fl=title,ts_main_ingredient,sm_term_names,sm_themes&start=0&q=*:*&mlt.mintf=1&mlt=true&wt=json&fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=4
mlt.mindf=1&fl=*,score&mlt.fl=title,ts_main_ingredient,sm_term_names,sm_themes&start=0&q=penne&mlt.mintf=1&mlt=true&wt=json&fq={!tag%3Dlesson_id}is_lesson_id:69459&rows=4

But it's strange that a less restrictive query (select all) returns no results, where a keyword search does return results. Maybe you can try to disable MLT as a test. Or execute both query strings manually in a browser. Maybe that gets some interesting results.
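
One way to run both requests by hand is to build the two URLs from a shared parameter set, so that only `q` differs. The following is a minimal sketch in plain PHP, not part of Solarium; `buildSelectUrl` is a hypothetical helper, and the base URL is an assumption that needs to be replaced with the real host, port and core. MLT is left out, in line with the suggestion to disable it as a test.

```php
<?php
// Hypothetical helper: build a Solr select URL from a parameter array.
function buildSelectUrl($baseUrl, array $params)
{
    // http_build_query URL-encodes the keys and values.
    return rtrim($baseUrl, '/') . '/select?' . http_build_query($params, '', '&');
}

// Assumed location; substitute your own host, port and core.
$base = 'http://localhost:8983/solr/my_core';

// Parameters shared by both requests (taken from the log entries above, without MLT).
$common = array(
    'start' => 0,
    'rows'  => 4,
    'fl'    => '*,score',
    'wt'    => 'json',
    'fq'    => '{!tag=lesson_id}is_lesson_id:69459',
);

// Only the q parameter differs between the two requests.
$selectAllUrl = buildSelectUrl($base, array('q' => '*:*') + $common);
$keywordUrl   = buildSelectUrl($base, array('q' => 'penne') + $common);

echo $selectAllUrl, "\n", $keywordUrl, "\n";
```

Pasting each printed URL into a browser shows the raw Solr response for that request, independent of the client library.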

ebeyrent commented 13 years ago

Could it have something to do with dismax? I notice that when I dump the query object, dismax is one of the components. Is there a way to configure this to use the standard query type?

The current query string fails, which looks like this:

select?q=*:*&start=0&rows=4&fl=*,score&wt=json&fq={!tag=lesson_id}is_lesson_id:69459

However, if I set the qt parameter, I get what I expect:

select?q=*:*&start=0&rows=4&fl=*,score&wt=json&fq={!tag=lesson_id}is_lesson_id:69459&qt=standard

basdenooijer commented 13 years ago

Dismax should not be part of your query object, unless you manually add it. It's an optional component that is only loaded on demand.

The wt param in your query string sets Solr's response writer to standard, which is XML. Solarium requires a JSON response, so it sets wt to json. If you test in a browser, the standard response writer is probably easier to read, but it will not work with Solarium unless you do your own result parsing.

The response writer is not controlled in any way by dismax or any other component.

ebeyrent commented 13 years ago

"qt", not "wt"

ebeyrent commented 13 years ago

Here's what the dump of the query object looks like. I am not doing anything different than what I posted above:

Solarium_Query_Select Object
(
    [_options:protected] => Array
        (
            [handler] => select
            [resultclass] => Solarium_Result_Select
            [documentclass] => Solarium_Document_ReadOnly
            [query] => *:*
            [start] => 0
            [rows] => 4
            [fields] => *,score
        )

    [_componentTypes:protected] => Array
        (
            [facetset] => Array
                (
                    [component] => Solarium_Query_Select_Component_FacetSet
                    [requestbuilder] => Solarium_Client_RequestBuilder_Select_Component_FacetSet
                    [responseparser] => Solarium_Client_ResponseParser_Select_Component_FacetSet
                )

            [dismax] => Array
                (
                    [component] => Solarium_Query_Select_Component_DisMax
                    [requestbuilder] => Solarium_Client_RequestBuilder_Select_Component_DisMax
                    [responseparser] => 
                )

             [morelikethis] => Array
                (
                    [component] => Solarium_Query_Select_Component_MoreLikeThis
                    [requestbuilder] => Solarium_Client_RequestBuilder_Select_Component_MoreLikeThis
                    [responseparser] => Solarium_Client_ResponseParser_Select_Component_MoreLikeThis
                )

            [highlighting] => Array
                (
                    [component] => Solarium_Query_Select_Component_Highlighting
                    [requestbuilder] => Solarium_Client_RequestBuilder_Select_Component_Highlighting
                    [responseparser] => Solarium_Client_ResponseParser_Select_Component_Highlighting
                )

        )

    [_fields:protected] => Array
        (
            [*] => 1
            [score] => 1
        )

    [_sorts:protected] => Array
        (
        )

    [_filterQueries:protected] => Array
        (
            [lesson_id] => Solarium_Query_Select_FilterQuery Object
                (
                    [_tags:protected] => Array
                        (
                            [lesson_id] => 1
                        )

                    [_query:protected] => is_lesson_id:69459
                    [_options:protected] => Array
                        (
                            [key] => lesson_id
                        )

                )

        )

    [_components:protected] => Array
        (
        )

    [_helper:protected] => 
)

basdenooijer commented 13 years ago

I missed that one (wt/qt). However, that's still strange. If you don't supply a query type, 'standard' is the default, so not supplying it or supplying 'standard' should be the same, unless you have a special config in your solrconfig.xml file.

As for the query object, the reference to dismax in _componentTypes is only a class mapping, not an actual component instance.
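
To illustrate the difference between a class mapping and a component instance, here is a minimal sketch of on-demand loading in plain PHP. The class names `SelectQuery` and `DisMaxComponent` are hypothetical and this is not Solarium's actual code; it only mirrors the shape of the dump above, where `_componentTypes` is populated while `_components` is empty.

```php
<?php
// Hypothetical classes for illustration only; not Solarium's actual code.
class DisMaxComponent
{
}

class SelectQuery
{
    // Type name => class name. Always populated, even for unused components.
    protected $componentTypes = array('dismax' => 'DisMaxComponent');

    // Instances created so far. Empty until a component is requested.
    protected $components = array();

    // Create the component on first access, then reuse the same instance.
    public function getComponent($type)
    {
        if (!isset($this->components[$type])) {
            $class = $this->componentTypes[$type];
            $this->components[$type] = new $class();
        }
        return $this->components[$type];
    }

    public function getComponents()
    {
        return $this->components;
    }
}

$query = new SelectQuery();
var_dump(count($query->getComponents())); // int(0): mapping exists, no instance yet
$query->getComponent('dismax');
var_dump(count($query->getComponents())); // int(1): instance created on first use
```

In this sketch, a component can only affect the request once something has asked for it, which is why a mapping entry alone does not mean dismax is active.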

ebeyrent commented 13 years ago

This wasn't an issue with the 1.0 library. It started this morning when I updated to 2.0 RC1. I don't think I should be getting different results if there was an issue with the solrconfig.xml file.

I'm grasping at straws here, I don't understand why this isn't working.

basdenooijer commented 13 years ago

The problem is I can't reproduce the issue. In the examples dir of the 2.0.0-RC1 release you can find example 2.3, which as far as I can see does exactly the same. I can run this without issue on the solr example index (included with a standard solr release, see the readme in the example dir)

ebeyrent commented 13 years ago

How would I force the qt param in my query?

basdenooijer commented 13 years ago

Use these three lines:

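// Build the request from the query, then force the standard request handler.
$request = $solr->createRequest($query)->addParam('qt', 'standard');
// Execute the modified request and turn the raw response into a result object.
$response = $solr->executeRequest($request);
$result = $solr->createResult($query, $response);

ebeyrent commented 13 years ago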

Sweet - that works perfectly.

basdenooijer commented 13 years ago

Ok! It's a workaround, so I'd like to investigate the issue. Especially since this works in 1.0.

I don't know the nature of the project you're working on, but would it be possible for me to take a look at the Solr schema.xml and solrconfig.xml? Maybe then I could reproduce the issue and resolve it. If not, I understand.

ebeyrent commented 13 years ago

Sure thing. Basically, I have a custom-built legacy CMS and a few Drupal sites that are indexing their content in Solr. The schema uses the base definitions provided by the Drupal Apache Solr module, and the legacy CMS is mostly using dynamic fields. The solrconfig.xml also comes from the Drupal module.

Both are here: https://gist.github.com/1029599

basdenooijer commented 13 years ago

Thanks! Now I've finally found the issue. The solrconfig.xml has the following settings:

<requestHandler name="partitioned" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="defType">dismax</str>

In short, this request handler is set as the default, and it sets the query type to dismax (via defType). The default select-all query (*:*) is not valid for dismax. If you set the qt param to 'standard' in your request, you bypass this default handler and use the standard one, so select all works.

This also disables dismax for all your other searches, which might not be what you want if you search on user input. In that case you would be better off using the dismax component with the setQueryAlternative method. You can supply a search query in normal syntax (so select all works) that will be used if the main query is empty (no user input).

ebeyrent commented 13 years ago

Do you have any documentation for that method? I'm not sure what values are valid to pass in.

 <?php 
 $query->getDisMax()->setQueryAlternative($queryAlternative); 
 ?>

basdenooijer commented 13 years ago

Lots of docs still need to be written for 2.0, including for dismax. QueryAlternative should be a query string in normal query syntax (no dismax) similar to what you would use for setQuery on the select query object. This should work, though I haven't tested it:

 <?php 
 $query->getDisMax()->setQueryAlternative('*:*'); 
 ?>
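
At the level of raw Solr parameters, the query alternative corresponds (as far as I know) to the dismax `q.alt` parameter, which Solr uses when `q` is missing or blank. The following is a minimal sketch in plain PHP of how a request could choose between the two; `buildDismaxParams` is a hypothetical helper, not part of Solarium, and it assumes the handler already defaults to dismax as in the solrconfig.xml above.

```php
<?php
// Hypothetical helper: pick request parameters for a handler that defaults to dismax.
function buildDismaxParams($userInput)
{
    // q.alt is written in normal (non-dismax) syntax, so select all is allowed here.
    $params = array('q.alt' => '*:*');

    // Only send q when the user actually typed something;
    // with no q, Solr falls back to q.alt.
    $keywords = trim($userInput);
    if ($keywords !== '') {
        $params['q'] = $keywords;
    }

    return $params;
}

print_r(buildDismaxParams('penne')); // contains both q.alt and q
print_r(buildDismaxParams('   '));   // contains only q.alt
```

With this approach dismax stays enabled for keyword searches, and an empty search box still returns every document that matches the filter queries.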