collective / collective.recipe.solrinstance

Buildout recipe to configure a Solr instance
https://pypi.python.org/pypi/collective.recipe.solrinstance
Solr4 #5

Closed silviot closed 11 years ago

silviot commented 11 years ago

I tried both single-core and multi-core setups and they seem to work well. I haven't tried it with collective.solr; someone should do that before merging, and I'll check when I have time. In the meantime I'm opening this pull request so that anyone who wants to work on Solr 4 support can start from here.

lukasgraf commented 11 years ago

I tested this with Solr 4.0.0 and current master of collective.solr. The recipe worked like a charm, no problems whatsoever.

To actually use collective.solr with Solr 4, there's at least one change that needs to be made: in Lucene 4's query parser syntax, / (forward slash) is also considered a special character and needs to be backslash-escaped. This can be achieved by modifying the query_tokenizer regular expression in collective/solr/queryparser.py accordingly.

With this change I tested live search queries, simple queries from the advanced search form and facets - so far everything worked.
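
To illustrate the escaping rule mentioned above, here is a minimal standalone sketch in Python. It is not the actual query_tokenizer code from collective.solr; the function name escape_term and the regular expression are assumptions made for this example, based on the special characters listed in Lucene 4's classic query parser syntax.

```python
import re

# Special characters in Lucene 4's classic query parser syntax.
# "/" is the one that is new compared to Lucene 3.x.
# Hypothetical helper for illustration; not taken from collective.solr.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_term(term):
    """Backslash-escape Lucene 4 special characters in a literal term."""
    return _SPECIAL.sub(r'\\\1', term)


# A path-like value now needs its slashes escaped.
print(escape_term("/Plone/news"))
```

A real fix would have to fit into the existing tokenizer rather than escape the whole string, since operators the user typed on purpose (quotes, wildcards) must stay unescaped.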

davidjb commented 11 years ago

Tested with Solr 4.0.0 and a development version of Sunburnt. The only issue I've seen is that when starting solr-instance fg, I get a log message:

SEVERE: org.apache.solr.common.SolrException: undefined field text

The rest of the log is at http://pastie.org/5440374. My configuration is super-simple for test purposes - just a single ID field, not named 'text' - and I can't see any reference to such a field in the default configuration.

That said, everything seems to work despite this 'error' happening.

lukasgraf commented 11 years ago

I got the same warning as @davidjb, forgot to mention it.

davidjb commented 11 years ago

I've also noticed that the start/stop functionality of solr-instance isn't perfect either - frequently it will say 'started' with a certain PID but actually be running with another (thus 'stop' doesn't work). Unsure if this is an issue in general with the recipe or just with Solr 4.
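
One way to investigate the PID mismatch described above is to compare the recorded PID with what is actually running. The sketch below assumes the start script writes its PID to a pidfile; the helper names and the pidfile path are hypothetical, and the liveness check relies on POSIX signal semantics.

```python
import os


def read_pid(pidfile):
    """Read an integer PID from a pidfile, or None if missing or malformed."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence and permissions
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Hypothetical pidfile location; adjust to wherever the script records it.
pid = read_pid("var/solr/solr.pid")
print("recorded PID:", pid, "alive:", pid_is_running(pid))
```

If the recorded PID is not alive while Solr is still answering requests, the script most likely recorded the PID of a wrapper process rather than the JVM.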

lukasgraf commented 11 years ago

The warning about undefined field text is caused by the <str name="df">text</str> directive in solrconfig.xml:

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <!-- ... -->
      <str name="df">text</str>
    </lst>
  </requestHandler>

In the solrconfig.xml template in collective.recipe.solrinstance for Solr 3.x the <str name="df" /> directive is missing entirely. So it should probably be dropped from the Solr 4 template or changed to searchableText.
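
The mismatch described above (a df default naming a field the schema does not define) can be detected mechanically. The following is a rough Python sketch using only the standard library; the function name and the sample XML are made up for this example, and it ignores dynamicField patterns.

```python
import xml.etree.ElementTree as ET

# Minimal sample documents, invented for this example.
SOLRCONFIG = """
<config>
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="df">text</str>
    </lst>
  </requestHandler>
</config>
"""

SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
  </fields>
</schema>
"""


def undefined_default_fields(solrconfig_xml, schema_xml):
    """Return (handler, df) pairs whose df value is not a field in the schema."""
    defined = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    missing = []
    for handler in ET.fromstring(solrconfig_xml).iter("requestHandler"):
        for df in handler.findall("./lst[@name='defaults']/str[@name='df']"):
            if df.text not in defined:
                missing.append((handler.get("name"), df.text))
    return missing


print(undefined_default_fields(SOLRCONFIG, SCHEMA))  # [('/select', 'text')]
```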

davidjb commented 11 years ago

According to Solr's documentation (http://wiki.apache.org/solr/SearchHandler#df), the df option overrides the default field if one is specified in schema.xml. That is a problem here, because the recipe already configures a default field in schema.xml: if a user sets a default field and also happens to have a field named text in their schema, then text silently becomes the default field instead. My suggestion is to remove df entirely for now (or else make it configurable, but not enabled by default).

davidjb commented 11 years ago

Also, there are several other references to the text field, along with various other default settings in solrconfig.xml that probably shouldn't be there either (e.g. settings under the /browse handler for queries, more-like-this, faceting, etc.). They appear to be document-specific.

davidjb commented 11 years ago

I've pushed this into the solr4 branch in this repository so that it's easier (possible) for all of us to work on it. See https://github.com/collective/collective.recipe.solrinstance/tree/solr4. I'll open a related issue for Solr 4 support so it's obvious what's happening.

davidjb commented 11 years ago

After some additional work, I've now merged the solr4 branch into master. Thanks again for your work @silviot, greatly appreciated.