tjake / Solandra

Solandra = Solr + Cassandra
Apache License 2.0

highlighting phrases #153

Open kRyszard opened 12 years ago

kRyszard commented 12 years ago

Is it possible to highlight a whole query phrase as one unit? For example, when I search for "United States" I want to get <em>United States</em> and not <em>United</em> <em>States</em>. I have searched extensively for an answer and tried every combination of the hl.mergeContiguous, hl.usePhrasesHighlighter and hl.highlightMultiTerm parameters, but I still cannot make it work.

My query is: http://localhost:8983/solandra/idxPosts.proj350_139/select?q=post_text:"Janusz Palikot"&hl=true&hl.fl=post_text&hl.mergeContiguous=true&hl.usePhrasesHighlighter=true&hl.highlightMultiTerm=true
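The URL above contains unescaped quotes and a space, so for reproducing the request from a script it may be safer to build it programmatically. This is only a minimal sketch: the helper name `build_highlight_url` and its arguments are hypothetical, and only parameters that already appear in the query above are used.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, field, phrase, extra_params=None):
    """Build a select URL for a phrase query with highlighting enabled.

    The function name and signature are illustrative, not part of Solandra.
    """
    params = {
        "q": f'{field}:"{phrase}"',  # phrase query, quotes get percent-encoded
        "hl": "true",
        "hl.fl": field,
    }
    if extra_params:
        params.update(extra_params)
    return f"{base_url}?{urlencode(params)}"


url = build_highlight_url(
    "http://localhost:8983/solandra/idxPosts.proj350_139/select",
    "post_text",
    "Janusz Palikot",
    {"hl.mergeContiguous": "true"},
)
print(url)
```

Encoding the phrase this way rules out the possibility that the quotes are lost before the request reaches the query parser.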

The response is: ... <arr name="post_text"><str>Tag: <em>janusz</em> <em>palikot</em> - Sowiniec: "Sowiniec"</str></arr> ...
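Until the highlighter itself returns a single tag, one possible stopgap is to merge adjacent highlight tags on the client. The sketch below is not a Solandra or Solr feature; it assumes the default <em> tags and that highlighted terms are separated only by whitespace. The function name `merge_adjacent_highlights` is hypothetical.

```python
import re

# A closing tag followed only by whitespace and then an opening tag.
_ADJACENT = re.compile(r"</em>(\s+)<em>")


def merge_adjacent_highlights(snippet):
    """Join neighbouring <em>...</em> runs, keeping the whitespace between them."""
    return _ADJACENT.sub(r"\1", snippet)


snippet = 'Tag: <em>janusz</em> <em>palikot</em> - Sowiniec: "Sowiniec"'
merged = merge_adjacent_highlights(snippet)
print(merged)
```

Note that this merges any two highlighted terms that happen to be adjacent, even when they did not match as part of the same phrase, so it is a cosmetic workaround rather than real phrase highlighting.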

My "post_text" field is: <field name="post_text" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" required="true" />

My "text" type is:

<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
    <filter class="solr.ReversedWildcardFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" />
  </analyzer>
</fieldType>

I also tried the FastVectorHighlighter with hl.useFastVectorHighlighter=true, but encountered an error:

Problem accessing /solandra/idxPosts.proj350_139/select. Reason: -6

java.lang.ArrayIndexOutOfBoundsException: -6
    at lucandra.TermFreqVector.getOffsets(TermFreqVector.java:224)
    at org.apache.lucene.search.vectorhighlight.FieldTermStack.<init>(FieldTermStack.java:100)
    at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getFieldFragList(FastVectorHighlighter.java:175)
    at org.apache.lucene.search.vectorhighlight.FastVectorHighlighter.getBestFragments(FastVectorHighlighter.java:166)
    at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByFastVectorHighlighter(DefaultSolrHighlighter.java:509)
    at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:376)
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
    ...

I am using the newest version of Solandra with Cassandra 1.0.3.

Can you help me, please?