sul-dlss / sw_index_tests

(Relevancy) tests against the Stanford SearchWorks Solr Index, using rspec-solr and rsolr gems.

Investigate Japanese title searches that return no results #204

Open cbeer opened 6 years ago

cbeer commented 6 years ago

  225) Japanese Title searches blocking ブロック化 (katakana-kanji mix) behaves like expected result size title search has between 15 and 20 results
       Failure/Error: expect(resp.size).to be >= min

         expected: >= 15
              got:    0
       Shared Example Group: "expected result size" called from ./spec/cjk/japanese_title_spec.rb:8
       # ./spec/support/shared_examples_cjk.rb:7:in `block (2 levels) in <top (required)>'

  226) Japanese Title searches blocking ブロック化 (katakana-kanji mix) behaves like best matches first finds "9855019" in first 1 results
       Failure/Error: expect(resp).to include(id_list).in_first(num).results

         expected response to include document "9855019" in first 1 results: {"responseHeader"=>{"zkConnected"=>true, "status"=>0, "QTime"=>8, "params"=>{"mm"=>"3<86%", "q"=>"{!qf=$qf_title_cjk pf=$pf_title_cjk pf3=$pf3_title_cjk pf2=$pf2_title_cjk}ブロック化", "qs"=>"0", "fl"=>"id", "testing"=>"sw_index_test", "facet"=>"false", "wt"=>"json"}}, "response"=>{"numFound"=>0, "start"=>0, "docs"=>[]}}
         Diff:
         @@ -1,2 +1,15 @@
         -["9855019"]
         +{"responseHeader"=>
         +  {"zkConnected"=>true,
         +   "status"=>0,
         +   "QTime"=>8,
         +   "params"=>
         +    {"mm"=>"3<86%",
         +     "q"=>
         +      "{!qf=$qf_title_cjk pf=$pf_title_cjk pf3=$pf3_title_cjk pf2=$pf2_title_cjk}ブロック化",
         +     "qs"=>"0",
         +     "fl"=>"id",
         +     "testing"=>"sw_index_test",
         +     "facet"=>"false",
         +     "wt"=>"json"}},
         + "response"=>{"numFound"=>0, "start"=>0, "docs"=>[]}}
       Shared Example Group: "best matches first" called from ./spec/cjk/japanese_title_spec.rb:9
       # ./spec/support/shared_examples_cjk.rb:30:in `block (2 levels) in <top (required)>'

cbeer commented 6 years ago

The former results are getting n-grammed differently. The current #1 hit seems unfortunate: it gets parsed as an 18-gram, which seems suspect. Other n-gramming differences seem potentially better.
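
To make the n-gram comparison concrete, here is a minimal Ruby sketch of overlapping character n-grams for the failing query. `char_ngrams` is a hypothetical helper written for illustration; it is not part of sw_index_tests, rspec-solr, or Solr, and it does not reproduce the index's actual analysis chain, which may tokenize and fold characters differently.

```ruby
# Hypothetical helper (not from sw_index_tests or Solr): split a string into
# overlapping character n-grams. Returns an empty array when the string is
# shorter than n characters.
def char_ngrams(text, n)
  text.chars.each_cons(n).map(&:join)
end

query = "ブロック化" # the katakana-kanji mix from the failing spec

# Unigrams: one token per character.
puts char_ngrams(query, 1).inspect

# Bigrams: each adjacent pair of characters becomes a token.
puts char_ngrams(query, 2).inspect

# An n-gram as long as the whole string is a single token, so it can only
# match text that contains the entire run of characters.
puts char_ngrams(query, query.length).inspect
```

Under this simplified model the five-character query yields four bigrams, while a long run indexed as one 18-gram would produce a single token that none of those bigrams match. Comparing the field's real token output for the old and new index would confirm whether that is what happens here.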