swissbib / vufind

A library resource discovery portal designed and developed for libraries by libraries
GNU General Public License v2.0

Advanced search on All Fields returns a different number of hits than simple search #671

Closed witzigs closed 5 years ago

witzigs commented 5 years ago

An advanced search over All Fields returns a different number of hits than a simple search, although, as far as I know, both are configured identically. Example: spyri heidi. Simple search: 2358 hits; advanced search: 903 hits.

This applies to all three interfaces.

liowalter commented 5 years ago

Infos :

Simple search (2358 results) : http://localhost:8984/solr/green/select?fl=%2A%2Cscore&spellcheck=false&facet=true&facet.limit=100&facet.field=%7B%21ex%3Dunion_filter%7Dunion&facet.field=%7B%21ex%3DnavAuthor_full_filter%7DnavAuthor_full&facet.field=%7B%21ex%3Dformat_hierarchy_str_mv_filter%7Dformat_hierarchy_str_mv&facet.field=%7B%21ex%3Dlanguage_filter%7Dlanguage&facet.field=navSub_green&facet.field=%7B%21ex%3DnavSubform_filter%7DnavSubform&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score+desc&q.op=AND&hl=true&hl.simple.pre=%7B%7B%7B%7BSTART_HILITE%7D%7D%7D%7D&hl.simple.post=%7B%7B%7B%7BEND_HILITE%7D%7D%7D%7D&hl.fl=fulltext&hl.fl=0%2Cauthor%2Cauthor_additional%2Cauthor_additional_dsv11_txt_mv%2Cauthor_additional_gnd_txt_mv%2Cseries%2Ctopic%2Caddfields_txt_mv%2Cpublplace_txt_mv%2Cpublplace_dsv11_txt_mv%2Cpublplace_gnd_txt_mv%2Cfulltext%2Clocalcode%2Ctitle_short%2Ctitle_alt%2Ctitle%2Ctitle_sub%2Ctitle_old%2Ctitle_new%2Ctitle_additional_dsv11_txt_mv%2Ctitle_additional_gnd_txt_mv%2Cpublplace_additional_gnd_txt_mv%2Ccallnumber%2Cctrlnum%2CpublishDate%2Cisbn%2Ccancisbn_isn_mv%2Cvariant_isbn_isn_mv%2Cissn%2Cincoissn_isn_mv%2Cid&hl.fragsize=250&wt=json&json.nl=arrarr&rows=20&start=0&qf=title_short%5E1000+title_alt%5E200+title%5E200+title_sub%5E200+title_old%5E200+title_new%5E200+author%5E750+author_additional%5E100+author_additional_dsv11_txt_mv%5E100+title_additional_dsv11_txt_mv%5E100+author_additional_gnd_txt_mv%5E100+title_additional_gnd_txt_mv%5E100+publplace_additional_gnd_txt_mv%5E100+series%5E200+topic%5E500+addfields_txt_mv%5E50+publplace_txt_mv%5E25+publplace_dsv11_txt_mv%5E25+fulltext+callnumber%5E1000+ctrlnum%5E1000+publishDate+isbn+cancisbn_isn_mv+variant_isbn_isn_mv+issn+incoissn_isn_mv+localcode+id&qt=edismax&pf=title_short%5E1000&ps=2&bf=recip%28abs%28ms%28NOW%2FDAY%2Cfreshness%29%29%2C3.16e-10%2C100%2C100%29&mm=0%25&q=heidi+spyri

The query string is heidi+spyri

Advanced search (903 results) : http://localhost:8984/solr/green/select?fl=%2A%2Cscore&spellcheck=false&facet=true&facet.limit=100&facet.field=%7B%21ex%3Dunion_filter%7Dunion&facet.field=%7B%21ex%3DnavAuthor_full_filter%7DnavAuthor_full&facet.field=%7B%21ex%3Dformat_hierarchy_str_mv_filter%7Dformat_hierarchy_str_mv&facet.field=%7B%21ex%3Dlanguage_filter%7Dlanguage&facet.field=navSub_green&facet.field=%7B%21ex%3DnavSubform_filter%7DnavSubform&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score+desc&q.op=AND&hl=true&hl.simple.pre=%7B%7B%7B%7BSTART_HILITE%7D%7D%7D%7D&hl.simple.post=%7B%7B%7B%7BEND_HILITE%7D%7D%7D%7D&hl.fl=fulltext&hl.fl=%2A&hl.fragsize=250&wt=json&json.nl=arrarr&rows=20&start=0&q=%28%28%28_query_%3A%22%7B%21edismax+qf%3D%5C%22title_short%5E1000+title_alt%5E200+title%5E200+title_sub%5E200+title_old%5E200+title_new%5E200+author%5E750+author_additional%5E100+author_additional_dsv11_txt_mv%5E100+title_additional_dsv11_txt_mv%5E100+author_additional_gnd_txt_mv%5E100+title_additional_gnd_txt_mv%5E100+publplace_additional_gnd_txt_mv%5E100+series%5E200+topic%5E500+addfields_txt_mv%5E50+publplace_txt_mv%5E25+publplace_dsv11_txt_mv%5E25+fulltext+callnumber%5E1000+ctrlnum%5E1000+publishDate+isbn+cancisbn_isn_mv+variant_isbn_isn_mv+issn+incoissn_isn_mv+localcode+id%5C%22+pf%3D%5C%27title_short%5E1000%5C%27+ps%3D%5C%272%5C%27+bf%3D%5C%27recip%28abs%28ms%28NOW%2FDAY%2Cfreshness%29%29%2C3.16e-10%2C100%2C100%29%5C%27+mm%3D%5C%270%25%5C%27%7Dheidi+spyri%22%29%29%29

Here the query is passed using the _query_ magic field:

(((_query_:"{!edismax qf=\"title_short^1000 title_alt^200 title^200 title_sub^200 title_old^200 title_new^200 author^750 author_additional^100 author_additional_dsv11_txt_mv^100 title_additional_dsv11_txt_mv^100 author_additional_gnd_txt_mv^100 title_additional_gnd_txt_mv^100 publplace_additional_gnd_txt_mv^100 series^200 topic^500 addfields_txt_mv^50 publplace_txt_mv^25 publplace_dsv11_txt_mv^25 fulltext callnumber^1000 ctrlnum^1000 publishDate isbn cancisbn_isn_mv variant_isbn_isn_mv issn incoissn_isn_mv localcode id\" pf=\'title_short^1000\' ps=\'2\' bf=\'recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)\' mm=\'0%\'}heidi spyri")))

But as of Solr 7.2, this is no longer supported by default:

https://lucene.apache.org/solr/guide/7_2/solr-upgrade-notes.html#upgrading-from-7-x-releases

The eDisMax parser by default no longer allows subqueries that specify a Solr parser using either local parameters, or the older _query_ magic field trick. For example, {!prefix f=myfield v=enterp} or _query_:"{!prefix f=myfield v=enterp}" are not supported by default any longer. If you want to allow power-users to do this, set uf=*,_query_ or some other value that includes _query_. If you need full backwards compatibility for the time being, use luceneMatchVersion=7.1.0 or something earlier.
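One way to try the workaround named in the upgrade note is to add the uf parameter to a logged request and rerun it. The sketch below is only an illustration: the helper name allow_magic_query and the shortened URL are made up for this example, the uf value is the one quoted in the note, and it assumes the top-level parser handling the request is eDisMax, which this thread does not confirm.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def allow_magic_query(solr_url: str, uf: str = "*,_query_") -> str:
    """Return solr_url with a `uf` request parameter added (or replaced)."""
    parts = urlsplit(solr_url)
    # parse_qsl keeps repeated parameters (e.g. several facet.field entries) in order.
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "uf"
    ]
    params.append(("uf", uf))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Shortened, hypothetical version of the advanced-search request (qf truncated).
url = (
    "http://localhost:8984/solr/green/select?q.op=AND&wt=json"
    "&q=_query_%3A%22%7B%21edismax+qf%3D%5C%22title_short%5E1000%5C%22%7Dheidi+spyri%22"
)
patched = allow_magic_query(url)
print(patched)
```

If the hit count of the patched request matches the simple search, the changed eDisMax default is a likely cause; if not, the difference lies elsewhere.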

The code which triggers this behaviour (magic field instead of the q field) is this: https://github.com/swissbib/vufind/blob/master/module/VuFindSearch/src/VuFindSearch/Backend/Solr/SearchHandler.php#L426
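The shape of the wrapped query can be sketched as follows. This is not VuFind's PHP implementation, only a Python illustration that reproduces the string format visible in the logged request; the function name wrap_as_magic_query and the shortened qf value are hypothetical.

```python
def wrap_as_magic_query(terms: str, qf: str, local_params: dict[str, str]) -> str:
    """Build a (((_query_:"{!edismax ...}terms"))) string like the logged one."""
    # In the logged request, qf is wrapped in escaped double quotes and the
    # remaining local parameters in escaped single quotes; mirror that here.
    pieces = [f'qf=\\"{qf}\\"']
    pieces += [f"{name}=\\'{value}\\'" for name, value in local_params.items()]
    inner = "{!edismax " + " ".join(pieces) + "}" + terms
    return f'(((_query_:"{inner}")))'


# Shortened, hypothetical field list; the real qf is much longer.
print(wrap_as_magic_query("heidi spyri", "title_short^1000 author^750", {"mm": "0%"}))
```

The point of the sketch is that qf, mm and the other settings travel inside the q value as local parameters of a nested parser, instead of as top-level request parameters.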

liowalter commented 5 years ago

With VuFind version "Swissbib 4.3.0-core4.1.0" and Solr 7, there is the same problem.

liowalter commented 5 years ago

With VuFind version 5 and Solr 4, simple search and advanced search return the same number of results. But "Heidi Spyri" returns 16511 results instead of ~2300.

liowalter commented 5 years ago

Using vufind-core, we have the following :

simple search : http://localhost:8080/solr/biblio/select?fl=%2A%2Cscore&spellcheck=false&sort=score+desc&hl=true&hl.simple.pre=%7B%7B%7B%7BSTART_HILITE%7D%7D%7D%7D&hl.simple.post=%7B%7B%7B%7BEND_HILITE%7D%7D%7D%7D&wt=json&json.nl=arrarr&rows=20&start=0&qf=title_short%5E750+title_full_unstemmed%5E600+title_full%5E400+title%5E500+title_alt%5E200+title_new%5E100+series%5E50+series2%5E30+author%5E300+author_fuller%5E150+contents%5E10+topic_unstemmed%5E550+topic%5E500+geographic%5E300+genre%5E300+allfields_unstemmed%5E10+fulltext_unstemmed%5E10+allfields+fulltext+description+isbn+issn+long_lat_display&qt=edismax&mm=0%25&hl.fl=title_short%2Ctitle_full_unstemmed%2Ctitle_full%2Ctitle%2Ctitle_alt%2Ctitle_new%2Cseries%2Cseries2%2Cauthor%2Cauthor_fuller%2Ccontents%2Ctopic_unstemmed%2Ctopic%2Cgeographic%2Cgenre%2Callfields_unstemmed%2Cfulltext_unstemmed%2Callfields%2Cfulltext%2Cdescription%2Cisbn%2Cissn%2Clong_lat_display&q=united+states%2A

advanced search http://localhost:8080/solr/biblio/select?fl=%2A%2Cscore&spellcheck=true&facet=true&facet.limit=30&facet.field=topic_facet&facet.field=institution&facet.field=building&facet.field=format&facet.field=callnumber-first&facet.field=author_facet&facet.field=language&facet.field=genre_facet&facet.field=era_facet&facet.field=geographic_facet&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score+desc&hl=true&hl.simple.pre=%7B%7B%7B%7BSTART_HILITE%7D%7D%7D%7D&hl.simple.post=%7B%7B%7B%7BEND_HILITE%7D%7D%7D%7D&spellcheck.dictionary=default&wt=json&json.nl=arrarr&rows=20&start=0&spellcheck.q=united+states&hl.fl=%2A&q=%28%28%28_query_%3A%22%7B%21edismax+qf%3D%5C%22title_short%5E750+title_full_unstemmed%5E600+title_full%5E400+title%5E500+title_alt%5E200+title_new%5E100+series%5E50+series2%5E30+author%5E300+author_fuller%5E150+contents%5E10+topic_unstemmed%5E550+topic%5E500+geographic%5E300+genre%5E300+allfields_unstemmed%5E10+fulltext_unstemmed%5E10+allfields+fulltext+description+isbn+issn+long_lat_display%5C%22+mm%3D%5C%270%25%5C%27%7Dunited+states%22%29%29%29

also using the _query_ magic field:

(((_query_:"{!edismax qf=\"title_short^750 title_full_unstemmed^600 title_full^400 title^500 title_alt^200 title_new^100 series^50 series2^30 author^300 author_fuller^150 contents^10 topic_unstemmed^550 topic^500 geographic^300 genre^300 allfields_unstemmed^10 fulltext_unstemmed^10 allfields fulltext description isbn issn long_lat_display\" mm=\'0%\'}united states")))

But this gives the same results.

liowalter commented 5 years ago

If I search "heidi AND spyri" in the advanced search, then it works :

https://www.swissbib.ch/Search/Results?sort=relevance&advancedSearchFormRequest=advancedSearchFormRequest&join=AND&bool0%5B%5D=AND&lookfor0%5B%5D=heidi+AND+spyri&type0%5B%5D=AllFields&lookfor0%5B%5D=&type0%5B%5D=AllFields&lookfor0%5B%5D=&type0%5B%5D=AllFields&daterange%5B%5D=publishDate&publishDatefrom=&publishDateto=&limit=20

Maybe something related to this :

https://github.com/vufind-org/vufind/blob/master/solr/vufind/biblio/conf/solrconfig.xml#L370

which is missing in our config ?

guenterh commented 5 years ago

vufind / swissbib queries without URL encoding

simple search vufind

http://localhost:8080/solr/biblio/select?fl=*,score&spellcheck=false&sort=score desc&hl=true&hl.simple.pre={{{{START_HILITE}}}}&hl.simple.post={{{{END_HILITE}}}}&wt=json&json.nl=arrarr&rows=20&start=0&qf=title_short^750 title_full_unstemmed^600 title_full^400 title^500 title_alt^200 title_new^100 series^50 series2^30 author^300 author_fuller^150 contents^10 topic_unstemmed^550 topic^500 geographic^300 genre^300 allfields_unstemmed^10 fulltext_unstemmed^10 allfields fulltext description isbn issn long_lat_display&qt=edismax&mm=0%&hl.fl=title_short,title_full_unstemmed,title_full,title,title_alt,title_new,series,series2,author,author_fuller,contents,topic_unstemmed,topic,geographic,genre,allfields_unstemmed,fulltext_unstemmed,allfields,fulltext,description,isbn,issn,long_lat_display&q=united states*

advanced search vufind http://localhost:8080/solr/biblio/select?fl=*,score&spellcheck=true&facet=true&facet.limit=30&facet.field=topic_facet&facet.field=institution&facet.field=building&facet.field=format&facet.field=callnumber-first&facet.field=author_facet&facet.field=language&facet.field=genre_facet&facet.field=era_facet&facet.field=geographic_facet&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score desc&hl=true&hl.simple.pre={{{{START_HILITE}}}}&hl.simple.post={{{{END_HILITE}}}}&spellcheck.dictionary=default&wt=json&json.nl=arrarr&rows=20&start=0&spellcheck.q=united states&hl.fl=*&q=(((_query_:"{!edismax qf=\"title_short^750 title_full_unstemmed^600 title_full^400 title^500 title_alt^200 title_new^100 series^50 series2^30 author^300 author_fuller^150 contents^10 topic_unstemmed^550 topic^500 geographic^300 genre^300 allfields_unstemmed^10 fulltext_unstemmed^10 allfields fulltext description isbn issn long_lat_display\" mm=\'0%\'}united states")))

simple search swissbib http://localhost:8984/solr/green/select?fl=*,score&spellcheck=false&facet=true&facet.limit=100&facet.field={!ex=union_filter}union&facet.field={!ex=navAuthor_full_filter}navAuthor_full&facet.field={!ex=format_hierarchy_str_mv_filter}format_hierarchy_str_mv&facet.field={!ex=language_filter}language&facet.field=navSub_green&facet.field={!ex=navSubform_filter}navSubform&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score desc&q.op=AND&hl=true&hl.simple.pre={{{{START_HILITE}}}}&hl.simple.post={{{{END_HILITE}}}}&hl.fl=fulltext&hl.fl=0,author,author_additional,author_additional_dsv11_txt_mv,author_additional_gnd_txt_mv,series,topic,addfields_txt_mv,publplace_txt_mv,publplace_dsv11_txt_mv,publplace_gnd_txt_mv,fulltext,localcode,title_short,title_alt,title,title_sub,title_old,title_new,title_additional_dsv11_txt_mv,title_additional_gnd_txt_mv,publplace_additional_gnd_txt_mv,callnumber,ctrlnum,publishDate,isbn,cancisbn_isn_mv,variant_isbn_isn_mv,issn,incoissn_isn_mv,id&hl.fragsize=250&wt=json&json.nl=arrarr&rows=20&start=0&qf=title_short^1000 title_alt^200 title^200 title_sub^200 title_old^200 title_new^200 author^750 author_additional^100 author_additional_dsv11_txt_mv^100 title_additional_dsv11_txt_mv^100 author_additional_gnd_txt_mv^100 title_additional_gnd_txt_mv^100 publplace_additional_gnd_txt_mv^100 series^200 topic^500 addfields_txt_mv^50 publplace_txt_mv^25 publplace_dsv11_txt_mv^25 fulltext callnumber^1000 ctrlnum^1000 publishDate isbn cancisbn_isn_mv variant_isbn_isn_mv issn incoissn_isn_mv localcode id&qt=edismax&pf=title_short^1000&ps=2&bf=recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)&mm=0%&q=heidi spyri

advanced search swissbib http://localhost:8984/solr/green/select?fl=*,score&spellcheck=false&facet=true&facet.limit=100&facet.field={!ex=union_filter}union&facet.field={!ex=navAuthor_full_filter}navAuthor_full&facet.field={!ex=format_hierarchy_str_mv_filter}format_hierarchy_str_mv&facet.field={!ex=language_filter}language&facet.field=navSub_green&facet.field={!ex=navSubform_filter}navSubform&facet.field=publishDate&facet.sort=count&facet.mincount=1&sort=score desc&q.op=AND&hl=true&hl.simple.pre={{{{START_HILITE}}}}&hl.simple.post={{{{END_HILITE}}}}&hl.fl=fulltext&hl.fl=*&hl.fragsize=250&wt=json&json.nl=arrarr&rows=20&start=0&q=(((_query_:"{!edismax qf=\"title_short^1000 title_alt^200 title^200 title_sub^200 title_old^200 title_new^200 author^750 author_additional^100 author_additional_dsv11_txt_mv^100 title_additional_dsv11_txt_mv^100 author_additional_gnd_txt_mv^100 title_additional_gnd_txt_mv^100 publplace_additional_gnd_txt_mv^100 series^200 topic^500 addfields_txt_mv^50 publplace_txt_mv^25 publplace_dsv11_txt_mv^25 fulltext callnumber^1000 ctrlnum^1000 publishDate isbn cancisbn_isn_mv variant_isbn_isn_mv issn incoissn_isn_mv localcode id\" pf=\'title_short^1000\' ps=\'2\' bf=\'recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)\' mm=\'0%\'}heidi spyri")))
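The decoding done by hand above can also be scripted, which makes it easier to see which request parameters differ between a simple and an advanced search. The following is a minimal sketch using only the Python standard library; the helper names solr_params and diff_params are invented here, and the two URLs are shortened, hypothetical versions of the logged swissbib requests (qf truncated, facet and highlighting parameters dropped).

```python
from urllib.parse import parse_qs, urlsplit


def solr_params(url: str) -> dict[str, list[str]]:
    """Decode the query string of a Solr select URL into {name: [values]}."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


def diff_params(a: str, b: str) -> dict[str, tuple]:
    """Return the parameters whose values differ between two request URLs."""
    pa, pb = solr_params(a), solr_params(b)
    return {
        name: (pa.get(name), pb.get(name))
        for name in sorted(set(pa) | set(pb))
        if pa.get(name) != pb.get(name)
    }


simple = (
    "http://localhost:8984/solr/green/select?q.op=AND&qt=edismax"
    "&qf=title_short%5E1000+author%5E750&mm=0%25&q=heidi+spyri"
)
advanced = (
    "http://localhost:8984/solr/green/select?q.op=AND"
    "&q=%28%28%28_query_%3A%22%7B%21edismax+qf%3D%5C%22title_short%5E1000+author%5E750%5C%22"
    "+mm%3D%5C%270%25%5C%27%7Dheidi+spyri%22%29%29%29"
)

for name, (left, right) in diff_params(simple, advanced).items():
    print(f"{name}: simple={left} advanced={right}")
```

With these shortened URLs the diff lists mm, q, qf and qt, while q.op is identical in both: the simple search sends the eDisMax settings as top-level parameters, and the advanced search carries them inside q.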