gbv / cmo

Corpus Musicae Ottomanicae
GNU General Public License v3.0

make search masks not case-sensitive / diacritics #169

Closed fabiancremer closed 5 years ago

fabiancremer commented 5 years ago

The generic search field also finds terms with diacritics when normalized (unaccented) characters are entered. The fields in the search masks should behave the same way.

Example: "atimi bagladim" finds https://corpus-musicae-ottomanicae.de/receive/cmo_expression_00000888 only when placed in the generic search field.

sebhofmann commented 5 years ago

The search "atimi bagladim" matches because of the incip. The incip is indexed as text_tr, so searching for the incip in the search mask will also match. The title is indexed as text_ar because its xml:lang is ota-arab.

text_ar indexes Atımı bağladım as atımı bağladım and queries atimi bagladim as atimi bagladim

text_tr indexes Atımı bağladım as at bagladı and queries atimi bagladim as at bagladı

I cannot tell whether the language label is wrong here or whether Solr is not working correctly.
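To illustrate the difference described above, here is a minimal Python sketch. It is not Solr's analyzer chain: `fold` is a hypothetical helper that lowercases and strips diacritics, and the explicit handling of the dotless ı is an assumption about what such a folding step needs. It only shows that a lowercase-only index (as text_ar appears to behave for this title) cannot match the ASCII query, while a folded index can.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase and strip diacritics (hypothetical helper, not Solr's analyzer)."""
    # The dotless ı has no Unicode decomposition, so it is mapped by hand.
    text = text.replace("ı", "i").replace("İ", "i")
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop combining marks such as the breve that remains after decomposing ğ.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


title = "Atımı bağladım"
query = "atimi bagladim"

# Lowercasing alone keeps the diacritics, so indexed and queried terms differ.
lowercase_only_match = title.lower() == query.lower()

# Folding both sides makes the indexed and queried terms identical.
folded_match = fold(title) == fold(query)

print(lowercase_only_match, folded_match)
```

If the search masks are meant to act like the generic search field, both the index side and the query side of the title field would need some equivalent of this folding step.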

annplaksin commented 5 years ago

It seems like the incipit in the example is labelled as ota-arab. That label seems wrong to me at first glance... Is the indexing somehow corrected, or why else is it indexed as text_tr?

annplaksin commented 5 years ago

The language labels for title and incipit in the example have been changed to turk-latin. Maybe it is possible to check whether Solr works correctly once the index for the title is updated.
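One way to run such a check might be Solr's field analysis handler, which reports how a field type tokenizes an index value and a query value. The sketch below only builds the request URL. The base URL and the core name `cmo` are hypothetical placeholders, and the handler path and parameter names are assumptions that should be verified against the installed Solr version.

```python
from urllib.parse import urlencode


def analysis_url(base_url: str, core: str, field_type: str,
                 index_value: str, query_value: str) -> str:
    """Build a field-analysis request URL (path and parameters are assumed)."""
    params = {
        "analysis.fieldtype": field_type,    # e.g. text_tr or text_ar
        "analysis.fieldvalue": index_value,  # text as it would be indexed
        "analysis.query": query_value,       # text as it would be searched
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/analysis/field?{urlencode(params)}"


# Hypothetical local Solr instance and core name.
url = analysis_url("http://localhost:8983/solr", "cmo", "text_tr",
                   "Atımı bağladım", "atimi bagladim")
print(url)
```

Comparing the output for text_tr and text_ar with the same two values should show whether the indexed and queried tokens line up for the title field.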

kkrebs commented 5 years ago

fixed in d99faa565480d5286407665a11bc198aff4a674e