gbv / cmo

Corpus Musicae Ottomanicae
GNU General Public License v3.0

Expressions: Normalized Incipits #166

Closed: fabiancremer closed this issue 4 years ago

fabiancremer commented 5 years ago

Incipits should have a normalized/standardized version, either via a vocabulary or via a second incipit field with a type attribute.

Use a controlling feature, e.g. autocomplete, a vocabulary, or a classification.

fabiancremer commented 5 years ago

Considering the content, a second field with a type attribute will be the best solution for this issue. It is impossible to create a vocabulary including all (incl. future) possible variants.

kkrebs commented 5 years ago

At the moment we use the label attribute to differentiate incipits, see https://corpus-musicae-ottomanicae.de/receive/cmo_expression_00002749

<incip>
  <incipText label="main" xml:lang="ota-arab"><p>Rāst getirüb fennile seyrėtdi hümāyı</p></incipText>
  <incipText xml:lang="ota-arab" label="original"><p>راست كتيروب فن ايله سير ايتدى همايى</p></incipText>
</incip>

We can specify a label "normalized" for this feature and put it in the Solr index for autocomplete functionality. But then we should use a drop-down list for incipit labels?!
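To make the idea concrete, here is a minimal Python sketch of how incipits could be grouped by their label attribute and how the "normalized" one could be picked out for indexing. It is not the project's actual indexing code: the field name `incipit_normalized`, the function names, and the third `incipText` element with placeholder text are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Sample based on the snippet above; the third element and its text are
# invented placeholders showing what a "normalized" entry might look like.
INCIP_XML = """
<incip>
  <incipText label="main" xml:lang="ota-arab"><p>Rāst getirüb fennile seyrėtdi hümāyı</p></incipText>
  <incipText xml:lang="ota-arab" label="original"><p>راست كتيروب فن ايله سير ايتدى همايى</p></incipText>
  <incipText label="normalized"><p>placeholder normalized incipit</p></incipText>
</incip>
"""


def incipits_by_label(incip_xml: str) -> dict:
    """Group the text of every incipText element by its label attribute."""
    root = ET.fromstring(incip_xml.strip())
    grouped = {}
    for node in root.iter("incipText"):
        label = node.get("label", "")  # unlabeled elements are grouped under ""
        # Join all nested text and collapse runs of whitespace.
        text = " ".join("".join(node.itertext()).split())
        grouped.setdefault(label, []).append(text)
    return grouped


def build_index_doc(expression_id: str, incip_xml: str) -> dict:
    """Build a dict that could be sent to a search index (hypothetical field names)."""
    grouped = incipits_by_label(incip_xml)
    return {
        "id": expression_id,
        "incipit_normalized": grouped.get("normalized", []),
    }


doc = build_index_doc("cmo_expression_00002749", INCIP_XML)
```

Only the normalized text ends up in the autocomplete field here, so spelling variants in the "main" and "original" incipits would not clutter the suggestions.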

kkrebs commented 4 years ago

Looks good to me. We should also change the input field for incipits so that it allows only predefined types.
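Restricting the input to predefined types could look roughly like the Python sketch below. The list of allowed labels and the function name are assumptions for illustration; the actual set of types is up to the project.

```python
# Assumed set of predefined incipit labels (illustrative, not the final list).
ALLOWED_LABELS = ("main", "original", "normalized")


def validate_label(label: str) -> str:
    """Return the label unchanged if it is predefined, otherwise raise ValueError."""
    if label not in ALLOWED_LABELS:
        allowed = ", ".join(ALLOWED_LABELS)
        raise ValueError(f"unknown incipit label {label!r}; allowed: {allowed}")
    return label
```

A drop-down list in the editor would enforce the same constraint on the client side; a check like this one would guard against values that arrive by other routes, such as imports.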

kkrebs commented 4 years ago

In production we have 2369 -Tags, of which only 234 have a label attribute. The values in use are "main", "original", and "standardized".

I will add a drop-down box for this and change "standardized" to "normalized".
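A rough Python sketch of the two steps mentioned here, counting the label values in use and renaming "standardized" to "normalized", might look like this. It assumes the labels sit on `incipText` elements as in the snippet earlier in the thread; the sample document and its placeholder texts are invented, and this is not the migration that was actually run.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample: one labeled "main", one "standardized", one without a label.
SAMPLE = """
<incip>
  <incipText label="main"><p>placeholder A</p></incipText>
  <incipText label="standardized"><p>placeholder B</p></incipText>
  <incipText><p>placeholder C</p></incipText>
</incip>
"""


def count_labels(root: ET.Element) -> Counter:
    """Count label values on incipText elements; a missing label is counted as None."""
    return Counter(node.get("label") for node in root.iter("incipText"))


def rename_label(root: ET.Element, old: str = "standardized", new: str = "normalized") -> int:
    """Rename one label value in place and return the number of changed elements."""
    changed = 0
    for node in root.iter("incipText"):
        if node.get("label") == old:
            node.set("label", new)
            changed += 1
    return changed


root = ET.fromstring(SAMPLE.strip())
before = count_labels(root)
changed = rename_label(root)
after = count_labels(root)
```

Comparing `before` and `after` gives a quick check that only the intended label value was touched and that unlabeled elements were left alone.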