dracor-org / engdracor

English Drama Corpus

Improving genre tag in <textClass> #55

Closed lucagiovannini7 closed 7 months ago

lucagiovannini7 commented 7 months ago

Plays where the textual genre is indicated follow this simpler markup:

 <textClass>
  <classCode scheme="http://www.wikidata.org/entity/">Q80930</classCode>
 </textClass>

instead of the DraCor standard:

<textClass>
  <keywords>
    <term type="genreTitle">Comedy</term>
  </keywords>
  <classCode scheme="http://www.wikidata.org/entity/">Q40831</classCode>
</textClass>

Should we reintroduce the keywords element?

cmil commented 7 months ago

This issue more or less duplicates #13. To paraphrase: the XSL transformation already creates the textClass/classCode markup. It omits the keywords element because the API does not currently use it and it is largely redundant. The remaining question is whether we want to map the subgenres occurring in the EarlyPrint sources to the text classes currently recognised and supported by DraCor (the ones defined in https://github.com/dracor-org/dracor-api/blob/0fe102e735708e4f3cfe3e736d54b23a3a1a9514/modules/config.xqm#L77).

See also https://github.com/dracor-org/dracor-schema/issues/52.

lucagiovannini7 commented 7 months ago

Thanks! This also answers a point we were discussing today with @peertrilcke: the <normalizedGenre> in the metadata table is actually generated from the Wikidata items via mapping.
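
For illustration only, here is a minimal Python sketch of how a genre label could be derived from the classCode alone, which is why the keywords element looks redundant. The lookup table and the function name `normalized_genre` are hypothetical; the only pairing taken from this thread is Q40831 → "Comedy", and the actual mapping lives in the dracor-api configuration linked above. Real DraCor files also use the TEI namespace, which this sketch leaves out to match the snippets as posted.

```python
import xml.etree.ElementTree as ET

WIKIDATA_SCHEME = "http://www.wikidata.org/entity/"

# Hypothetical lookup table: only Q40831 -> "Comedy" comes from this thread.
GENRE_BY_WIKIDATA_ID = {"Q40831": "Comedy"}


def normalized_genre(text_class_xml):
    """Return a genre label for a <textClass> snippet, or None if unmapped."""
    root = ET.fromstring(text_class_xml)
    for class_code in root.iter("classCode"):
        # Only consider class codes that point at Wikidata entities.
        if class_code.get("scheme") != WIKIDATA_SCHEME:
            continue
        wikidata_id = (class_code.text or "").strip()
        if wikidata_id in GENRE_BY_WIKIDATA_ID:
            return GENRE_BY_WIKIDATA_ID[wikidata_id]
    return None


# The "simpler markup" form: no <keywords>, just the Wikidata class code.
SIMPLE_MARKUP = """<textClass>
  <classCode scheme="http://www.wikidata.org/entity/">Q40831</classCode>
</textClass>"""

print(normalized_genre(SIMPLE_MARKUP))
```

An identifier missing from the table yields `None` here; whether unmapped subgenres should fall back to a broader class is exactly the open question in #13.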

cmil commented 7 months ago

Let's discuss remaining questions in #13.