dracor-org / engdracor

English Drama Corpus

Refine mapping of text classes #13

Open cmil opened 2 years ago

cmil commented 2 years ago

The original EarlyPrint documents assign the following values in their ep:subgenre elements:

In https://github.com/dracor-org/epdracor/pull/12/commits/97fa5ed5d9ad9d631c94d01f3aba7756f3093cf1 the mapping of "comedy", "tragedy" and "tragicomedy" to their respective Wikidata IDs has been implemented. We should decide if and how to map the remaining subgenres.

cmil commented 7 months ago

@lucagiovannini7, @peertrilcke The DraCor API currently supports the text classes "Comedy", "Tragedy", "Tragicomedy", "Satyr play" and "Libretto" via their respective Wikidata IDs (see https://github.com/dracor-org/dracor-api/blob/0fe102e735708e4f3cfe3e736d54b23a3a1a9514/modules/config.xqm#L77). Should we

  1. just ignore the EarlyPrint ep:subgenres that do not exactly match those text classes,
  2. somehow map at least some of the ep:subgenres to our text classes, or
  3. add more text classes to the DraCor API to support more ep:subgenres?

This may also be relevant for https://github.com/dracor-org/dracor-schema/issues/52.

peertrilcke commented 7 months ago

I would prefer a combination of 1 and 2: we ignore the corpus-specific EP subgenres (1). In addition, @lucagiovannini7 could gradually and carefully map the obvious EP subgenres to the subgenres DraCor has used so far (2). I don't yet see a systematic approach to (3). For that, we should define an extended controlled vocabulary in the longer term. In my opinion, however, this is a larger and separate task.
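
A minimal sketch of how the combination of (1) and (2) might look in a conversion script, assuming the mapping is kept as a small hand-curated table. The function name `map_subgenre`, the table `CURATED_ALIASES` and its single entry are hypothetical placeholders, not actual EarlyPrint values or existing DraCor code. Wikidata IDs are left out on purpose, and the lookup ignores case and surrounding whitespace, which is also an assumption.

```python
from typing import Optional

# Text classes the DraCor API currently supports, as listed in the comment above.
SUPPORTED_TEXT_CLASSES = ("Comedy", "Tragedy", "Tragicomedy", "Satyr play", "Libretto")

# Option 2: hand-curated aliases for "obvious" EP subgenres.
# Hypothetical placeholder entry; real entries would be added one by one after review.
CURATED_ALIASES = {
    "tragi-comedy": "Tragicomedy",
}


def map_subgenre(value: str) -> Optional[str]:
    """Return the DraCor text class for an ep:subgenre value, or None to ignore it."""
    # Normalize whitespace and case before any lookup (assumption, see lead-in).
    key = " ".join(value.split()).lower()

    # Direct match against the supported text classes.
    for text_class in SUPPORTED_TEXT_CLASSES:
        if text_class.lower() == key:
            return text_class

    # Option 2: curated mapping; option 1: anything else is ignored (None).
    return CURATED_ALIASES.get(key)
```

With this shape, `map_subgenre("comedy")` yields `"Comedy"`, and any value without a direct match or curated alias yields `None`, so corpus-specific subgenres are dropped without further handling.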