Open ingoboerner opened 1 year ago
Here is an example of MIXED: https://dracor.org/api/corpora/swe/play/strindberg-till-damaskus
For DIVERSE, see CalDraCor, e.g. { "id": "musicos", "name": "MÚSICOS", "isGroup": true, "sex": "DIVERSE" }
Thanks for collecting these variants! I think the examples above are well covered by "UNKNOWN" as we currently use the term with the elements person and personGrp. This assumes that UNKNOWN can mean anything from MIXED and DIVERSE to actually unknown. That is far from perfect, but it acknowledges that we cannot really annotate MIXED or DIVERSE without a clear understanding of what these values would mean across all of our thousands of plays since antiquity. It is somewhat similar to the imperfect annotation of character relations, where e.g. "associated_with" covers such a wide range of things that it is nearly unusable for interesting queries. But it's a start, and we can always qualify our data further at a later point.
Also, version 4.5.0 of the TEI Guidelines introduces a distinction between the sex and gender attributes. In light of this, we need a clear annotation strategy for this kind of data before we make any adjustments across our corpora. Until then, I would propose falling back to UNKNOWN in cases like the ones you described.
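A minimal sketch of that fallback, for illustration only: the function name `fallback_sex` is hypothetical and not part of any DraCor tooling, and the allowed set is taken from the values named in this thread. It maps anything outside the allowed set to UNKNOWN and makes no attempt to correct misspellings.

```python
# Values the schema currently allows, as listed in this thread.
ALLOWED = {"MALE", "FEMALE", "UNKNOWN"}


def fallback_sex(value):
    """Return the value if the schema allows it, otherwise UNKNOWN.

    Hypothetical helper: MIXED, DIVERSE and any other value outside
    the allowed set collapse to UNKNOWN.
    """
    normalized = value.strip().upper()
    return normalized if normalized in ALLOWED else "UNKNOWN"


if __name__ == "__main__":
    for raw in ("MALE", "MIXED", "DIVERSE"):
        print(raw, "->", fallback_sex(raw))
```

Note that this is lossy: once MIXED or DIVERSE has been collapsed to UNKNOWN, the original distinction cannot be recovered from the data.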
Currently, the schema allows: MALE, FEMALE, UNKNOWN (we have some spelling variations here: *UNKOWN, *UNKWON; for MALE: *MAE).
In some corpora (Cal but also Swe) there are other values, e.g. DIVERSE and MIXED, which, unlike the spelling errors, might actually be useful. Shall we extend the allowed values?
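To get an overview before deciding, the non-schema values could be collected per file and separated into probable misspellings and genuinely different values. The following is a rough sketch, assuming the value sits in a `sex` attribute on `person` and `personGrp` elements in the TEI namespace. The function names, the similarity cutoff and the sample document are made up for illustration.

```python
import difflib
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
ALLOWED = ("MALE", "FEMALE", "UNKNOWN")


def suggest(value):
    """Return the allowed value that `value` most resembles, or None.

    The 0.6 cutoff is an arbitrary assumption, not a tested threshold.
    """
    if value in ALLOWED:
        return value
    matches = difflib.get_close_matches(value, ALLOWED, n=1, cutoff=0.6)
    return matches[0] if matches else None


def audit_sex_values(tei_xml):
    """Count sex values on person/personGrp that are not in ALLOWED."""
    root = ET.fromstring(tei_xml)
    counts = Counter()
    for tag in ("person", "personGrp"):
        for element in root.iter(TEI_NS + tag):
            value = element.get("sex")
            if value is not None and value not in ALLOWED:
                counts[value] += 1
    return counts


# Hypothetical sample document, not taken from any corpus.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><listPerson>
<person xml:id="a" sex="MAE"/>
<person xml:id="b" sex="FEMALE"/>
<personGrp xml:id="musicos" sex="DIVERSE"/>
</listPerson></TEI>"""

if __name__ == "__main__":
    for value, count in audit_sex_values(SAMPLE).items():
        hint = suggest(value)
        label = f"probably {hint}" if hint else "no close match, review by hand"
        print(f"{value} ({count}x): {label}")
```

Values with a close match are candidates for a simple correction. Values without one, such as MIXED or DIVERSE, are the ones that would need a decision about extending the schema.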