SEMICeu / Core-Person-Vocabulary

This is the issue tracker for the maintenance of Core Person Vocabulary

`CPV1.00gender` is nomina periculosa; not aligned with opinion on gender by World Health Organization et al.; de facto implementer's usage contains xenonyms; untranslatable #38

Closed fititnt closed 1 year ago

fititnt commented 2 years ago

As per the Wiki documentation, this proposal is divided into: I - submitter name and affiliation; II - portal, service or software product represented or affected; III - clear and concise description of the problem or requirement; IV - proposed solution, if any.

I - submitter name and affiliation

Emerson Rocha, from @EticaAI.

Some context: as part of @HXL-CPLP, the submitter is working both on community translation initiatives for humanitarian use and (because of the lack of usable data standards) on specialized public domain tooling to manage and exchange multilingual terminology (so that different lexicographers can compile results without centralization).

II - portal, service or software product represented or affected

The list of portals, services and products would be too extensive to mention. The submitter will instead mention groups of users (in no particular order of priority):

III - clear and concise description of the problem or requirement

Comments on nomenclature and symbols used on this request:

The main context of the issue

Before going to the proposals, some context is given. Sorry for being long, but it is hard to give an overview. Some parts may be relevant to counter-arguments or later to decide head terms or review definitions.

C.1 - The CPV1.00gender and CPV2.00gender(preview)

The current (close to being approved) definition of Gender in CPV2.00, which is the root of the main discussion, is this:

The gender of an individual should be recorded using a controlled vocabulary that is appropriate for the specific context. In some cases the chromosomal or physical state of an individual will be more important than the gender that they express, in others the reverse will be true. What is always important is that the controlled vocabulary used to describe an individual's gender is stated explicitly.
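A minimal Python sketch of the last sentence of that definition: a value that always travels together with the controlled vocabulary it was taken from. The class name `CodedValue`, the helper `same_meaning` and the `example.org` scheme identifiers are hypothetical and not part of CPV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedValue:
    """A code together with the controlled vocabulary it belongs to."""
    code: str
    scheme: str  # identifier of the controlled vocabulary (hypothetical URIs below)

    def __post_init__(self):
        # Refuse values whose vocabulary is not stated explicitly.
        if not self.scheme.strip():
            raise ValueError("the controlled vocabulary must be stated explicitly")


def same_meaning(a: CodedValue, b: CodedValue) -> bool:
    """Two values are comparable only when both code and vocabulary match."""
    return a == b


# The same code "F" taken from two different (hypothetical) vocabularies
# is not the same statement about a person.
clinical = CodedValue("F", "https://example.org/vocab/sex-assigned-at-birth")
social = CodedValue("F", "https://example.org/vocab/gender-identity")
print(same_meaning(clinical, social))  # False
```

Under these assumptions, a consumer that receives only `"F"` with no scheme cannot tell which of the two statements was meant, which is the situation the rest of this issue describes.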



C.2 - General comments on terms, origins and non-intuitive differences between languages

Note from submitter: the draft of this proposal had opinionated comments on the origin of the terms, including attempts at what a contemporary Latin version would be, since genus in Latin would be more focused on contemporary racial aspects. Usage of genus in Latin for grammatical gender does in fact exist since the 6th century, but then genus would be relevant for something like /third person preferred pronouns/, which is different from a person's view of their own sexuality. The full argument was removed from here to keep this proposal short.

C.2.1 Etymology of the words

Gender (English language)

"From Middle English gendre, gender (see also gendres), from Middle French gendre, genre, from Latin genus (“kind, sort”). Doublet of genre, genus and kin. The verb developed after the noun." -- https://en.wiktionary.org/wiki/gender#English

Genus (Latin language)

"From Proto-Italic genos, from Proto-Indo-European ǵénh₁os (“race”), from Proto-Indo-European ǵenh₁- (“to produce, beget”); compare also gēns, from the same root. Cognates include Ancient Greek γένος (génos, “race, stock, kin, kind”), Sanskrit जनस् (jánas, “race, class of beings”), Proto-Celtic genos (“birth; family”), and English kin." -- https://en.wiktionary.org/wiki/genus#Latin

genus n (genitive generis); third declension

  • birth, origin, lineage, descent
  • kind, type, class
  • species (of animal or plant), race (of people)
  • set, group (with common attributes)
  • (grammar) gender
  • (grammar) subtype of word

C.2.2 Trivia on non word-per-word equivalence between translations

Notes from submitter:

  • At least two European Union official languages have a single generic term covering both "gender identity" and "biological sex". I'm citing this because:
    1. This is an example where head terms and definitions of concepts must be very, very well crafted, even before considering translations into languages outside the EU. "Sex" and "gender" alone, similar to what could happen in Portuguese with "sexo" and "gênero", could in theory be self-sufficient (except that "gênero" in Portuguese means grammatical gender), but to make extra sure no mistakes are made, "biological sex" and "gender identity" would be more precise candidates.
    2. Mother languages can affect thinking when deciding terminology in another language. Knowledge of multiple languages is obviously positive, but it is important to construct terms that, if they fail, at least do not generate confusion with something closely related.

Wikipedia, Sex and gender distinction

Link: https://en.wikipedia.org/wiki/Sex_and_gender_distinction#Distinction_in_linguistics

Since the social sciences now distinguish between biologically defined sex and socially constructed gender, the term gender is now also sometimes used by linguists to refer to social gender as well as grammatical gender. Some languages, such as German or Finnish, have no separate words for sex and gender. German, for example, uses "Biologisches Geschlecht" for biological sex, and "Soziales Geschlecht" for gender when making this distinction.[85] Traditionally, however, a distinction has been made by linguists between sex and gender, where sex refers primarily to the attributes of real-world entities – the relevant extralinguistic attributes being, for instance, male, female, non-personal, and indeterminate sex – and grammatical gender refers to a category, such as masculine, feminine, and neuter (frequently based on sex, but not exclusively so in all languages), that determines the agreement between nouns of different genders and associated words, such as articles and adjectives.[86]
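A small Python sketch of why unqualified head terms are risky across languages. It assumes, for illustration only, that a vocabulary stores one label per concept per language, and it reports the languages in which two different concepts end up sharing a label. The function name `colliding_languages` and the two dictionaries are hypothetical; the German labels are the ones quoted above.

```python
def colliding_languages(labels):
    """Return the languages where two different concepts share the same label."""
    collisions = []
    for lang, by_concept in labels.items():
        distinct = {label.casefold() for label in by_concept.values()}
        if len(distinct) < len(by_concept):
            collisions.append(lang)
    return sorted(collisions)


# Unqualified head terms: German has no separate words for the two concepts.
unqualified = {
    "en": {"sex": "sex", "gender": "gender"},
    "de": {"sex": "Geschlecht", "gender": "Geschlecht"},
}

# Qualified head terms, as German uses when the distinction is made.
qualified = {
    "en": {"sex": "biological sex", "gender": "social gender"},
    "de": {"sex": "Biologisches Geschlecht", "gender": "Soziales Geschlecht"},
}

print(colliding_languages(unqualified))  # ['de']
print(colliding_languages(qualified))    # []
```

A check like this only catches identical labels; it says nothing about labels that differ but are still understood as the same thing by speakers.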


C.3 World Health Organization on differentiation between /gender identity/ and /biological sex/

Source: https://www.who.int/news-room/questions-and-answers/item/gender-and-health (archive 1: https://archive.md/lEjPI; archive 2: http://web.archive.org/web/20211130061422/https://www.who.int/news-room/questions-and-answers/item/gender-and-health)

(...)

What is gender?

Gender refers to socially constructed characteristics of women and men – such as norms, roles and relations of and between groups of women and men[1]. Gender norms, roles and relations vary from society to society and evolve over time. They are often upheld and reproduced in the values, legislation, education systems, religion, media and other institutions of the society in which they exist. When individuals or groups do not “fit” established gender norms they often face stigma, discriminatory practices or social exclusion – all of which adversely affect health. Gender is also hierarchical and often reflects unequal relations of power, producing inequalities that intersect with other social and economic inequalities.

[1] World Health Organization. (2011). Gender mainstreaming for health managers: a practical approach. Geneva : World Health Organisation. (link: https://www.who.int/publications/i/item/9789241501057)

What is the difference between gender and sex?

Gender interacts with but is different from sex. The two terms are distinct and should not be used interchangeably. It can be helpful to think of sex as a biological characteristic and gender as a social construct. Sex refers to a set of biological attributes in humans and animals. Sex is mainly associated with physical and physiological features including chromosomes, gene expression, hormone level and function, and reproductive and sexual anatomy.

Sex is often categorized as females and males, but there are variations of sex characteristics called intersex. The term ‘intersex’ is used as an umbrella term for individuals born with natural variations in biological or physiological characteristics (including sexual anatomy, reproductive organs and/or chromosomal patterns) that do not fit traditional definitions of male or female[1]. Infants are generally assigned the sex of male or female at birth based on the appearance of their external anatomy/genitalia.

[1] UN High Commissioner for Refugees. (2021). Need to Know Guidance: Working with Lesbian, Gay, Bisexual, Transgender, Intersex and Queer Persons in Forced Displacement. (link: https://www.refworld.org/docid/4e6073972.html)

What is the difference between gender, sex, gender identity, gender expression and sexual orientation?

Gender identity refers to a person’s innate, deeply felt internal and individual experience of gender, which may or may not correspond to the person’s physiology or designated sex at birth.

Gender expression refers to how an individual expresses their gender identity, including dress and speech[1]. Gender expression is not always indicative of gender identity. ‘Transgender’ is an umbrella term for people whose gender identity and expression does not conform to the norms and expectations traditionally associated with the sex assigned to them at birth; it includes people who are transsexual, transgender or otherwise gender non-conforming[2].

Sexual orientation refers to a person’s physical, romantic and/or emotional attraction (or lack thereof) towards other people[3]. It encompasses hetero-, homo- and bisexuality and a wide range of other expressions of sexual orientation[4]. Sexual orientation cannot be assumed from one’s assigned sex at birth, gender identity or gender expression.

[1] World Health Organization. (2016). Frequently asked questions on health and sexual diversity: an introduction to key concepts. World Health Organization. (Link: https://www.who.int/publications-detail-redirect/WHO-FWC-GER-16.2)

[2] World Health Organization. (2016). Consolidated guidelines on HIV prevention, diagnosis, treatment and care for key populations, 2016 update. World Health Organization. (Link: https://www.who.int/publications-detail-redirect/9789241511124)

[3] World Health Organization. (2016). Frequently asked questions on health and sexual diversity: an introduction to key concepts. World Health Organization. (Link: https://www.who.int/publications-detail-redirect/WHO-FWC-GER-16.2)

[4] UN High Commissioner for Refugees. (2021). Need to Know Guidance: Working with Lesbian, Gay, Bisexual, Transgender, Intersex and Queer Persons in Forced Displacement. (Link: https://www.refworld.org/docid/4e6073972.html)


C.4 Gender wiki on differentiation between /gender identity/ and /biological sex/

Source: https://gender.fandom.com/wiki/Differences_Between_Gender_and_Sex

Differences Between Gender and Sex

The difference between Gender and Sex is one of the founding principles of modern understandings of gender. This principle states that gender identity - the internal psychological experience of gender - is separate from, and not necessary aligned with, the physical sex characteristics of your body. This can be difficult to understand for those whose gender identity matches their assigned sex (cisgender) but can result in discomfort and dysphoria for those whose internal experience of gender does not match how they are viewed by others and society (transgender).

According to the World Health Organisation, gender is considered to be a social construct which reflects the gender roles of a particular society. Because of this, gender can vary between cultures and over time. In Western culture, gender is seen as a binary choice between male and female, although there is increasing awareness of transgender and non-binary identities. Other cultures may already have three or more genders, such as recognition of Two-Spirit people in many Native American societies, or Hijra in India.


C.5 Office for National Statistics (ONS), UK (this country uses English as official language) on differentiation between /gender identity/ and /biological sex/

https://www.ons.gov.uk/economy/environmentalaccounts/articles/whatisthedifferencebetweensexandgender/2019-02-21 (archive: https://archive.md/RXA2S)

"Sex and gender are terms that are often used interchangeably but they are in fact two different concepts, even though for many people their sex and gender are the same. This article will clarify the differences between sex and gender and why these differences are important to understand, especially in research and data collection. How and why sex and gender is important for SDGs and the principle of “leave no one behind” will be considered. It includes the UK government position on these concepts. ONS has done a lot of research and participated in discussions to understand these terms."

2. Definitions and differences

The UK government defines sex as:

  • referring to the biological aspects of an individual as determined by their anatomy, which is produced by their chromosomes, hormones and their interactions
  • generally male or female
  • something that is assigned at birth

The UK government defines gender as:

  • a social construction relating to behaviours and attributes based on labels of masculinity and femininity; gender identity is a personal, internal perception of oneself and so the gender category someone identifies with may not match the sex they were assigned at birth
  • where an individual may see themselves as a man, a woman, as having no gender, or as having a non-binary gender – where people identify as somewhere on a spectrum between man and woman

The World Health Organisation regional office for Europe describes sex as characteristics that are biologically defined, whereas gender is based on socially constructed features. They recognise that there are variations in how people experience gender based upon self-perception and expression, and how they behave.

C.6 Other EU countries on differentiation between /gender identity/ and /biological sex/

Notes from submitter:

  • Mostly to avoid making this post even longer, the submitter did not compile a full per-country list, but it is very likely that at least the central government organizations related to healthcare distinguish between /gender identity/ and /biological sex/.
  • Actually, except for Core-Person-Vocabulary, it is very hard to find any place where mixing both concepts is allowed.

C.7 General examples of what happens when there is confusion over how /biological sex/ is handled

Note from submitter: CPV1.00 still mixes /biological sex/ and /gender identity/, so it is one step behind.

C.7.1 HL7_GENDER_R1_I1_2021JAN / Sex and Gender Reporting in Payment for Care

Source: https://confluence.hl7.org/download/attachments/91996069/HL7_GENDER_R1_I1_2021JAN.pdf?version=1&modificationDate=1615475825250&api=v2 (no archive link; copyrighted material; the quoted material is presented here by the submitter not just as fair use but as humanitarian use, as this knowledge can help protect lives).

Some EHR systems have already begun to suggest tests or workflows based on sex or gender data which is often inaccurate in describing the needs of Transgender, gender-diverse, and intersex persons. For instance, a patient may need to switch their insurance “sex” for a procedure to avoid denial of coverage or to even be offered a procedure or test in the first place. Pharmacies may also have to administratively change “sex” for approvals for particular medications and then switch the “sex” back to avoid denial of coverage (per NCPDP page 11). In addition, providers may have to address dozens of automatically flagged lab results which are irrelevant to the patient but are nonetheless required due to compliance regulations (63).

Switching “sex” fields back and forth may trigger hundreds of new results or diagnostic warnings or messages, adding to the already significant issue of alert fatigue among medical providers. Further, clinicians may miss proper risk assessments based on whether the “correct” sex field is provided. For instance, a Trans woman who is marked as “male” may miss crucial breast cancer screenings, but a Trans woman who is marked as “female” may miss prostate cancer screenings. Only by including contextual data about gender identity, sex assigned at birth, organ inventories, hormone levels, and chromosomal makeup can these issues be sufficiently avoided.
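The scenario in this quote can be sketched in Python. This is an illustration under stated assumptions, not clinical guidance and not an HL7 data model: the class `PatientContext`, the two functions and the organ-to-screening table are all hypothetical, and only the two screenings named in the quote are modelled.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical organ -> screening table, for illustration only.
SCREENING_BY_ORGAN = {
    "breast": "breast cancer screening",
    "prostate": "prostate cancer screening",
}


@dataclass
class PatientContext:
    """Contextual data kept as separate fields instead of one 'sex' flag."""
    gender_identity: Optional[str] = None
    sex_assigned_at_birth: Optional[str] = None
    organ_inventory: Set[str] = field(default_factory=set)


def screenings_from_single_sex_field(sex: str) -> Set[str]:
    """What a system keyed on a single 'sex' field would suggest."""
    if sex == "female":
        return {"breast cancer screening"}
    return {"prostate cancer screening"}


def screenings_from_context(patient: PatientContext) -> Set[str]:
    """Suggest screenings from the organ inventory, ignoring any single sex flag."""
    return {
        SCREENING_BY_ORGAN[organ]
        for organ in patient.organ_inventory
        if organ in SCREENING_BY_ORGAN
    }


patient = PatientContext(
    gender_identity="woman",
    sex_assigned_at_birth="male",
    organ_inventory={"breast", "prostate"},
)

# Either value of the single field misses one of the two screenings.
print(sorted(screenings_from_single_sex_field("male")))
print(sorted(screenings_from_single_sex_field("female")))
# The contextual record yields both.
print(sorted(screenings_from_context(patient)))
```

The point of the sketch is only structural: once the concepts are separate fields, no one has to switch a "sex" value back and forth to get the right result.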


C.7.2 HAN KOEHLE / Trans Inclusive Electronic Health Records Model Policy Brief

Note from submitter: this is one of the documents (which actually has an open license, and that's important) that mention the relevance of /organ inventory/. The submitter also agrees that /organ inventory/ is relevant not only for non-binary people but for the general population: operations that remove organs are more common among binary people. Also note that even if CPV does not include /organ inventory/ (which is fair, as CPV is a Core Vocabulary), the usage notes could mention it, since awareness that it can be used can shape whether and how /biological sex/ is used (or a more specific variant: [Sex assigned at birth] plus /organ inventory/, both recommended only for strictly medical purposes).

Source: https://hankoehleinfo.files.wordpress.com/2020/08/trans-inclusive-electronic-health-records.pdf

Note from submitter: No summary here. Please read the source if you want more details. It is very, very detailed.


C.7.3 Examples of how Canada encode /biological sex/ and /gender identity/

An Environmental Scan of Sex and Gender in Electronic Health Records: Analysis of Public Information Sources

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7688387/

  • This paper discusses the field names and field values used when encoding /biological sex/ and /gender identity/ in Canada.
  • For variants of /biological sex/, exact terms used in this reference:
    • [Sex]
    • [Administrative sex]
    • [Patient’s sex]
    • [Sex of patients]
    • [Patient sex (at birth)]
    • [US-core-birthsex]
    • [Sex assigned at birth]
    • [Sex for clinical use]
    • [Biological sex]
    • [Anatomic sex]
    • [Genotypic sex]
  • For variants of /gender identity/, some exact terms used in this reference:
    • [Gender]
    • [Gender identity]
    • [Administrative gender]
    • [Person stated gender]
    • [Code recorded gender or sex identity (previously legal gender)]
    • [Legal gender]
    • [Gender expression]
    • [Affirmed gender]
    • [Assigned gender]
    • [Preferred pronoun] ( different item )

Note from submitter: Pertinent to this proposal: the article has a detailed discussion of poorly defined concepts for data entry fields, which are already problematic even inside a single country. But since this proposal does not ask for coding values for /biological sex/ and /gender identity/, this reference serves to show that merely starting to separate the two is already a major step for CPV.
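As a rough illustration of what "starting to separate" could look like for an implementer, the Python sketch below sorts the field labels listed above into the two concepts. The grouping follows the lists in this section; the function name `concept_for` and the decision to return `None` for unknown labels are assumptions made for the example.

```python
SEX_LABELS = [
    "Sex", "Administrative sex", "Patient’s sex", "Sex of patients",
    "Patient sex (at birth)", "US-core-birthsex", "Sex assigned at birth",
    "Sex for clinical use", "Biological sex", "Anatomic sex", "Genotypic sex",
]
GENDER_LABELS = [
    "Gender", "Gender identity", "Administrative gender", "Person stated gender",
    "Code recorded gender or sex identity (previously legal gender)",
    "Legal gender", "Gender expression", "Affirmed gender", "Assigned gender",
]
OTHER_LABELS = ["Preferred pronoun"]  # listed above as a different item

# Case-insensitive lookup table from field label to concept.
CONCEPT_BY_LABEL = {}
for concept, labels in (
    ("/biological sex/", SEX_LABELS),
    ("/gender identity/", GENDER_LABELS),
    ("other", OTHER_LABELS),
):
    for label in labels:
        CONCEPT_BY_LABEL[label.casefold()] = concept


def concept_for(field_label):
    """Return the concept a field label belongs to, or None if it is unknown."""
    return CONCEPT_BY_LABEL.get(field_label.strip().casefold())


print(concept_for("Administrative sex"))  # /biological sex/
print(concept_for(" legal GENDER "))      # /gender identity/
print(concept_for("Sex/Gender"))          # None
```

An unknown label returns `None` rather than a guess, so a mixed field such as "Sex/Gender" has to be reviewed by a person instead of being silently assigned to one concept.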


C.8 Examples of CPV1.00gender used as /biological sex/: xenonyms for /gender identity/

Note from submitter:

  • The submitter of this general proposal cannot affirm that CPV1.00 alone caused implementers and creators of coding systems to use "gender" to define /biological sex/. But there is a strong tendency in the European Union (but not in the United Kingdom, which is strict on nomenclature) to use "gender" in English to define codes for data exchange that are actually /biological sex/.

C.8.1 HL7 compilation of several code systems

For example this table by HL7 https://confluence.hl7.org/display/VOC/Gender+Coding+with+International+Data+Exchange+Standards (https://archive.ph/VQR42)


Note from submitter: For a user from Western culture who identifies with one (or more) of the Gender Identities such as those at https://gender.fandom.com/wiki/Category:Gender_Identities, an interface that asks for an input named /gender identity/ and offers values such as "Female pseudohermaphrodite" among the options is not what users expect.


C.8.2 CORE VOCABULARIES SPECIFICATION 1.00 / 3.1.6 gender (vague terms, but still xenonyms)


Note from submitter:

  • The reasoning why "female"/"male" causes xenonyms, at least in some languages, will be explained in the proposals.
  • The tautological argument is the following: since /gender identity/ and /biological sex/ are different concepts, even the common industry practice of "reusing labels" is irrational.
    • Even if this distinction is not evident in English, other natural languages can differentiate labels by gender and by sex.

C.9 General comments on Translations issues (important as CPV is multilingual, with final result equally equivalent)

C.9.1 1980, Evaluation of the Translation Process in the United Nations System (50 pages)

Note from submitter: this document from the 80's is cited mostly to show how bad the situation was before automation. It also shows why UN translations today have a much stricter process before translation starts: already in the 80's it was clear that review of the source material was a major factor for translators. Without it, translators would have to make very serious decisions that could have a serious impact.

Source: https://www.unjiu.org/sites/www.unjiu.org/files/jiu_document_files/products/en/reports-notes/JIU%20Products/JIU_REP_1980_7_English.pdf

  • Because they feel that they are viewed (when they are noticed) as non-creative appendages performing a costly but mechanical report processing function, they can come to view their work as a high-pressure but rather thankless and tedious task.

C.9.2 2008: Translation at the United Nations as Specialized Translation (16 pages)

Note from submitter: Another reference, in addition to European translators, is United Nations translation (which has stricter translation processes). The reason for citing this reference is the following: in the UN, ensuring quality translations is a multi-step process, where the translations themselves are only done after strict quality control of the source documents. In the context of CPV, this means that Core Vocabularies should work as the equivalent of Terminology, so their usage should also be strict, to the point of making translations easier and less error prone.

Source: https://www.jostrans.org/issue09/art_cao.pdf

  • (1) Documentation programming and monitoring (...)
  • (2) Documents control (...)
  • (3) Editorial control (...)
  • (4) Reference and Terminology (...)
  • (5) Translation (...)
  • (6) Text processing and typographic style (...)
  • (7) Official Records (...)
  • (8) Copy preparation and proof-reading (...)
  • (9) Publishing (...)

C.9.3 2017, Interaction of law and language in the EU: Challenges of translating in multilingual environment (17 pages)

Note from submitter: in the context of Core-Person-Vocabulary, this is the document that is worth reading in full to get an overview of translation issues inside the EU (it is the most specific; the other references are mostly there to back up the claims).

Source: https://www.jostrans.org/issue27/art_cavoski.pdf

Selected quotes

  • However, this new English language used in the EU context has nothing to do with the language of Shakespeare (GRASPE: 2003: 9). It is a novel version of the language, often called ‘EU English’ that is different from the English spoken in the UK or Ireland (Robertson 2012: 1234).
  • Despite the fact that legal texts in each of these languages are equally authentic (Article 55(1) TEU), there are differences between legal texts which can already be seen from Regulation No. 1, which regulates the use of official languages (Council Regulation No 1). If we look closely at Article 7 of this Regulation, we can identify difficulties in achieving equivalence between languages.
  • 3.1 Lack of precision and clarity.
    • The choice of English as a source language may prove challenging for several important reasons. EU legal texts in English very often contain imprecise terms, which is not something one would associate with traditional UK legal language. The importance of precision and clarity of provisions in the English common law system is greatly cherished both among academics and practitioners.
    • The impreciseness in EU legal texts in English often comes as a result of legal drafting by non-native English speakers who are not very familiar with the common law system and key legal concepts.
    • The inconsistency in using certain terminology is not unusual in EU legal texts in the English language. (...) As Šarčević (2010: 31) argues, terminological inconsistency on the part of the EU legislator results in multiple references causing incoherence, leading to legal uncertainty and inevitable linguistic diversity in the translations. (...) . Despite the fact that these terms occasionally tend to be used as a synonym in academic debates (Black’s Law Dictionary 2004: 1110), it is legal custom to always use the same term to denote an identical concept in a legally binding text. (...) This rule is strictly followed the legal text in German which consistently uses the expression criminal offences (Straftaten ‘criminal offences’).

C.10 Using scientific nomenclature as inspiration for "internal guidelines" to manage terms on Core Vocabularies

Notes from submitter:

  • The closest existing guidelines for something similar to what the EU Core Vocabularies are (already considering that "languages are equally authentic (Article 55(1) TEU)", even if the Core Vocabularies are not intended for worldwide usage as these examples are) are... scientific nomenclature.
  • Probably the strictest of them (with rules such as not allowing body parts to be named after a person, since this leads countries to create their own standards) are the ones related to human anatomy. But these do not have documented internal guidelines.
  • The example here is the one used to reject names. (An analogy for "rejecting definitions" actually does exist, but the submitter will not write about it in this post.)

C.10.1 International Code of Nomenclature of Bacteria: Bacteriological Code, 1990 Revision. / Rejection of Names 56a

Source: https://www.ncbi.nlm.nih.gov/books/NBK8808/#A415

Only the Judicial Commission can place names on the list of rejected names (nomina rejicienda) (see Rule 23a, Note 4, and Appendix 4). A name may be placed on this list for various reasons, including the following.

  1. An ambiguous name (nomen ambiguum), i.e., a name which has been used with different meanings and thus has become a source of error.
    • Example: Aerobacter Beijerinck 1900 (Opinion 46).
  2. A doubtful name (nomen dubium), i.e., a name whose application is uncertain.
    • Example: Leuconostoc citrovorum (Opinion 45).
  3. A name causing confusion (nomen confusum), i.e., a name based upon a mixed culture.
    • Example: Malleomyces Hallier 1870.
  4. A perplexing name (nomen perplexum), a name whose application is known but which causes uncertainty in bacteriology (see Rule 57c).
    • Example: Bacillus limnophilus Bredemann and Stürck in Stürck 1935 (Greek-Greek, marsh loving) and Bacillus limophilus Migula 1900 (Latin-Greek, mud loving); see Index Bergeyana, p. 196.
  5. A perilous name (nomen periculosum), i.e., a name whose application is likely to lead to accidents endangering health or life or both or of serious economic consequences.
    • Example: Yersinia pseudotuberculosis subsp. pestis (Opinion 60) is to be rejected as a nomen periculosum.

Note 1. This application is restricted to a proposed change in the specific epithet of a nomenspecies which is widely recognized as contagious, virulent, or highly toxigenic, for example, to that of a subspecies of a species having a different host range or a degree of contagiousness or virulence. If the Judicial Commission recognizes a high order of risk to health, or of serious economic consequences, an Opinion may be issued that the taxon be maintained as a separate nomenspecies, without prejudice to the recognition or acceptance of its genetic relatedness to another taxon.

Notes from submitter:

  • The Yersinia pseudotuberculosis subsp. pestis case actually does have a warning
    • Case study: "WHO: Warning on a new potential for laboratory-acquired infections as a result of the new nomenclature for the plague bacillus"
    • Source: https://apps.who.int/iris/bitstream/handle/10665/264875/PMC2536109.pdf?sequence=1&isAllowed=y
    • Additional related points:
    • The proposed nomenclature for worldwide usage was created without awareness of how it could interact with other contexts
      • In practice, it is not viable (in particular for the nomenclature of living beings, which are named in huge numbers yearly) to do a full evaluation of impacts. So trying to impose too much restriction on new terms would have a worse impact. Yet there is no reason not to re-evaluate once new evidence shows that something is not just wrong, but harmful.
    • The problem happened in particular with how things are abbreviated (analogy: CPV1.00gender, which abbreviates two concepts using gender as the head term).
    • The human who "abbreviates" a thing (in our case, this could be "values of data input") is not aware that the consumer of the abbreviated thing will interpret it as another concept.
    • It's "nomina periculosa" because the de facto usage that confuses both concepts can affect how the data will be used, with deadly results.

That's it. That was the summarized context. The difference between /gender identity/ and /biological sex/ MUST be stated explicitly because, in case of doubt, professionals in the medical area will prioritize saving lives. Judging from dozens of references on this topic, there is actually no conflict between healthcare professionals and advocates of /gender identity/; the problems arise because implementers are instructed to use the wrong code tables.

Not just //gender//@CPV1.00, but "gender" in the European Union is a lost cause. The term "gender" is so misused that even keeping it as the exact head term of a new dedicated concept would be irresponsible.

IV - proposed solution, if any

TL;DR: of the 8 proposals, only 1, 2, 3 and 4 are the more practical ones. Proposals 5, 6, 7 and 8 are either suggestions to invite interested groups to these points or are subjective (this issue can be closed without them).

Proposal 1: DO NOT release CPV2.00 without dividing CPV1.00gender to adhere to WHO et al.

The first proposal is to release CPV2.00 only with the nomina periculosa CPV1.00gender divided into 2 very explicitly different new concepts, in such a way that CPV2.00 does not induce non-conformance with the World Health Organization opinion on gender and, by extension, with local laws in countries that use CPV and endorse WHO.

The Proposal annex has more explicit suggestions on a strategy for how this division can be done. If necessary, the submitter can try to draft definitions based on WHO glossaries, but even this would need proofreading, preferably with the help of European Union translators. In the meantime, /gender identity/ and /biological sex/ are used here as a more exact way to divide //gender//@CPV1.00, but they are not final terms.

Proposal 2: /biological sex/ dedicated concept is really necessary

The submitter insists that, no matter how proposal 1 is done (the Proposal annex is an opinionated suggestion, not the main proposal), the division of CPV1.00gender needs to include /biological sex/; merely reworking CPV1.00gender into a better defined /gender identity/ does not seem to solve all current issues.

Merely adding /gender identity/ carries a realistic risk that implementers keep using CPV1.00gender.

Additional counter arguments for "a core vocabulary does not need to have a concept of /biological sex/"

Most (if not all) coding vocabularies for data exchange used in Europe for /gender identity/ and /biological sex/ actually encode /biological sex/, and this is not the fault of SEMICeu. Despite CPV1.00 trying to bring up //gender//, as of 2021 there is almost no real improvement in serious coding systems, beyond some ontologies that are not ready for everyday data exchange. The mere absence of /biological sex/ will not explain what these /biological sex/ codes are, and will further allow implementers to use them as if they were /gender identity/.

Also note that, with official translations into 24 languages and without an explicit /biological sex/, by the next review of CPV circa 2032 the head term chosen for /gender identity/ in 2022 can become so misused (a new nomen ambiguum outside CPV) that a new term would need to be created again. This already happened with "gender" in the European Union (including in official documents) in the last decade.

Additional counter arguments for "privacy" to not add /biological sex/

The submitter, who has experience with (and knows part of) the total chaos in which data exchange for human rights lacks field standards and most front line human rights defenders only know their local language, reinforces that sensitive fields in multiple languages are important. Not only this, but in practice even for sensitive data the most portable format to exchange is still Excel (not even CSV), so forget any hope of advanced systems. Even in regions which do have such systems, human rights defenders who work with (just as one example) victim protection (from police, mafia or political persecution) will intentionally never use such centralized systems.

So, assuming good intentions of those who ask about privacy (with a naive thinking "is bad"), a counter argument is that both data and software (even if it is spreadsheet macros) are necessary to allow interoperability of human rights technology. In the European context, quite often human rights defenders (HRDs) become targets when they consistently deviate from official government opinion (less about physical harm, as is common in Latin America, and more about being fired from jobs or being subject to non-renewal of the funding that keeps the organization running). And I'm not talking about low HDI countries, but Nordic countries. People like Peter Benenson were right all this time.

The Core Vocabularies in general have a very, very good baseline (which can be used to improve variants based on something like The Human Trafficking Case Data Standard (HTCDS) https://github.com/UNMigration/HTCDS), so /biological sex/ (along with /gender identity/) is greatly welcomed, since this helps to improve tracking of human rights abuses related to this topic. Without this, the vagueness would make it impossible to know about, for example, attacks against trans persons. Another concept for statistical use would be /ethnic group/ (and I say this and not //race// because Europe has serious issues with its own minority groups, such as the Sámi people and Romani people, and since they are white they are often not even in statistics, despite being treated as subhumans, not only being ignored when reporting to police, but even having housing denied).

Proposal 3: make new usage notes explicitly forbid reuse from the respective different term

Tautologically speaking, the mere act of reusing a coding vocabulary from a different concept is already irrational.

On this subject, the most well respected sources explicitly state that the two terms should not be used interchangeably. The vague ones, which allow reuse of terms (or don't warn about known misuse), don't show where they got such a fact from. This applies both to end users and to standards that use CPV as a reference for their systems:

Worst case scenario of reusing tables (but with improved labels)

For tables which reuse the logic of /biological sex/ from 4-option styles like ISO/IEC 5218 "sex" (0 = Not known; 1 = Male; 2 = Female; 9 = Not applicable), a toy example of how translated versions require differentiation (even for the same coding vocabularies):

  • While it is hard to describe this issue in the English language itself, literal translations from other languages which differentiate use for /gender identity/ could be "1 - Masculine", "2 - Feminine". A problem with "Male" and "Female" is that there exist languages where the translation is closer to "animal male sex" / "animal female sex". This is something expert translators could prepare, to avoid translations by implementers.
  • However, most /biological sex/ codings, which already are vague/ambiguous for the non-male and non-female values, are often hard to translate for use as /gender identity/, because even professional human translators cannot understand concepts that are already redundant for /biological sex/ (more on https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7688387/).
  • In any case, it is still better to have translations/proofreading/revision at country or European Union level, as it is less wrong than having no review or letting it happen in each implementation. But less wrong does not mean it's right. At least not forcing terminology translators to create vague terms is an important step. Not to mention that creating definitions for the explicitly different concepts becomes actually feasible.
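The toy example above can be sketched in Python. This is only a minimal illustration, not part of CPV or ISO/IEC 5218: the names `Concept`, `CodedValue`, `LABELS` and `label` are hypothetical, the /biological sex/ labels are the ISO/IEC 5218 ones quoted above, and the /gender identity/ labels are the literal "Masculine" / "Feminine" translations mentioned in the first bullet. The point is that a code is never stored without the concept it belongs to, so the same number under two concepts is not the same value.

```python
from dataclasses import dataclass
from enum import Enum


class Concept(Enum):
    # Working terms used in this issue; not final CPV head terms.
    BIOLOGICAL_SEX = "biological sex"
    GENDER_IDENTITY = "gender identity"


# Same 4-option code list, but labels differ per concept (hypothetical table).
LABELS = {
    Concept.BIOLOGICAL_SEX: {
        0: "Not known", 1: "Male", 2: "Female", 9: "Not applicable",
    },
    Concept.GENDER_IDENTITY: {
        0: "Not known", 1: "Masculine", 2: "Feminine", 9: "Not applicable",
    },
}


@dataclass(frozen=True)
class CodedValue:
    """A code that always carries the concept it was recorded for."""
    concept: Concept
    code: int


def label(value: CodedValue) -> str:
    """Return the English label for a code within its own concept."""
    return LABELS[value.concept][value.code]


sex = CodedValue(Concept.BIOLOGICAL_SEX, 1)
identity = CodedValue(Concept.GENDER_IDENTITY, 1)
print(label(sex), "/", label(identity))
print(sex == identity)  # same code, different concept: not interchangeable
```

Even in this worst case of a reused table, keeping the concept next to the code means a translator only has to produce one label per concept and code, instead of one vague label covering both.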

Note to CPV editors on conflict of interests: for the discussion of this point I ask you to preventively request that commenters disclose if they are (currently or in the past) part of external coding systems (like the ones cited as examples on CPV1.00gender) or "good practices guides" which would be affected by this change of not allowing indiscriminate reuse of tables.

Proposal 4: allow feedback from the professional translators (the "terminology translators") including on the initial reference head terms

Proposal 4.1: The first part of this proposal is to not only draft the English terms for /gender identity/ and /biological sex/, but also to have at least the most common working languages in the European Union already done (Wikipedia says the "procedural languages" are English, French and German). This is mostly to aid the terminology variants (they do not need to be on UMLs and other images) and may be relevant to explain how the proposed terms in the procedural languages were defined, since translators may try to search from that starting point. Translators, for example, may be interested in the etymology of terms and references to external sources that talk about that subject.

Proposal 4.2: The second part of this proposal is to allow some way for the same people responsible for creating the terminology variants in other languages to provide feedback even on the initial proposed terms. At this later point, unless major issues are encountered, the CPV editors together with feedback from terminology translators can define (and even change) the initial proposed head terms in the procedural languages without the full current bureaucracy. The optimal scenario would be for this to be part of the responsibility of the personnel who create the variants and who actually work either as terminologists or (easier to find) as European translators. Since I'm not aware of research on issues for terminology, the 2017 Interaction of law and language in the EU: Challenges of translating in multilingual environment (17 pages) could give a hint of daily issues with terms even in documents used as enforceable laws.

It's important to explain to terminology translators that the terms, while not mere literal translations, should try to be as roundtrip-translation resilient as possible, in such a way that they become easier to translate when appearing in official documents. If, by request of translators, it is still necessary to have a "vague term" that covers both meanings, while not necessarily part of CPV, we may need to create such a term and let it be used on Europe IATE. For this vague term, this proposal uses /gender identity and biological sex/ and /gender identity or biological sex/. One reason to have the non-official vague term is to test whether translation roundtrips fall back too soon to the vague term, and whether situations exist where legal documents actually refer to both terms (but, again, the best practice for legal documents is to use the same term for the same concept).
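The roundtrip test mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the translation tables are placeholders (`term-A`, `term-B` stand in for terms in some target language), not real translations from IATE or any glossary.

```python
# Vague terms as named in this proposal (non-official).
VAGUE_TERMS = {
    "gender identity and biological sex",
    "gender identity or biological sex",
}


def roundtrip(term: str, forward: dict, backward: dict) -> str:
    """Translate a term to the target language and back again."""
    return backward[forward[term]]


def is_resilient(term: str, forward: dict, backward: dict) -> bool:
    """True when the roundtrip returns exactly the starting term."""
    return roundtrip(term, forward, backward) == term


def falls_back_to_vague(term: str, forward: dict, backward: dict) -> bool:
    """True when the roundtrip lands on one of the vague terms."""
    return roundtrip(term, forward, backward) in VAGUE_TERMS


# Placeholder tables: 'term-A' and 'term-B' are NOT real translations.
forward = {"gender identity": "term-A", "biological sex": "term-B"}
backward = {
    "term-A": "gender identity",
    "term-B": "gender identity or biological sex",  # simulated early fallback
}

for term in forward:
    print(term, is_resilient(term, forward, backward),
          falls_back_to_vague(term, forward, backward))
```

In practice the tables would come from the terminology translators' drafts for each language pair; a head term that fails the check in several pairs is a candidate for revision before release.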

The submitter is not asking to do this for all new terms of CPV2.00, but only for the /gender identity/ and /biological sex/ head terms. The special reason is that CPV1.00gender was harder to improve without external help.

One impact of this Proposal 4 is that there is a chance that CPV2.00 needs to be updated, either in the reference head term in English or with minor improvements to the definition in English. The submitter suggests to CPV editors that, as long as CPV1.00gender is divided and the definitions are not explicitly against the WHO opinion on this subject (since this reduces the likelihood of breaking changes), CPV2.00 could be released without further delay, with collaborators made aware that these two concepts can receive minor, linguistic corrections.

Proposal 5: contact WHO or WHO Regional Europe for opinion

TL;DR: this proposes to send one or more invite emails to be considered done. Details are just extra context.

While the submitter is not fully sure how the terminology departments of WHO are organized, this proposal is that, after the definitions of both new terms have been drafted, CPV editors send one or more emails inviting feedback from one or more contact points at the World Health Organization. If not WHO, some health related organization at European level could also be an extra contact point.

Note that this proposal does not mean that they need to wait for an official reply. It is just to invite. It is more likely that "terminology translators" (proposal 4), based on public glossaries from these organizations, will already do the main work.

Proposal 6: contact one or more existing /gender identity/ related European union commission or organization

TL;DR: this proposes to send one or more invite emails to be considered done. Details are just extra context.

Similar to proposal 5, the suggestion here is to ask for feedback about the definition and usage notes, in particular of the /gender identity/ result. However, the head terms may still be better drafted by the "terminology translators" of proposal 4 (who would be focused on how they are used in CPV) than by mere reuse of existing glossaries, so in case of disagreement, CPV editors could ask them to discuss directly with the "terminology translators".

Being realistic, suggestions to CPV were open all this time and the general idea is to implement definitions from WHO (which are already often well accepted). So as long as the proto terms and definitions do not face strong opposition over serious issues, I still believe CPV editors could be free not to delay the release of CPV2.00, by already including one reference for both new concepts.

For example, if there is interest from these groups in adding more features or starting work on improved coding vocabularies designed for /gender identity/, this could easily take several months, if not more. Since the final result would not be predictable, a different group could focus on such a task, and only when the vocabularies become better defined would they come back to propose them to CPV as some extension.

Potential suggestion: if one of these groups were to actually respond to feedback, one of the main interests would be to review some prototype coding vocabularies with at least better labels than existing ones. Realistically speaking, this is unlikely to happen without going after one or more groups. If this happens, then people like me could move the discussion on to their initiatives.

Note that there are some coding vocabularies but their license is not open and do not seem to have any strategy planning on translating to other languages. So having any European commission or working group who could validate ideas, is likely to interest people who collaborated on other initiatives, but often the gender coding vocabularies do not have focused attention at all.

Meta-proposal 7: make usage of CPV1.00gender with release of CPV2.00 (or minor update, with more feedback from collaborators) as nomina periculosa

TL;DR: this proposal is mostly informative. This GitHub issue can be closed without waiting for it.

How Core Vocabularies are managed internally is something relatively new (for example, the drafted "Principles for creating good definitions" from 2021-10-05). SEMICeu Core Vocabularies are closer to scientific nomenclature than to the average data standard. But at the same time, most scientific nomenclatures do not have an active multilingual approach such as the European Core Vocabularies. So this proposal is mostly informative.

A relevant fact is that scientific nomenclature often works differently from software standards, where there is deprecation: nomenclature decisions are even retroactive by default. In botany the starting point will often be 1753 (the year Carl Linnaeus first published Species Plantarum). For Core Vocabularies, what the equivalent starting point would be is open to discussion, but it could be at minimum the CPV1.00 release date.

The submitter's general idea is that, at least for CPV1.00gender, the vocabulary be treated in a similar way as if it were a scientific nomenclature, applying the principle of nomina periculosa. By analogy, this would mean:

Handling Core Vocabularies as if they were scientific nomenclature can have advantages when new evidence is found or (which can happen with Core Vocabularies, given how hard it is to find ideal definitions) when something is released and fixed later. Nomina periculosa is the worst case; the more common case is to have periodic decisions where a group of people can give a retroactive explanation of a less clear concept or recommend better terminology for a previously released concept.

This approach also works with the idea that even if Core Vocabulary contributors, despite open process, don't have participation of more specific experts on that subject (or people with linguistic knowledge to write it down) it still can be fixed later with retroactive impact.

There are some parts not mentioned here, like the tolerance for synonyms (or ideal terms) and even special situations where less ideal terms (which would violate newer rules) can be preserved. But in general, the focus is more on protecting human lives and avoiding economic harm; this is not the same thing as software deprecation. It embraces the fact that releases done by humans can have mistakes.

This proposal is mostly meta. Even if CPV implements the split of the concept in two, the changelogs on why this was done (or additional context) could be further explained in minor releases.

Meta-proposal 8: eventually some special explanation on why mixing gender/sex is nomina periculosa

TL;DR: this proposal is mostly informative. This GitHub issue can be closed without waiting for it.

This proposal is similar to meta-proposal 7. But instead of focusing on CPV1.00gender or the new CPV2.00 implementers, it is generic for standards and vocabularies. By generic I also mean not just the European context, since similar issues happen on other continents. In English this is less relevant in England (and in European countries which may have actually reused CPV1.00gender in stricter ways), but it definitely is applicable in the USA.

In the submitter's opinion, the CPV is by far the best reference in its area. We know definitions may not be perfect, but it is really hard to find such a level of care elsewhere, for example on how to divide persons' names. And I say this from a context where software for humanitarian use often uses terms such as "first name" / "last name", which are problematic to a point that I'm really sure people are missing because of poor form inputs. My point here is that CPV1.00gender was an exception to the overall quality of CPV 1.00, even if this was not as clear at the release of CPV 1.00.

For example, it's not clear to the submitter how far CPV1.00 was inspired by implementations of what "gender" means in the USA (both the country and the ontologies and data schemas trying to use English as used in the USA) because the working language of the group is English. But unlike the terminology for breaking down persons' names, this subject may not be one where it is easy to find experts willing to engage on terminology, so despite CPV being open to collaboration, people are likely to be afraid of giving their opinion. Or they do, but only as a glossary for their own organization (not in initiatives like CPV).

So, this proposal is, after proposal 7 and if there are people interested in explaining, to have some reference which could be cited as a generic explanation for works which deal with both concepts. This could greatly accelerate the resolution of common issues worldwide on data exchange related to this.

Proposal annex

Note: if relevant, in addition to explaining, I can try to create the definitions based on WHO glossaries.

Proposal annex-1: /gender identity/

This proposal suggests /gender identity/@eng-Latn as a dedicated term, where "identity" can be a different word. Yet in eng-Latn it makes sense to have an explicitly composed term.

To avoid confusion both with the previous CPV 1.00 (and also with the vagueness of //gender// in the EU and US, as opposed to the UK), the submitter strongly recommends forbidding abbreviation to //gender// in any official document. This approach also takes into consideration the needs of translators and of potential automated proofreading and computer assistive technology, which means that, if current CPV editors are able to consult experts in this area, CPV2.00 could intentionally be designed to be fail safe in every language combination.

The final description of this concept could have a quick review of whether it is aligned (or at least not opposed, in case of need to add more sentences and most references are too verbose) with WHO statements on the subject. Note that major groups related to gender identity tend to have definitions similar to WHO's, so the final wording doesn't need to be a direct quote; the main issue with the current version is that it mixes two concepts in one. Said in other terms: even if the final definition does not say "this is based on a glossary definition of this term from name-of-organization", at least if, years later, someone asks for the source, we already have this information as additional information outside the CPV final document itself.

Proposal annex-2: /biological sex/

This proposal also suggests /biological sex/@eng-Latn as a dedicated term, where "biological" can be a different word (like [Administrative sex], one of the examples from the Canada paper, which may actually be more similar to what Core Vocabularies could use), but in eng-Latn it makes sense to have an explicitly composed term.

Similar logic to that of proposal 1 applies here (quick review that the definition is not in conflict with WHO, try to decide on terms optimized for proofreading and translators, suggest no abbreviation to //sex//@eng-Latn). But there are some additional comments:

The /biological sex/@eng-Latn definition could be focused on "current /biological sex/" (which allows dividing CPV1.00gender into /gender identity/ and /biological sex/). Some comments:

The submitter is aware that there can be more variations of /biological sex/, but note that they are likely to have more reusable values between themselves than /biological sex/ vs /gender identity/. So for CPV2.00, breaking the concept in two is already a huge step.

Proposal annex-3: intentional NO direct replacement for CPV1.00gender

The submitter doubts any real need to have a single term for concepts which are likely to cause confusion. For example:

Two reasons for not having a single concept for both concepts:

But what if it is really necessary to have a generic data field concept? For example, for old data where it is not clear whether it was /gender identity/ or /biological sex/? For worst case scenarios, references like HL7_GENDER_R1_I1_2021JAN use "Recorded Sex or Gender", and even then strongly discourage continuing to update such fields for new data. Even if CPV2.00 implements such a feature, not only is it better that data not be "automatically" upgraded (for example: implementers may look at their current data and detect that it can be mapped to the new fields), but such usage should also be discouraged.

Note that the main point here is that no direct replacement has a valid reason.
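A minimal sketch of what "no automatic upgrade" could look like for an implementer follows. All field and function names are hypothetical (the catch-all field name simply mirrors the "Recorded Sex or Gender" wording cited above), and nothing here is defined by CPV or HL7. The legacy value only moves to one of the new fields when a person who inspected the dataset explicitly declares what it meant; otherwise it stays in a clearly marked legacy field.

```python
from typing import Optional

LEGACY_FIELD = "gender"                  # CPV1.00-style field (assumed name)
NEW_FIELDS = {"gender_identity", "biological_sex"}   # hypothetical names
CATCH_ALL = "recorded_sex_or_gender"     # legacy-only bucket, not for new data


def migrate_record(record: dict, declared_meaning: Optional[str] = None) -> dict:
    """Copy a record, moving the legacy value only as explicitly declared.

    declared_meaning must be set by someone who reviewed the dataset;
    there is deliberately no attempt to guess it from the values.
    """
    out = {k: v for k, v in record.items() if k != LEGACY_FIELD}
    if LEGACY_FIELD not in record:
        return out
    if declared_meaning is None:
        # Unknown meaning: keep the value, but flag it as legacy.
        out[CATCH_ALL] = record[LEGACY_FIELD]
    elif declared_meaning in NEW_FIELDS:
        out[declared_meaning] = record[LEGACY_FIELD]
    else:
        raise ValueError(f"unknown target field: {declared_meaning!r}")
    return out


old = {"id": 7, "gender": "2"}
print(migrate_record(old))                     # meaning not reviewed
print(migrate_record(old, "biological_sex"))   # meaning reviewed and declared
```

The design choice is that the default path never produces a /gender identity/ or /biological sex/ value, so an unreviewed dataset cannot silently gain a meaning it never had.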

EmidioStani commented 1 year ago

"human sex" is already used in the Publications Office: https://op.europa.eu/en/web/eu-vocabularies/concept-scheme/-/resource?uri=http://publications.europa.eu/resource/authority/human-sex

In addition "Sex" is used in the Public Documents Regulation: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R1191&from=EN#d1e41-20-1 which can be considered a subset of the "human sex".

Concerning "gender" and "sex" the following vocabularies adopt different approaches:
1) FOAF: http://xmlns.com/foaf/0.1/#term_gender (there is only gender which doesn't distinguish from biological, social or sexual concepts)
2) schema.org: https://schema.org/Person (using only gender but not sex)
3) wikidata: Gender (characteristics distinguishing between femininity and masculinity) and Sex (trait that determines an individual's sexually reproductive function - biological sex)
4) HL7 FHIR: https://hl7.org/fhir/patient.html (using only gender but not sex)
5) NIEM: https://github.com/NIEM/NIEM-Releases/blob/niem-5.2beta1/xsd/niem-core.xsd#L3758 (using SexAbstract and SexualOrientation but not gender)

So the proposal is to add "human sex" in alignment with the Publications Office.

EmidioStani commented 1 year ago

This issue is solved in release 2.1.0