Closed fititnt closed 1 year ago
"human sex" is already used by the Publications Office: https://op.europa.eu/en/web/eu-vocabularies/concept-scheme/-/resource?uri=http://publications.europa.eu/resource/authority/human-sex
In addition "Sex" is used in the Public Documents Regulation: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R1191&from=EN#d1e41-20-1 which can be considered a subset of the "human sex".
Concerning "gender" and "sex", the following vocabularies adopt different approaches:
1) FOAF: http://xmlns.com/foaf/0.1/#term_gender (there is only gender, which does not distinguish between biological, social or sexual concepts)
2) schema.org: https://schema.org/Person (using only gender but not sex)
3) Wikidata: Gender (characteristics distinguishing between femininity and masculinity) and Sex (trait that determines an individual's sexually reproductive function - biological sex)
4) HL7 FHIR: https://hl7.org/fhir/patient.html (using only gender but not sex)
5) NIEM: https://github.com/NIEM/NIEM-Releases/blob/niem-5.2beta1/xsd/niem-core.xsd#L3758 (using SexAbstract and SexualOrientation but not gender)
So the proposal is to add "human sex" in alignment with Publications Office
This issue is solved in release 2.1.0
As per the Wiki documentation, this proposal is divided into I - submitter name and affiliation; II - portal, service or software product represented or affected; III - clear and concise description of the problem or requirement; IV - proposed solution, if any.
I - submitter name and affiliation
Emerson Rocha, from @EticaAI.
Some context: as part of @HXL-CPLP, the submitter is working both on community translation initiatives for humanitarian use and (because of the lack of usable data standards) on specialized public domain tooling to manage and exchange multilingual terminology, so that different lexicographers can compile results without centralization.
II - portal, service or software product represented or affected
The list of portals, services and products would be too extensive to mention. Instead, the submitter will mention groups of users (in no particular order of priority):
- CPV1.00gender, despite being a controlled vocabulary, is by design untranslatable (it varies with context), so official EU documents based on CPV1.00gender can be less reliable.
- The CPV2.00gender (preview) definition is circular, but the head term of CPV1.00gender, "gender" (vs "gender or sex"@eng-Latn-GB), is an objectively wrong translation in 22 out of 24 languages (likely exceptions: German and Finnish); 2 / 24 * 100 = 8.33% accuracy on head terms.
- Translation issue: literal translation of the concept "/gender identity/ or /biological sex/" as the single term //gender//.
- /biological sex/ and /gender identity/ are different concepts, and the distinction is very, very important (for health, societal, statistical and other purposes); even groups representing non-binary people often cite WHO as a reference.
- The CPV1.00gender / CPV2.00gender (preview) concept is ambiguous. Even for the limited number of translations where the grammar and vocabulary of the head term are right, the local EU governments need to be consulted on whether they agree that such concepts can co-exist as a single term in a stricter vocabulary.
- /biological sex/ and /gender identity/, in which CPV1.00gender not only have de facto xenonyms, but put their lives in danger AND general population which "equal" /biological sex/ and /gender identity/, but removal of non-vital human organs (breast, prostate, gallbladder, appendix, kidney, ovary and others) also is represented for of same approach of the first example group
- /biological sex/ and /gender identity/ be different concepts to avoid conflict with implementations that are against regional binding laws.
- /organ inventory/ and /third person preferred pronouns/, while cited here to explain limitations, are explicitly not part of this request.
- /gender identity/ fields (proposal 3)

III - clear and concise description of the problem or requirement
Comments on nomenclature and symbols used in this request:

- The /, //, [ and [[ notation: /biological sex/ and /gender identity/ mean that an extra term was added to make sure round-trip translations would be resilient, but I did not use [biological sex] or [gender identity] because I'm not sure this would be the best failsafe combination. The syntax is inspired by a non-standard use of International Phonetic Alphabet delimiters. A quick explanation would be:
  - /vague term/
  - //extra vague term//: //gender//, as in CPV 1.00 and very likely European Union official documents (in the UK it wouldn't be //)
  - [precise term]: [Biologisches Geschlecht], [Geschlechtsidentität]
  - [[extra precise term]]: [[praenōmen]], https://en.wiktionary.org/wiki/praenomen#Latin

That's it. That was the summarized context. The difference between
/gender identity/ and /biological sex/ MUST be stated explicitly because, in case of doubt, medical professionals will prioritize saving lives. Across dozens of references on this topic, there is actually no conflict between healthcare professionals and pro-/gender identity/ because implementers are instructed to use wrong code tables.

Not just //gender//@CPV1.00, but "gender" in the European Union is a lost cause. The term "gender" is so misused that even keeping the exact head term for a new dedicated concept is irresponsible.

IV - proposed solution, if any
Proposal 1: DO NOT release CPV2.00 without dividing CPV1.00gender to adhere to WHO et al.

The first proposal is to release CPV2.00 only with the nomina periculosa CPV1.00gender divided into two very explicitly different new concepts, in such a way that CPV2.00 does not induce non-conformance with the World Health Organization opinion on gender and, by extension, with local laws in countries that use CPV and endorse WHO.

The Proposal annex has more explicit suggestions on a strategy for how this division can be done. If necessary, the submitter can try to draft definitions based on WHO glossaries, but even these would need proofreading, preferably with help from European Union translators. In the meantime, /gender identity/ and /biological sex/ are used here as a more exact way to divide //gender//@CPV1.00, but not as final terms.

Proposal 2: /biological sex/ dedicated concept is really necessary

The submitter insists that, no matter how Proposal 1 is done (the Proposal annex is an opinionated suggestion, not the main proposal), the CPV1.00gender needs to have /biological sex/; merely reworking CPV1.00gender as a better-defined /gender identity/ does not seem to solve all current issues.

The mere addition of /gender identity/ carries a realistic risk that implementers keep using CPV1.00gender.

Additional counter arguments for "a core vocabulary do not need to have concept of /biological sex/"

Most (if not all) coding vocabularies for data exchange used in Europe for /gender identity/ and /biological sex/ are actually /biological sex/, and this is not the fault of SEMICeu. Despite CPV1.00 trying to bring up //gender//, almost no real improvement of serious coding systems exists as of 2021, beyond some ontologies not ready for average data exchange. The mere absence of /biological sex/ will not explain what these /biological sex/ are and will further allow implementers to use them as if they were /gender identity/.

Also note that, with official translations into 24 languages and without an explicit /biological sex/, by the next review of CPV circa 2032 the head term of /gender identity/ from 2022 can become so misused (a new nomen ambiguum outside CPV) that a new term would need to be created again. This already happened with "gender" in the European Union (including official documents) in the last decade.

Additional counter arguments for "privacy" to not add /biological sex/
The submitter, who has experience with (and knows part of) the total chaos in which data exchange for human rights lacks field standards and most front-line human rights defenders only know their local language, reinforces that sensitive fields in multiple languages are important. Moreover, in practice, even for sensitive data the most portable exchange format is still Excel (not even CSV), so forget any hope of advanced systems. Even in regions which do have such systems, human rights defenders who work with (just as one example) victim protection (from police, mafia or political persecution) will intentionally never use such centralized systems.
So, assuming good intentions from those who ask about privacy (with the naive thinking that it "is bad"), a counter argument is that both data and software (even if it is spreadsheet macros) are necessary to allow interoperability of human rights technology. In the European context, human rights defenders (HRDs) quite often become targets when they consistently deviate from the official government opinion (less through physical harm, as is common in Latin America, and more through being fired from jobs or having the funding that keeps the organization running not renewed). And I'm not talking about low HDI countries, but Nordic countries. People like Peter Benenson were right all this time.
The Core Vocabularies in general provide a very, very good baseline (which can be used to improve variants based on something like the Human Trafficking Case Data Standard (HTCDS) https://github.com/UNMigration/HTCDS), so /biological sex/ (along with /gender identity/) is greatly welcomed, since it helps to improve the tracking of human rights abuses related to this topic. Without it, the vagueness would make it impossible to identify, for example, attacks against trans persons. Another field for statistical use would be /ethnic group/ (and I say this and not //race// because Europe has serious issues with its own minority groups, such as the Sámi people and the Romani people, who, since they are white, are often not even in the statistics despite being treated as subhuman: not only ignored when reporting to the police, but even denied housing).

Proposal 3: make new usage notes explicitly forbid reuse from the respective different term
Tautologically speaking, the mere act of reusing a coding vocabulary from a different concept is already irrational.
On this subject, the most well-respected sources explicitly state that the two should not be used interchangeably. The vaguer ones, which allow reuse of terms (or don't warn about known misuse), don't show where they got such a fact from. This applies both to end users and to standards that use CPV as a reference for their systems:
- /biological sex/ usage notes: coding systems designed for /gender identity/ are incompatible with this field.
- /gender identity/ usage notes: coding systems designed for /biological sex/ are incompatible with this field.

Note to CPV editors on conflict of interests: for the discussion of this point, I ask that commenters be preemptively requested to disclose whether they are (currently or in the past) part of external coding systems (like the ones cited as examples in CPV1.00gender) or "good practice guides" that would be affected by this change of not allowing indiscriminate reuse of tables.
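As a rough illustration of how an implementer might enforce the two usage notes of Proposal 3, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Concept` enum, the `CodingSystem` class, the `validate` function and the example code table are hypothetical and are not part of CPV or of any existing coding system.

```python
from dataclasses import dataclass
from enum import Enum


class Concept(Enum):
    """The two concepts that Proposal 1 asks to keep apart."""
    BIOLOGICAL_SEX = "biological sex"
    GENDER_IDENTITY = "gender identity"


@dataclass(frozen=True)
class CodingSystem:
    """A code table tagged with the single concept it was designed for."""
    name: str
    designed_for: Concept
    codes: frozenset


def validate(field: Concept, system: CodingSystem, value: str) -> None:
    """Raise ValueError if the table targets another concept or lacks the code."""
    if system.designed_for is not field:
        raise ValueError(
            f"{system.name} was designed for /{system.designed_for.value}/ "
            f"and is incompatible with /{field.value}/"
        )
    if value not in system.codes:
        raise ValueError(f"{value!r} is not a code of {system.name}")


# Hypothetical code table with placeholder codes, invented for this sketch.
EXAMPLE_SEX_TABLE = CodingSystem(
    name="example-sex-table",
    designed_for=Concept.BIOLOGICAL_SEX,
    codes=frozenset({"code-1", "code-2"}),
)

# Accepted: the table is used on the field it was designed for.
validate(Concept.BIOLOGICAL_SEX, EXAMPLE_SEX_TABLE, "code-1")
```

Calling `validate(Concept.GENDER_IDENTITY, EXAMPLE_SEX_TABLE, "code-1")` raises `ValueError`, which is the behaviour the usage notes ask for: a table is rejected because of the concept it was designed for, even when the code itself exists in the table.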
Proposal 4: allow feedback from the professional translators (the "terminology translators") including on the initial reference head terms
Proposal 4.1: The first part of this proposal is not only to draft the English terms for /gender identity/ and /biological sex/, but also to have at least the most common working languages of the European Union already done (Wikipedia says the "procedural languages" are English, French and German). This is mostly to aid the terminology variants (they do not need to be on UMLs and other images) and may be relevant to explain how the proposed terms in the procedural languages were defined, since translators may try to search from that starting point. Translators, for example, may be interested in the etymology of terms and in references to external sources that discuss the subject.

Proposal 4.2: The second part of this proposal is to allow some way for the same people responsible for creating the terminology variants in other languages to provide feedback even on the initially proposed terms. At this later point, unless major issues are encountered, the CPV editors, together with feedback from terminology translators, can define (and even change the initially proposed head terms in the procedural languages) without the full current bureaucracy. The optimal scenario would be that the responsibility for creating the variants falls to personnel who actually work either as terminologists or (easier to find) as European translators. Since I'm not aware of research on issues specific to terminology, the 2017 "Interaction of law and language in the EU: Challenges of translating in multilingual environment" (17 pages) could give a hint of the daily issues with terms even in documents used as enforceable laws.
It's important to explain to terminology translators that the terms, while not mere literal translations, should be as resilient to round-trip translation as possible, so that they become easier to translate when they appear in official documents. If, at the request of translators, it is still necessary to have a "vague term" that covers both meanings, then, while not necessarily part of CPV, we may need to create such a term and let it be used in Europe's IATE. For this vague term, this proposal uses /gender identity and biological sex/ and /gender identity or biological sex/. One reason to have the non-official vague term is to test whether translation round trips fall back too soon to the vague term; situations also do exist where legal documents actually refer to both terms (but, again, the best practice for legal documents is to use the same term for the same concept).

The submitter is not asking to do this for all new terms of CPV2.00, only for the /gender identity/ and /biological sex/ head terms. The special reason is that CPV1.00gender was harder to improve without external help.

One impact of this Proposal 4 is that CPV2.00 may later need to be updated, either in the reference head term in English or with minor improvements to the definition in English. The submitter suggests to the CPV editors that, as long as CPV1.00gender is divided and the definitions are not explicitly against the WHO opinion on this subject (since this reduces the likelihood of breaking changes), CPV2.00 could be released without further delay, with collaborators made aware that these two concepts can still receive minor, linguistic corrections.

Proposal 5: contact WHO or WHO Regional Europe opinion
While the submitter is not fully sure how the terminology departments of WHO work, this proposal is that, once definitions of both new terms have been drafted, the CPV editors send one or more emails inviting feedback from one or more contact points at the World Health Organization. If not WHO, some health-related organization at the European level could also be an extra contact point.
Note that this proposal does not mean that they need to wait for an official reply. It is just an invitation. It is more likely that the "terminology translators" (proposal 4), working from public glossaries of these organizations, will already do the main work.
Proposal 6: contact one or more existing /gender identity/ related European union commission or organization

Similar to proposal 5, the suggestion here is to ask for feedback on the definition and usage notes, in particular of the /gender identity/ result. However, the head terms may still be better drafted by the "terminology translators" of proposal 4 (who would be focused on how the term is used in CPV) than by mere reuse of existing glossaries, so in case of disagreement the CPV editors could ask these groups to discuss directly with the "terminology translators".

Being realistic, suggestions to CPV were open all this time, and the general idea is to implement definitions from WHO (which are already often well accepted). So as long as the proto terms and definitions do not face strong opposition over serious issues, I still believe the CPV editors should be free not to delay the release of CPV2.00 and to ship it with one reference for each of the two new concepts.
For example, if there is interest from these groups in adding more features or starting work on improved coding vocabularies designed for /gender identity/, this could easily take several months, if not more. Since the final result is not predictable, a different group could focus on such a task, and only when the results become better defined would they come back to propose them to CPV as an extension.

Potential suggestion: if one of these groups were to actually respond to feedback, one of the main interests would be to review some prototype coding vocabularies with at least better labels than the existing ones. Realistically speaking, this is unlikely to happen without going after one or more groups. If this happens, then people like me could move the discussion over to their initiatives.
Note that some coding vocabularies exist, but their licenses are not open and they do not seem to have any plan for translation into other languages. So having a European commission or working group that could validate ideas is likely to interest people who have collaborated on other initiatives, but often the gender coding vocabularies do not get focused attention at all.
Meta-proposal 7: make usage of CPV1.00gender with release of CPV2.00 (or minor update, with more feedback from collaborators) as nomina periculosa

How Core Vocabularies are managed internally is something relatively new (for example, the drafted "Principles for creating good definitions" from 2021-10-05). SEMICeu Core Vocabularies are closer to scientific nomenclature than to the average data standard. But at the same time, most scientific nomenclatures do not have an active multilingual approach like the European Core Vocabularies. So this proposal is mostly informative.
A relevant fact is that scientific nomenclature often works differently from software standards where there is deprecation: they even are retroactive by default. In botany the starting point will often be in 1753 (the year Carl Linnaeus first published Species Plantarum). For Core vocabularies, what would be the equivalent starting point is open to discussion, but could be at minimum CPV1.00 release date.
The general idea of the submitter is that, at least for CPV1.00gender, the result would be treated as if it were scientific nomenclature, applying the principle of nomina periculosa. By analogy, this would mean:

- CPV1.00gender is not only deprecated, but considered a nomina periculosa (as justified earlier): its usage may lead to accidents endangering life or health;
- CPV1.00gender that depends fully on how it was defined should be considered potentially as harmful as CPV1.00gender itself.
- CPV1.00gender but the context of usage or a stricter explanation makes the combined result still fully aligned with CPV2.00 /gender identity/ and CPV2.00 /biological sex/ are perfectly fine even if CPV1.00gender was not.
- CPV1.00gender still compliant with the new version (perfect case).

Handling Core Vocabularies as if they were scientific nomenclature can have advantages when new evidence is found or (which can happen with Core Vocabularies, given how hard it is to find ideal definitions) when something is released and fixed later. Nomina periculosa is the worst case; the more common case is periodic decisions in which a group of people gives a retroactive explanation of a less clear concept or recommends better terminology for a previously released concept.
This approach also works with the idea that even if Core Vocabulary contributors, despite open process, don't have participation of more specific experts on that subject (or people with linguistic knowledge to write it down) it still can be fixed later with retroactive impact.
There are some parts not mentioned here, like the tolerance for synonyms (or ideal terms) and even special situations where less ideal terms (which would violate newer rules) can be preserved. But in general, the focus is more on protecting human lives and avoiding economic harm; this is not the same thing as software deprecation. It embraces the fact that releases done by humans can have mistakes.
This proposal is mostly meta. Even if CPV implements the split of the concept in two, the changelog on why this was done (or additional context) could be further explained in minor releases.
Meta-proposal 8: eventually some special explanation on why mix gender/sex is nomina periculosa
This proposal is similar to meta-proposal 7. But instead of focusing on CPV1.00gender or the new CPV2.00 implementers, it would be generic for standards and vocabularies. By generic I also mean not just the European context, since similar issues happen on other continents. In English this is less relevant in England (and in European countries which may actually have reused CPV1.00gender in stricter ways), but it is definitely applicable in the USA.

In the submitter's opinion, the CPV is by far the best reference in its area. We know definitions may not be perfect, but it is really hard to find such a level of care, for example, in how to divide person names. And I say this from a context where software for humanitarian use often relies on terms such as "first name" / "last name", which are problematic to the point that I'm really sure people are missing because of poor form inputs. My point here is that CPV1.00gender was an exception to the overall quality of CPV 1.00, even if this was not as clear at the release of CPV 1.00.

For example, it's not clear to the submitter how far CPV1.00 was inspired by implementations of what "gender" means in the USA (both the country and the ontologies and data schemas that use English as used in the USA), because the working language of the group is English. But unlike the terminology for breaking up person names, this is a subject on which it may not be easy to find experts willing to engage, so despite CPV being open to collaboration, people are likely to be afraid of giving their opinion. Or they do, but only as a glossary for their own organization (not in initiatives like CPV).

So this proposal is, after proposal 7 and if there are people interested in explaining, to have some reference which could be cited as a generic explanation by works which deal with both concepts. This could greatly accelerate the resolution of common issues worldwide on data exchange related to this.
Proposal annex
Note: if relevant, in addition to explain, I can try to create the definitions based on WHO glossaries.
Proposal annex-1: /gender identity/

This proposal suggests /gender identity/@eng-Latn as a dedicated term, where "identity" can be a different word. Yet in eng-Latn it makes sense to have an explicit compound term.

To avoid confusion both with the previous CPV 1.00 and with the vagueness of //gender// in the EU and US (as opposed to the UK), the submitter strongly recommends forbidding abbreviation to //gender// in any official document. This approach also takes into consideration the needs of translators and of potential automated proofreading and computer-assistive technology, which means that, if the current CPV editors are able to consult experts in this area, CPV2.00 could intentionally be designed to be fail-safe for every language combination.

The final description of this concept could get a quick review of whether it is aligned (or at least not opposed, in case more sentences need to be added and most references are too verbose) with WHO statements on the subject. Note that major groups related to gender identity tend to have definitions similar to WHO's, so the final wording doesn't need to be a direct quote; the main issue with the current version is that it mixes two concepts in one. In other words: even if the final definition does not say "this is based on a glossary definition of this term from name-of-organization", if someone asks for the source years later, we will already have this information recorded outside the final CPV document itself.
Proposal annex-2: /biological sex/

This proposal also suggests /biological sex/@eng-Latn as a dedicated term, where "biological" can be a different word (like [Administrative sex], one of the examples from the Canada paper, which may actually be closer to what Core Vocabularies could use), but in eng-Latn it makes sense to have an explicit compound term.

The same logic as in Proposal annex-1 applies here (quick review that the definition is not in conflict with WHO, try to decide on terms optimized for proofreading and translators, suggest no abbreviation to //sex//@eng-Latn). But there are some additional comments:
The /biological sex/@eng-Latn definition could be focused on "current /biological sex/" (which allows dividing CPV1.00gender into /gender identity/ and /biological sex/). Some comments:

- /assigned sex at birth/@eng-Latn (with usage notes strictly for medical purposes) seems to be aligned not only with healthcare, but also with feedback from non-binary experts, since /biological sex/@eng-Latn can change. But the usage notes of /assigned sex at birth/@eng-Latn could mention that this field alone is insufficient to make automated inferences (like the need for exams at periodic intervals, or blocking requests for operations when the person does not have the affected organ); for that, an /organ inventory/ (which could be out of scope of CPV) would be needed to know the differences from what is common in the general population for a given /assigned sex at birth/, which is quite relevant in emergency response and/or for patients who don't remember (this avoids unnecessary surgeries).
- /biological sex/ need to have in its definition "chromosomes". If necessary this could be a different field, and potentially this need could be covered by /assigned sex at birth/@eng-Latn.

The submitter is aware that there can be more variations of /biological sex/, but note that they are likely to have more reusable values between themselves than /biological sex/ vs /gender identity/. So for CPV2.00, breaking the concept in two is already a huge step.

Proposal annex-3: intentional NO direct replacement for CPV1.00gender
The submitter doubts any real need to have a single term for concepts which are likely to cause confusion. For example:

- "/gender identity/"
- "/biological sex/"
- "/gender identity/" and "/biological sex/" (different concepts, but are not same term)
- "/gender identity/" or "/biological sex/" (different concepts, but are not same term)
- "/gender identity and biological sex/" (single concept)
- "/gender identity or biological sex/" (single concept)

Two reasons for not having a single concept for both concepts:
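But what if it is really necessary to have a generic data field concept? For example, for old data where it is not clear whether it was /gender identity/ or /biological sex/? For worst-case scenarios, references like HL7_GENDER_R1_I1_2021JAN use "Recorded Sex or Gender", and even then strongly discourage continuing to update such fields with new data. Even if CPV2.00 implements such a feature, it is better not only that data is not upgraded "automatically" (for example, implementers may look at their current data and detect that it can be mapped to the new fields) but also that such usage is discouraged.

Note that the main point here is that there is a valid reason for having no direct replacement.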
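To make the "no automatic upgrade" idea above more concrete, here is a minimal Python sketch of how an implementer might carry legacy data forward. The `PersonRecord` class, its field names and the `migrate` function are hypothetical and invented for this illustration; only the label "Recorded Sex or Gender" comes from the HL7 reference cited above, and nothing here is prescribed by CPV.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonRecord:
    # New, separate fields; both stay empty until a human has reviewed the data.
    biological_sex: Optional[str] = None
    gender_identity: Optional[str] = None
    # Legacy value kept verbatim and clearly labelled as ambiguous.
    recorded_sex_or_gender: Optional[str] = None


def migrate(legacy_gender: Optional[str],
            reviewed_as: Optional[str] = None) -> PersonRecord:
    """Carry a legacy value forward without guessing what it meant.

    reviewed_as is set only after an implementer has inspected the source
    data and decided which concept the legacy column really held.
    """
    if reviewed_as is None:
        # Default path: no automatic upgrade into either new field.
        return PersonRecord(recorded_sex_or_gender=legacy_gender)
    if reviewed_as == "biological_sex":
        return PersonRecord(biological_sex=legacy_gender)
    if reviewed_as == "gender_identity":
        return PersonRecord(gender_identity=legacy_gender)
    raise ValueError(f"unknown target field: {reviewed_as!r}")
```

With this shape, `migrate("X")` leaves both new fields empty and keeps the value only in `recorded_sex_or_gender`, while `migrate("X", reviewed_as="biological_sex")` fills the new field only because someone made an explicit decision. The design choice is that the ambiguous field is a holding area for old data, not something new records should write to.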
? For worst case scenarios, references like HL7_GENDER_R1_I1_2021JAN use "Recorded Sex or Gender" and even then strongly discourage keep update such fields for new data. Even if CPV2.00 implements such a feature, not only it's better not "be automatically" upgraded (for example: implementers may look at their current data and detect that it can be mapped to new fields) but also discourage such usage.Note that the main point here is that no direct replacement has a valid reason.