Open jjkoehorst opened 4 weeks ago
I agree it's not optimal.
The side issue I brought up on slack was that gender != biological/physical sex (this is something that upset a lot of people in the ancient DNA community).
I wonder if this has partly led to the slightly odd distinction between host and animal... The host sex term says it's an enumeration, but there is no enumeration defined: https://genomicsstandardsconsortium.github.io/mixs/0000811/, so it's hard for me to see, but I wonder if self-defined gender options are allowed there, whereas the animal sex has a strict enumeration. However, it looks rather mammalian-focused?
It's a good question, but I imagine it may be tricky to come up with a common consensus - particularly to try and unify them all.
If you unfold the LinkML blob at the end, it might give some clarification.
But then there should be a distinction between biological sex and gender, and solve that particular issue that way?
name: host_sex
annotations:
  Expected_value:
    tag: Expected_value
    value: enumeration
description: Gender or physical sex of the host
title: host sex
comments:
- example of non-binary from Excel sheets does not match any of the enumerated values
from_schema: https://w3id.org/mixs
keywords:
- host
- host.
string_serialization: '[female|hermaphrodite|non-binary|male|transgender|transgender
  (female to male)|transgender (male to female)
  |undeclared]'
slot_uri: MIXS:0000811
alias: host_sex
domain_of:
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
range: string
Aha, I didn't see that indeed! I expected a dedicated enum object like in the animal one...
But yes, exactly. That's what the aDNA community will eventually propose
Anywhere one sees a string_serialization in the schema, it's just because I didn't get around to doing the conversion to an enumeration. In this case it's because there was a time when permissible values containing punctuation weren't serialized in OWL very well. I think that's a thing of the past now.
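For anyone who wants to help with that conversion, here is a minimal sketch of how the host_sex string_serialization above could be split into permissible values. The function name and the enum name `HostSexEnum` are hypothetical, and the resulting dict only mirrors the general shape of a LinkML enum; it is not how the MIxS tooling actually does it.

```python
import re


def serialization_to_permissible_values(serialization: str) -> list[str]:
    """Split a '[a|b|c]' style string_serialization into clean values."""
    # Collapse the whitespace that YAML line folding leaves behind.
    flat = re.sub(r"\s+", " ", serialization).strip()
    # Drop the enclosing square brackets, if present.
    if flat.startswith("[") and flat.endswith("]"):
        flat = flat[1:-1]
    # Split on the pipe separator and discard empty fragments.
    return [part.strip() for part in flat.split("|") if part.strip()]


# The host_sex value as it reads after YAML folding (copied from the blob above).
HOST_SEX = (
    "[female|hermaphrodite|non-binary|male|transgender|transgender "
    "(female to male)|transgender (male to female) |undeclared]"
)

values = serialization_to_permissible_values(HOST_SEX)

# Hypothetical enum name; an empty dict stands in for per-value metadata.
host_sex_enum = {"HostSexEnum": {"permissible_values": {v: {} for v in values}}}

for value in values:
    print(value)
```

Values containing punctuation, such as the parenthesised transgender options, survive the split unchanged, so they would still need to serialize cleanly downstream.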
There are also some terms with string_serialization annotations because one or more of the permissible values contains something that looks like a variable, like 'steak|burrito|soup|pizza (N pieces)'. I have no idea how to get those out of string_serialization hell while preserving the implied expectation of expressiveness.
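One possible direction, offered only as a sketch: treat the variable-looking token as a placeholder and turn the whole serialization into a regular expression, which could then back a pattern-style constraint instead of a plain enumeration. The function name is hypothetical, and reading `N` as "one or more digits" is my assumption about what the placeholder means.

```python
import re


def serialization_to_regex(serialization: str, placeholder: str = "N") -> re.Pattern:
    """Build a regex accepting each option, with the placeholder as digits."""
    options = serialization.strip().strip("[]").split("|")
    alternatives = []
    for option in options:
        # Split on the standalone placeholder token (assumed to mean a number).
        literal_parts = re.split(rf"\b{re.escape(placeholder)}\b", option.strip())
        # Escape the literal text and rejoin with a digit pattern.
        alternatives.append(r"\d+".join(re.escape(p) for p in literal_parts))
    return re.compile("(?:" + "|".join(alternatives) + ")")


FOOD = "steak|burrito|soup|pizza (N pieces)"
pattern = serialization_to_regex(FOOD)

for candidate in ["steak", "pizza (3 pieces)", "pizza (N pieces)", "pizza"]:
    print(candidate, bool(pattern.fullmatch(candidate)))
```

This keeps the fixed options strict while still allowing the free numeric part, but it would misfire on any option where a standalone `N` is meant literally, so each affected term would need a manual check.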
I really appreciate it when you guys find bugs, inconsistencies etc. in the MIxS schema. I could use some help planning and applying the fixes.
Gonna make a separate issue for this so we don't derail the topic!
I would like to start a discussion on the host sex term. As mentioned in slack
This started a discussion on slack (@mslarae13, @only1chunts, @jfy133, @Woolly-at-EBI); feel free to post your comments if you would like to.
In summary, there is a term for animal sex as part of the https://genomicsstandardsconsortium.github.io/mixs/0016019/ package. There is a term for host sex in the human / host-associated packages (we actually use host-associated mostly for non-human studies, but I guess we are wrong here?).
Maybe a silly question, but why make the distinction? I know there are some transgender terms, but I think it might be fine if we merge this with all the possible sex-related terms? Animal (human and the rest) / plant?