GenomicsStandardsConsortium / mixs

"Minimum Information about any (X) Sequence" (MIxS) specification
https://w3id.org/mixs
Creative Commons Zero v1.0 Universal

Human / Host / Animal sex (discussion) #838

Open jjkoehorst opened 4 weeks ago

jjkoehorst commented 4 weeks ago

I would like to start a discussion on the host sex term. As mentioned in Slack:

We are currently working on a pig dataset with a lot of castrated entries. We use the FAIR Data Station (https://fairds.fairbydesign.nl/terms), which takes most of its terms from the ENA checklists, and host sex is the only term available related to the sex of the animal. I see that MIxS has animal sex, but this is not synced with ENA. What would be a good way to continue? Could we expand host_sex in MIxS to include castrated and other terms, or should we try to convince ENA to include animal sex? Note, though, that the host-associated package https://genomicsstandardsconsortium.github.io/mixs/0016002/ focuses on host_sex and not animal sex. Any ideas?

This started a discussion on Slack (@mslarae13, @only1chunts, @jfy133, @Woolly-at-EBI); feel free to post your comments here if you would like to.

In summary, there is a term for animal sex as part of the https://genomicsstandardsconsortium.github.io/mixs/0016019/ package. There is a term for host sex in the human / host-associated packages (we actually use host-associated mostly for non-human studies, but I guess we are wrong here?).

Maybe a silly question, but why make the distinction? I know there are some transgender terms, but I think it might be fine to merge this with all the possible sex-related terms? Animal (human and the rest) / Plant?

jfy133 commented 4 weeks ago

I agree it's not optimal.

The side issue I brought up on slack was that gender != biological/physical sex (this is something that upset a lot of people in the ancient DNA community).

I wonder if this has partly led to the slightly odd distinction between host and animal... The host sex term says it's an enumeration, but there is no enumeration defined: https://genomicsstandardsconsortium.github.io/mixs/0000811/, so it's hard for me to see, but I wonder if self-defined gender options are allowed there, whereas the animal sex has a strict enumeration. However, it looks rather mammalian-focused?

It's a good question, but I imagine it may be tricky to reach a consensus, particularly when trying to unify them all.

jjkoehorst commented 4 weeks ago

If you unfold the LinkML blob at the end, it might give some clarification.

But then there should be a distinction between biological sex and gender, and that particular issue could be solved that way?

name: host_sex
annotations:
  Expected_value:
    tag: Expected_value
    value: enumeration
description: Gender or physical sex of the host
title: host sex
comments:
- example of non-binary from Excel sheets does not match any of the enumerated values
from_schema: https://w3id.org/mixs
keywords:
- host
- host.
string_serialization: '[female|hermaphrodite|non-binary|male|transgender|transgender
  (female to male)|transgender (male to female)

  |undeclared]'
slot_uri: MIXS:0000811
alias: host_sex
domain_of:
- HostAssociated
- HumanAssociated
- HumanGut
- HumanOral
- HumanSkin
- HumanVaginal
range: string

jfy133 commented 4 weeks ago

Aha, I didn't see that, indeed! I expected a dedicated enum object like in the animal one...

But yes, exactly. That's what the aDNA community will eventually propose.

turbomam commented 4 weeks ago

Anywhere one sees a string_serialization in the schema, it's just because I didn't get around to doing the conversion to an enumeration. In this case it's because there was a time when permissible values containing punctuation weren't serialized in OWL very well. I think that's a thing of the past now.

There are also some terms with string_serialization annotations because one or more of the permissible values contains something that looks like a variable, like
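
For illustration only, here is a minimal Python sketch of what such a conversion could look like for the host_sex serialization quoted above. It is not the script used for the MIxS schema: the function names and the enum name `HostSexEnum` are hypothetical, and the only LinkML detail assumed is that an enum lists its values under `permissible_values`.

```python
import re
from typing import Dict, List


def parse_string_serialization(serialization: str) -> List[str]:
    """Split a '[a|b|c]' style string_serialization into its values."""
    # Collapse line breaks and repeated whitespace left over from YAML folding.
    flat = re.sub(r"\s+", " ", serialization).strip()
    # Drop the enclosing square brackets, if present.
    if flat.startswith("[") and flat.endswith("]"):
        flat = flat[1:-1]
    # Split on the pipe separator and discard empty fragments.
    return [value.strip() for value in flat.split("|") if value.strip()]


def to_enum_dict(enum_name: str, values: List[str]) -> Dict[str, dict]:
    """Build a dict shaped like a LinkML enum definition (hypothetical name)."""
    return {enum_name: {"permissible_values": {value: {} for value in values}}}


# The serialization as it appears in the host_sex blob, line breaks included.
HOST_SEX = """[female|hermaphrodite|non-binary|male|transgender|transgender
  (female to male)|transgender (male to female)

  |undeclared]"""

host_sex_values = parse_string_serialization(HOST_SEX)
host_sex_enum = to_enum_dict("HostSexEnum", host_sex_values)
```

The resulting dict could be dumped to YAML and pasted under the schema's `enums` section, after which the slot's `range` would point at the enum instead of `string`. Whether "castrated" or other values belong in that list is the open question of this issue, not something the sketch decides.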

'steak|burrito|soup|pizza (N pieces)'

I have no idea how to get those out of string_serialization hell while preserving the implied expectation of expressiveness.
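
One possible direction, sketched here under assumptions rather than proposed as the fix: treat the variable-looking token as a placeholder for digits and compile the whole serialization into a regular expression, which could then be stored in a slot-level `pattern` instead of an enumeration. The function name `serialization_to_regex` and the choice of `N` as the only placeholder are hypothetical.

```python
import re


def serialization_to_regex(serialization: str, placeholder: str = "N") -> "re.Pattern[str]":
    """Turn 'a|b|c (N pieces)' into a regex where N stands for an integer."""
    alternatives = []
    for option in serialization.strip("[]").split("|"):
        option = option.strip()
        # Split on the standalone placeholder token, escape the literal
        # pieces, and rejoin them with a digits-only sub-pattern.
        parts = re.split(r"\b" + re.escape(placeholder) + r"\b", option)
        alternatives.append(r"\d+".join(re.escape(part) for part in parts))
    return re.compile("^(?:" + "|".join(alternatives) + ")$")


food_pattern = serialization_to_regex("steak|burrito|soup|pizza (N pieces)")

# Plain options still match literally; the placeholder option needs a number.
accepts_plain = food_pattern.fullmatch("steak") is not None
accepts_count = food_pattern.fullmatch("pizza (3 pieces)") is not None
rejects_literal_n = food_pattern.fullmatch("pizza (N pieces)") is None
```

This keeps the fixed options strict while leaving the counted option open, but it gives up the per-value metadata an enumeration provides, so it is a trade-off rather than a clean way out.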

turbomam commented 4 weeks ago

I really appreciate it when you guys find bugs, inconsistencies etc. in the MIxS schema. I could use some help planning and applying the fixes.

jfy133 commented 4 weeks ago

> I really appreciate it when you guys find bugs, inconsistencies etc. in the MIxS schema. I could use some help planning and applying the fixes.

Gonna make a separate issue for this so we don't derail the topic!

turbomam commented 4 weeks ago

see also