GenomicsStandardsConsortium / mixs

"Minimum Information about any (X) Sequence" (MIxS) specification
https://w3id.org/mixs
Creative Commons Zero v1.0 Universal

Ethnicity #59

Closed only1chunts closed 3 years ago

only1chunts commented 4 years ago

We have an item called "IHMC ethnicity" (MIXS:0000895):

Definition = Ethnicity of the subject
Syntax = IHMC code or free text

I've been searching the web for an "IHMC" list of codes but cannot find one. If we expect people to use a specific controlled vocabulary, we need to provide a way to find it.

While looking at this term, do we want to rethink the preferred CV we suggest? Is there one in an active ontology that would be more accessible? Maybe the subset of "Race" within NCIT: http://purl.obolibrary.org/obo/NCIT_C17049 ?

lschriml commented 4 years ago

Tracking down the origin of IHMC ethnicity:
International Human Microbiome Consortium (IHMC): https://www.hmpdacc.org/ihmp/
HMP catalog: https://www.hmpdacc.org/hmp/catalog/

-- I could not find the IHMC list of ethnic groups.

I agree we should update these fields: - and to include a world perspective

Noting here a few options to explore. -- Would be useful to define ethnicity and race in MIxS.

OLS ontologies that include ethnic groups/race:

Ethnicity (https://en.wikipedia.org/wiki/Ethnic_group): An ethnic group or ethnicity is a category of people who identify with each other, usually on the basis of presumed similarities such as a common language, ancestry, history, society, culture, nation or social treatment within their residing area.

Race (https://en.wikipedia.org/wiki/Race_(human_categorization)): A race is a grouping of humans based on shared physical or social qualities into categories generally viewed as distinct by society.

List of ethnic groups - UK: https://www.ethnicity-facts-figures.service.gov.uk/style-guide/ethnic-groups

White
- English / Welsh / Scottish / Northern Irish / British
- Irish
- Gypsy or Irish Traveller
- Any other White background

Mixed / Multiple ethnic groups
- White and Black Caribbean
- White and Black African
- White and Asian
- Any other Mixed / Multiple ethnic background

Asian / Asian British
- Indian
- Pakistani
- Bangladeshi
- Chinese
- Any other Asian background

Black / African / Caribbean / Black British
- African
- Caribbean
- Any other Black / African / Caribbean background

Other ethnic group
- Arab
- Any other ethnic group

In Wales, the first option in the White broad category is changed so that Welsh appears first in the list, followed by English, Scottish, Northern Irish and British.

Definitions for racial and ethnic categories, from NIH: https://grants.nih.gov/grants/guide/notice-files/not-od-15-089.html

The Revisions to OMB Directive 15 define each racial and ethnic category as follows:

American Indian or Alaska Native. A person having origins in any of the original peoples of North and South America (including Central America), and who maintains tribal affiliation or community attachment.

Asian. A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.

Black or African American. A person having origins in any of the black racial groups of Africa. Terms such as "Haitian" or "Negro" can be used in addition to "Black or African American."

Hispanic or Latino. A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race. The term, "Spanish origin," can be used in addition to "Hispanic or Latino."

Native Hawaiian or Other Pacific Islander. A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.

White. A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.

Notes: I looked at the first MIxS paper; in the supplementary file, we had IHMC medication code and IHMC ethnicity.

A Google search for IHMC medication code turned up these links, one of them the TDWG page for MIxS:
https://terms.tdwg.org/wiki/mixs:ihmc_medication_code
https://terminology-sandbox.biowikifarm.net/wiki/mixs:ihmc_medication_code
https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/

BioPortal, MIXS controlled vocabulary - published in 2012, has OBO foundry purls http://bioportal.bioontology.org/ontologies/MIXSCV/?p=summary

-- Noting these here, as we could provide these links, and they may need updating after versions are released.

cmungall commented 4 years ago

Have you looked at hANCESTRO?

big problems with the root class name in NCIT

On Fri, Jun 26, 2020, 01:14 Chris Hunter notifications@github.com wrote:

We have an item called "IHMC ethnicity" Definition = Ethnicity of the subject Syntax = IHMC code or free text I've been searching the web for a "IHMC" list of codes, but cannot find it. If we expect people to use a specific controlled vocabulary we need to provide a way to find it. Whilst looking at this term do we want to rethink the preferred CV we suggest? is there one in an active ontology that would be more accessible? Maybe the subset of "Race" within NCIT: http://purl.obolibrary.org/obo/NCIT_C17049 ?

only1chunts commented 4 years ago

@lschriml are you suggesting it might be worthwhile splitting it into TWO fields, one for Ethnicity and one for Race, or just redefining our ethnicity term to cover both aspects? I'm no expert here, but my preference would be to keep one term that covers both. If we split it up, users will make their own interpretations of what each term means and will probably pick one at random, filling in either or both values. With a single term, the historic values could still be included.

@cmungall I've not seen hANCESTRO before, but the "ancestry category HANCESTRO_0004" node looks perfect for use in MIxS, and I think it would cover the entire range of things @lschriml brings up.

Currently the term in question looks like this:

Package item name: IHMC ethnicity
Structured comment name: ihmc_ethnicity
Definition: Ethnicity of the subject
Expected value: IHMC code or free text
Value syntax: {integer}|{text}
Example: caucasian 
Preferred unit: NA
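
To make the weakness of the current syntax concrete, here is a minimal Python sketch of what `{integer}|{text}` can actually check. The function name `classify_ihmc_ethnicity` is hypothetical and not part of any MIxS tooling; the sketch assumes "integer" means a run of ASCII digits and "text" means any other non-empty string.

```python
import re


def classify_ihmc_ethnicity(value: str) -> str:
    """Classify a value against the syntax {integer}|{text}.

    Hypothetical helper for illustration only. Returns "integer",
    "text", or "invalid" (for empty or whitespace-only input).
    """
    stripped = value.strip()
    if not stripped:
        return "invalid"
    # Assumption: an IHMC code would be a plain run of ASCII digits.
    if re.fullmatch(r"[0-9]+", stripped):
        return "integer"
    # Everything else falls through to free text.
    return "text"
```

Under these assumptions any non-empty string passes, so the syntax alone cannot tell a real code from an arbitrary number, or a vocabulary term from a typo.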

How do we go about making changes to item names? Would it be best to just deprecate the above term and create a new one like the one below? How would that affect INSDC and their checklists? @josieburgin (can someone tag Ilene or Anji? I don't know their GitHub names)

Package item name: Ethnicity or Race
Structured comment name: ethnicity_race
Definition: The ethnicity or race of the subject. An ethnic group or ethnicity is a category of people who identify with each other, usually on the basis of presumed similarities such as a common language, ancestry, history, society, culture, nation or social treatment within their residing area. A race is a grouping of humans based on shared physical or social qualities into categories generally viewed as distinct by society. 
Expected value: term from "ancestry category [HANCESTRO_0004](https://www.ebi.ac.uk/ols/ontologies/hancestro/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FHANCESTRO_0004)" 
Value syntax: {termLabel} {[termID]}
Example: African American [HANCESTRO:0568]
Preferred unit: NA
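
As a rough illustration of how the proposed value syntax could be checked, the sketch below parses a value of the form `termLabel [termID]`. The pattern and the name `parse_term_value` are assumptions made for this example; in particular, it assumes a term ID is a prefix, a colon, and digits (as in the example value above), and it does not check that the ID exists in HANCESTRO.

```python
import re
from typing import Optional, Tuple

# Assumed shape: "<label> [<PREFIX>:<digits>]", e.g. "African American [HANCESTRO:0568]".
TERM_PATTERN = re.compile(
    r"^(?P<label>\S.*?)\s+\[(?P<term_id>[A-Za-z][A-Za-z0-9_]*:\d+)\]$"
)


def parse_term_value(value: str) -> Optional[Tuple[str, str]]:
    """Split a value into (label, term ID), or return None if it does not match.

    Hypothetical helper: it checks the shape only, not ontology membership.
    """
    match = TERM_PATTERN.match(value.strip())
    if match is None:
        return None
    return match.group("label"), match.group("term_id")
```

A free-text value such as the current example "caucasian" would be rejected by this check, which is the practical difference from the existing `{integer}|{text}` syntax.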

lschriml commented 3 years ago

Agreed, we need to update this term (ihmc_ethnicity).

Discussion for CIG:
(1) rename this term to 'ethnicity' OR 'ethnicity and race'
(2) do we want to add a new term for 'race'?

Looking in BioSample, I see the field as: 'ethnicity'

Cheers, Lynn

only1chunts commented 3 years ago

added to March 22nd CIG agenda.

anjijohnston commented 3 years ago

It looks like BioSample also includes "race" but currently there is no description. So it would not be a problem to have two terms: ethnicity and race

josieburgin commented 3 years ago

Hi all, I have raised this at the ENA and our concern is that, within Europe, 'race' is not a term often used and can be perceived as problematic:

https://en.wikipedia.org/wiki/Race_(human_categorization)#European_Union

Generally, the use of 'ethnicity' is more common. So within ENA, we would not plan to implement a term for 'race'. As MIxS is used globally, we would strongly discourage its use in the standards.

lschriml commented 3 years ago

Action Items:

josieburgin commented 3 years ago

Hi all, we checked with EMBL-EBI's EGA database (who work with personally identifiable nucleotide data). They say that 'ethnicity' is not included in their sample checklists, so they don't have validation or a definition for it. If a submitter wants to provide it, though, they recommend using a custom term called 'ethnicity', which is consistent with what we've proposed here.

lschriml commented 3 years ago

Proposed update:

From: "IHMC ethnicity"
Definition = Ethnicity of the subject
Syntax = IHMC code or free text

To: Ethnicity (https://en.wikipedia.org/wiki/Ethnic_group)
Definition = A category of people who identify with each other, usually on the basis of presumed similarities such as a common language, ancestry, history, society, culture, nation or social treatment within their residing area.

https://en.wikipedia.org/wiki/List_of_contemporary_ethnic_groups

lschriml commented 3 years ago

hANCESTRO ontology: does not have a term for 'ethnicity'. The top node is "ancestry category" (http://purl.obolibrary.org/obo/HANCESTRO_0004): Population category defined using ancestry informative markers (AIMs) based on genetic/genomic data.
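
The thread refers to the same terms both as PURLs (e.g. `http://purl.obolibrary.org/obo/HANCESTRO_0004`) and in the compact `PREFIX:ID` form (e.g. `HANCESTRO:0568`). A minimal sketch of converting between the two follows, assuming the usual OBO PURL layout of base URL plus `PREFIX_ID`. The function names are hypothetical and this is not taken from any MIxS or OLS library.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie: str) -> str:
    """Turn 'HANCESTRO:0004' into its OBO PURL (illustrative helper)."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:ID value: {curie!r}")
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def purl_to_curie(purl: str) -> str:
    """Turn an OBO PURL back into 'PREFIX:ID' (illustrative helper)."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    # Split on the last underscore so prefixes containing "_" survive.
    prefix, sep, local_id = purl[len(OBO_PURL_BASE):].rpartition("_")
    if not sep or not prefix or not local_id:
        raise ValueError(f"no PREFIX_ID part in: {purl!r}")
    return f"{prefix}:{local_id}"
```

For example, `purl_to_curie` applied to the NCIT link from the first comment gives `NCIT:C17049`.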

lschriml commented 3 years ago

no comments, will implement. Lynn

only1chunts commented 3 years ago

Term name - Ethnicity
Structured comment name - Ethnicity
Definition - A category of people who identify with each other, usually on the basis of presumed similarities such as a common language, ancestry, history, society, culture, nation or social treatment within their residing area. https://en.wikipedia.org/wiki/List_of_contemporary_ethnic_groups
Expected value - text (recommend a term from the Wikipedia list)
Value syntax - {text}

lschriml commented 3 years ago

Updated for all packages, checked 'ethnicity' is not in any of the checklists. Closing the ticket. Cheers, Lynn