Closed jagadishcs closed 2 years ago
@cmungall @pbuttigieg @TBKReddy
To create feature/Local environmental context for samples from host organisms:
Would it be a good idea to have the MIxS/EnvO triad for a leaf biosample from a plant, Brachypodium distachyon:
- biome/Broad-scale environmental context: Plant-associated biome (not yet created in EnvO; still to be decided)
- feature/Local environmental context: commelinids (NCBITaxon_4734), the closest lineage level available in EnvO for Brachypodium distachyon
- material/Environmental medium: leaf (PO_0025034)
Feature: the closest lineage of the plant makes sense as a feature/Local environmental context, since it meets the definition of an environmental feature: "environmental features that are in the vicinity of and have a strong causal influence on the entity" (Pier Luigi Buttigieg et al. 2016).
Having the closest taxonomic lineage of the host plant as the local environmental context for a plant leaf biosample would be comparable to using other environmental features, such as a freshwater river, lake, or pond, when the sample is water.
This is just one possibility; I am using this example to work out how best to build the MIxS/EnvO triad for samples from host organisms.
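To make the proposed triad concrete, here is a minimal Python sketch of it as a record. The identifiers are the ones given above; the slot names `env_local_scale` and `env_medium` are the MIxS names for the feature and material slots, and the `Term` class and `slots_without_term` helper are hypothetical names used only for illustration. The biome slot has no identifier because the term is only proposed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    """A label plus its ontology identifier (hypothetical helper type)."""
    label: str
    curie: Optional[str]  # None when no ontology term exists yet


# Values copied from the proposal in this comment.
leaf_triad: Dict[str, Term] = {
    # Proposed term only; it has not been created in EnvO.
    "env_broad_scale": Term("plant-associated biome", None),
    "env_local_scale": Term("commelinids", "NCBITaxon:4734"),
    "env_medium": Term("leaf", "PO:0025034"),
}


def slots_without_term(triad: Dict[str, Term]) -> List[str]:
    """Return the slots whose value has no ontology identifier."""
    return [slot for slot, term in triad.items() if term.curie is None]


print(slots_without_term(leaf_triad))
```

The only slot that cannot be filled from an existing ontology is the broad-scale one, which is the gap this issue is asking about.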
Thanks
@cmungall @wdduncan @pbuttigieg @TBKReddy
Is it possible to discuss and decide about creating the host-associated biomes since this will help us to assign the EnvO terms for about 6K plant-associated biosamples?
The EnvO already has 'environment associated with a plant part or small plant' (ENVO_01001057) and 'environment associated with an animal part or small animal' (ENVO_01001055) under ecosystem. Therefore, I believe, it should be appropriate and useful to create the following terms as biome/broad-scale environmental context in the EnvO:
- Host-associated biome
- Animal-associated biome
- Human-associated biome
- Plant-associated biome
Thank you
Hi @jagadishcs
@cmungall @kaiiam @wdduncan and I discussed this in our monthly ENVO editors' call.
We'll update the MIxS annotation wiki page with some guidance to address your questions. Check in there in a couple of hours.
If we can't answer something there, we'll post here too.
Hi @jagadishcs
The wiki page noted above has been updated with guidance for host-associated microbial samples.
> Host-associated biome, Animal-associated biome, Human-associated biome, Plant-associated biome
We wouldn't really create terms such as these unless we're referring to the entire microbiome of a given organism. The approach suggested in the wiki - and the use of the MIxS host metadata fields for taxonomy - should get you the information such terms would provide (and more).
Thanks @pbuttigieg for your response. @cmungall @TBKReddy
I find it difficult to be convinced by the rule for 'env_broad_scale' for host-associated biosamples, which is now given as "entries should reflect the ecosystem the host is found in (e.g. an urban biome [ENVO:01000249] or a tundra biome [ENVO:01000180])".
Let me explain the reason with an example: if a human gut is the biosample from an individual living in an urban area, then assigning the biosample 'urban biome', versus assigning 'village biome' when the individual is from a village, as its env_broad_scale does not add useful value; the urban biome for a gut biosample (or for a leaf biosample taken from a tree in an urban area) does not meet the definition of biome (Pier Luigi Buttigieg et al., 2013); this issue applies to any plant or animal.
Therefore, creating a few 'host-associated' terms as env_broad_scale/biomes in EnvO for biosamples that originate from host organisms would be useful.
The following suggested terms will not compete with the MIxS host metadata fields for taxonomy, but they meet the EnvO definition of biome and the MIxS definition of broad-scale environmental context: Host-associated biome, Animal-associated biome, Human-associated biome, Plant-associated biome.
> [...] if a human gut is the biosample from an individual living in an urban area, then assigning the biosample 'urban biome', versus assigning 'village biome' when the individual is from a village, as its env_broad_scale does not add useful value;
For clarity/precision, the human gut is not the sample: a portion of [tissue,mucus,...] from the human gut is the sample.
The broad scale environment of the host adds context on what that host is likely to be exposed to, which will affect its various microbiomes. The skin and gut microbiome(s) of an organism living in the desert will vary from that of an organism from the same taxon living in, e.g. a forest.
Humans are a bit of a special case as the built environment changes a lot of things, but the density of settlements (and the implicit services available) can help an initial search (e.g. to compare the microbiomes on the hands of urban vs village dwellers). There are of course more precise metadata that should be used too (diet profiles, etc.), but this is about the right level for the env_broad_scale field.
> the urban biome for a gut biosample (or for a leaf biosample taken from a tree from an urban area) does not meet the definition of biome (Pier Luigi Buttigieg et al., 2013); this issue is applicable to any plants or animals.
I can't really follow the argument above.
> Therefore, creation of a few 'host-associated' terms as env_broad_scale/biomes in the EnvO for biosamples that are originated from host organisms would be useful.
I'm not sure this is true - what more do they bring relative to annotating the anatomical site + using the MIxS taxon/host fields? Do you have an example?
> The following suggested terms will not compete with MIxS host metadata fields for taxonomy but meets the EnvO definition for biome and the MIxS definition for broad-scale environmental context: Host-associated biome, Animal-associated biome, Human-associated biome, Plant-associated biome
Hmm, I don't really agree. I see these in direct competition with/redundant with the taxon and host information in MIxS and I can't really see what more they bring.
I'm not sure what you mean by compliance with the ENVO definition of biome here. The microbiome is also embedded in the biome the host is embedded in. It's an order removed perhaps, but still contextually accurate.
Just a suggestion/observation:
Perhaps there is some confusion about the intent of the env_broad/local/medium terms.
@wdduncan quite likely - do you have a suggestion on how we can resolve it or where the core of the confusion is?
Out of interest, I looked at the top values used in env_broad_scale for the host-associated package in INSDC via NCBI BioSample. The complete list is in https://github.com/INCATools/biosample-analysis/commit/ab953a44083d18c91465867f5aaa819034ea4948
These are the top N. As can be seen it's pretty ad-hoc and all over the place!
count | value |
---|---|
513052 | |
19529 | not applicable |
19250 | urban biome |
5693 | coral reef |
5237 | gut |
4347 | missing |
4172 | marine biome |
3090 | lower digestive tract |
3088 | host-associated |
2718 | dense settlement biome |
2639 | mouse |
2623 | chicken intestine |
2543 | marine benthic biome |
2505 | mouse gut |
2439 | anthropogenic terrestrial biome |
2435 | not collected |
2341 | temperate biome |
2304 | farm |
2249 | NA |
2127 | intestine environment (ENVO:2100002) |
2119 | freshwater biome |
2077 | Mouse gut |
2054 | forest |
1999 | large river biome |
1647 | Gut |
1634 | Bos taurus taurus rumen microbiome |
1506 | host-associated habitat |
1490 | ENVO:01000219 |
1476 | fecal material |
1461 | terrestrial biome ENVO:00000446 |
1432 | gut microbiome |
1416 | anatomical entity environment |
1383 | research facility |
1316 | shrubland biome |
1296 | temperate forest biome |
1285 | mammalia-associated habitat |
1221 | Gut microbiome |
1172 | ocean biome |
1140 | ENVO:animal-associated habitat |
1122 | ENVO:00009002 |
1102 | feces |
1091 | grassland biome |
1066 | savana |
1065 | Human-associated habitat |
1051 | animal-associated environment |
1030 | animal cage ENVO:01000922 |
1030 | estuarine biome |
1012 | Rumen microbiome |
959 | terrestrial biome |
952 | Laboratory |
931 | N/A |
917 | subpolar coniferous forest biome |
899 | feces metagenome |
886 | intestine environment |
841 | animal distal gut |
830 | rangeland biome |
824 | chicken gut |
803 | mouse gut microbiome |
763 | Forest |
757 | tropical grassland biome |
746 | terresterial biome |
727 | stream biome |
725 | laboratory |
724 | fish gut biome |
720 | animal-associated environment [ENVO:01001002] |
644 | [ENVO:01000049] |
633 | Feces |
When we further filter for human (9606) as host:
count | value |
---|---|
1118 | dense settlement biome |
1074 | ENVO:01000219 |
863 | urban biome |
114 | Human-associated habitat |
44 | not applicable |
43 | host-associated |
29 | airways |
12 | village biome |
8 | N/A |
5 | Wound |
3 | |
2 | temperate |
1 | anatomical entity environment |
1 | Human periodontal pocket |
1 | Metagenomic RNA-seq |
1 | Human eye |
1 | Human cerebrospinal fluid |
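Many rows in the first table differ only in letter case, in an appended ENVO identifier, or in the spelling of "no value". A rough Python sketch of how such variants could be folded together before counting is below. The sample rows are copied from the table; the `normalize` function, the placeholder list, and the 7-to-8-digit identifier pattern are assumptions made for illustration and are not part of the linked analysis.

```python
import re
from collections import Counter

# Assumed pattern: an ENVO identifier with 7 or 8 digits, optionally
# wrapped in brackets or parentheses, as seen in the table above.
ENVO_ID = re.compile(r"[\[(]?\s*(?P<id>ENVO[:_]\d{7,8})\s*[\])]?")

# Assumed set of spellings that all mean "no value was given".
PLACEHOLDERS = {"", "na", "n/a", "missing", "not applicable", "not collected"}


def normalize(value: str) -> str:
    """Fold case, drop appended ENVO ids, and merge placeholder spellings."""
    match = ENVO_ID.search(value)
    label = ENVO_ID.sub(" ", value)
    label = re.sub(r"\s+", " ", label).strip().lower()
    if not label and match:
        return match.group("id")  # the value was a bare identifier; keep it
    return "(no value)" if label in PLACEHOLDERS else label


# A few (count, value) rows copied from the host-associated table.
rows = [
    (2505, "mouse gut"), (2077, "Mouse gut"),
    (5237, "gut"), (1647, "Gut"),
    (2127, "intestine environment (ENVO:2100002)"),
    (886, "intestine environment"),
    (19529, "not applicable"), (4347, "missing"),
    (2249, "NA"), (931, "N/A"),
    (1490, "ENVO:01000219"),
]

merged = Counter()
for count, value in rows:
    merged[normalize(value)] += count

for value, count in merged.most_common():
    print(f"{count:>6} | {value}")
```

This only merges surface variants. It does not decide whether "gut" or "mouse" is an appropriate env_broad_scale value in the first place.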
Thanks @cmungall - this shows the need for better documentation and outreach from the GSC and us on better annotations.
@ramonawalls for the GSC CIG - this shows the need for validation: so many of these have nothing to do with ENVO or any other controlled vocabulary. The INSDC has never invested in validation, and this is the mess we get in return. Systems and brokerages like GFBIO help a great deal here @ikostadi
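As an illustration of the simplest kind of validation meant here, the Python sketch below flags values that carry no well-formed ENVO identifier at all. The function name `has_envo_id` and the 7-to-8-digit pattern are assumptions for this example. A real validator would also need to confirm that the identifier exists in ENVO and that the term is appropriate for the field, which a syntax check cannot do.

```python
import re

# Assumed shape of an ENVO CURIE: the prefix followed by 7 or 8 digits.
ENVO_CURIE = re.compile(r"ENVO:\d{7,8}")


def has_envo_id(value: str) -> bool:
    """True if the value contains something shaped like an ENVO CURIE."""
    return ENVO_CURIE.search(value) is not None


# Values taken from this thread.
examples = [
    "urban biome [ENVO:01000249]",
    "mouse gut",
    "Metagenomic RNA-seq",
    "ENVO:animal-associated habitat",
]

flagged = [value for value in examples if not has_envo_id(value)]
print(flagged)
```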
@cmungall @pbuttigieg @TBKReddy
1) What is the biome/broad-scale environmental context in EnvO for a sample (environmental medium) that comes from a host organism? When a biosample is from a host (plant, animal, or human), can we have a new biome in EnvO, either a single host-associated biome or separate plant-, animal-, and human-associated biomes? My concern is that if the material is host-derived, then for the microbial community the biome is the host organism. Therefore, the following broad-scale environmental context EnvO terms are required for samples that come from a host:
Host-associated biome, Animal-associated biome, Human-associated biome, Plant-associated biome (please refer to the third point for alternatives)
2) What will be the feature/local environmental context in EnvO for a sample that comes from a host organism? Let me give a simple example to contextualize the question: if the material is water, then the feature/local environmental context can be a freshwater river, lake, pond, etc. But if the biosample is from a host organism, say a leaf, what would be the appropriate feature/local environmental context EnvO term?
3) Alternatives: when the biosample is from any host organism, we may also consider having the biome as terrestrial or aquatic for the applicable plants and animals, and terrestrial biome for humans (basically to indicate where the host organism comes from), with the feature being plant-, animal-, human-, or host-associated.
I prefer the first option: adding the 'host-associated biome' to EnvO for biosamples that come from a host organism but contain microbial communities.