microbiomedata / nmdc-ontology

Creative Commons Zero v1.0 Universal

prepare changesheets for problematic triads in MongoDB #66

Open turbomam opened 4 months ago

turbomam commented 4 months ago

Possibly a duplicate of an issue in another repo?

problematic_env_broad_scale_term_ids:
- label: meadow ecosystem
  term: ENVO:00000108
- label: agriculture
  term: ENVO:01001442
problematic_env_medium_rm_ids:
- label: null
  term: FMA:14541
- label: plant-associated environment
  term: ENVO:01001001

Leave as is

env_broad_scale_term_ids:
- label: animal-associated environment
  term: ENVO:01001002
env_medium_rm_ids:
- label: portion of plant tissue
  term: PO:0009007
- label: leaf
  term: PO:0025034

turbomam commented 4 months ago
# Find every subject (?bs) linked, through any predicate, to a node whose nmdc:term is FMA:14541.
PREFIX FMA: <http://purl.obolibrary.org/obo/FMA_>
PREFIX nmdc: <https://w3id.org/nmdc/>
select distinct ?bs
where {
    ?bs ?p ?node .
    ?node nmdc:term FMA:14541 .
}

http://3.236.215.220/sparql?name=&infer=true&sameAs=true&query=PREFIX%20nmdc%3A%20%3Chttps%3A%2F%2Fw3id.org%2Fnmdc%2F%3E%0Adescribe%20nmdc%3Absm-11-05fh0s26

https://api.microbiomedata.org/nmdcschema/ids/nmdc%3Absm-11-05fh0s26
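
The two links above can be rebuilt for any Biosample id by percent-encoding the query or the CURIE. This is a minimal sketch, not part of any NMDC tooling: the function names `sparql_url` and `api_url` are hypothetical, and the endpoint address and the `name`, `infer` and `sameAs` parameters are simply copied from the links above.

```python
from urllib.parse import quote

# Both base addresses are copied from the links in this comment.
SPARQL_ENDPOINT = "http://3.236.215.220/sparql"
API_IDS_BASE = "https://api.microbiomedata.org/nmdcschema/ids/"


def sparql_url(query: str, endpoint: str = SPARQL_ENDPOINT) -> str:
    """Return a GET URL carrying the percent-encoded SPARQL query."""
    # safe="" makes quote() also encode "/" and ":", as in the link above.
    encoded = quote(query, safe="")
    return f"{endpoint}?name=&infer=true&sameAs=true&query={encoded}"


def api_url(curie: str) -> str:
    """Return the API lookup URL for a CURIE such as nmdc:bsm-11-05fh0s26."""
    return API_IDS_BASE + quote(curie, safe="")


describe_query = (
    "PREFIX nmdc: <https://w3id.org/nmdc/>\n"
    "describe nmdc:bsm-11-05fh0s26"
)
print(sparql_url(describe_query))
print(api_url("nmdc:bsm-11-05fh0s26"))
```

Building the URLs in code avoids hand-encoding mistakes when checking the other Biosamples returned by the query.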

turbomam commented 4 months ago
# List the distinct predicates (slots) through which FMA:14541 is asserted.
PREFIX FMA: <http://purl.obolibrary.org/obo/FMA_>
PREFIX nmdc: <https://w3id.org/nmdc/>
select distinct ?p
where {
    ?bs ?p ?node .
    ?node nmdc:term FMA:14541 .
}
turbomam commented 4 months ago
{
  "id": "nmdc:bsm-11-05fh0s26",
  "name": "Mouse - antibiotic treatment, 5-ASA treatment, C17 treatment (HS177_41)",
  "part_of": [
    "nmdc:sty-11-hdd4bf83"
  ],
  "env_broad_scale": {
    "has_raw_value": "Animal-associated environment [ENVO:01001002]",
    "term": {
      "id": "ENVO:01001002",
      "name": "Animal-associated environment"
    }
  },
  "env_local_scale": {
    "has_raw_value": "mouse cecal contents [FMA:14541]",
    "term": {
      "id": "FMA:14541",
      "name": "mouse cecal contents"
    }
  },
  "env_medium": {
    "has_raw_value": "mouse cecal contents [FMA:14541]",
    "term": {
      "id": "FMA:14541",
      "name": "mouse cecal contents"
    }
  },
  "samp_name": "Mouse - antibiotic treatment, 5-ASA treatment, C17 treatment (HS177_41)",
  "collection_date": {
    "has_raw_value": "2021-11-05"
  },
  "depth": {
    "has_raw_value": "0",
    "has_numeric_value": 0
  },
  "elev": 23.1648,
  "env_package": {
    "has_raw_value": "Host-associated"
  },
  "experimental_factor": {
    "has_raw_value": "antibiotic treatment, 5-ASA treatment, C17 treatment"
  },
  "geo_loc_name": {
    "has_raw_value": "USA: California, Davis"
  },
  "gravidity": {
    "has_raw_value": "no"
  },
  "host_age": {
    "has_raw_value": "64 days",
    "has_numeric_value": 64,
    "has_unit": "days"
  },
  "host_body_habitat": {
    "has_raw_value": "gastrointestinal tract"
  },
  "host_body_product": {
    "has_raw_value": "cecal contents [FMA:14541]",
    "term": {
      "id": "FMA:14541",
      "name": "cecal contents"
    }
  },
  "host_body_site": {
    "has_raw_value": "cecum [UBERON:0001153]",
    "term": {
      "id": "UBERON:0001153",
      "name": "cecum"
    }
  },
  "host_common_name": {
    "has_raw_value": "Mouse"
  },
  "host_diet": [
    {
      "has_raw_value": "Teklad Diet #TD06415 high fat"
    }
  ],
  "host_genotype": {
    "has_raw_value": "Swiss Webster"
  },
  "host_life_stage": {
    "has_raw_value": "adult"
  },
  "host_sex": "female",
  "lat_lon": {
    "has_raw_value": "38.5382 -121.7617",
    "latitude": 38.5382,
    "longitude": -121.7617
  },
  "perturbation": [
    {
      "has_raw_value": "antibiotic-treated FMT, C17"
    }
  ],
  "source_mat_id": {
    "has_raw_value": "UUID:047aacd5-857f-3e9e-b782-e70a0fc5a0ca"
  },
  "analysis_type": [
    "metagenomics"
  ]
}
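
The same question the SPARQL query asks ("which slots assert FMA:14541?") can be answered directly on a Biosample document like the one above. This is a rough sketch only: `slots_using_term` is a hypothetical helper, and the `biosample` dict is an abridged copy of the JSON above rather than a full record.

```python
from typing import Any


def slots_using_term(doc: dict[str, Any], curie: str) -> list[str]:
    """Return the top-level slot names whose nested term.id equals curie."""
    hits = []
    for slot, value in doc.items():
        # A slot may hold one object or a list of objects (e.g. host_diet).
        candidates = value if isinstance(value, list) else [value]
        for item in candidates:
            term = item.get("term") if isinstance(item, dict) else None
            if isinstance(term, dict) and term.get("id") == curie:
                hits.append(slot)
                break
    return hits


# Abridged copy of the Biosample document shown above.
biosample = {
    "id": "nmdc:bsm-11-05fh0s26",
    "env_broad_scale": {"term": {"id": "ENVO:01001002"}},
    "env_local_scale": {"term": {"id": "FMA:14541"}},
    "env_medium": {"term": {"id": "FMA:14541"}},
    "host_body_product": {"term": {"id": "FMA:14541"}},
    "host_body_site": {"term": {"id": "UBERON:0001153"}},
    "host_diet": [{"has_raw_value": "Teklad Diet #TD06415 high fat"}],
    "host_sex": "female",
}

print(slots_using_term(biosample, "FMA:14541"))
# ['env_local_scale', 'env_medium', 'host_body_product']
```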
turbomam commented 4 months ago

Add a taxon/species/strain identifier?

turbomam commented 4 months ago
  "host_body_site": {
    "has_raw_value": "cecum [UBERON:0001153]",
    "term": {
      "id": "UBERON:0001153",
      "name": "cecum"
    }
  }

Fix spelling, at least in name

turbomam commented 4 months ago

list of term usage

PREFIX nmdc: <https://w3id.org/nmdc/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?p ?term ?label ?name (count(?s1) as ?count)
where {
    graph <https://api.microbiomedata.org> {
        ?s1 ?p ?s2 .
        ?s2 nmdc:term ?term .
        optional {
            ?term nmdc:name ?name 
        }
    }
    graph <http://purl.obolibrary.org/obo/nmdco.owl> {
        optional {
            ?term rdfs:label ?label 
        }
    }
}
group by ?p ?term ?label ?name
order by ?p ?term ?label ?name
turbomam commented 4 months ago
| term | label | name | count |
| --- | --- | --- | --- |
| CHEBI:00000 | | cytopoint | 6 |
| CHEBI:00000 | | maropitant | 6 |
| CHEBI:15414 | | S-adenosyl-L-methionine | 4 |
| CHEBI:176843 | | vitamin B12 | 2 |
| CHEBI:176915 | | clindamycin | 3 |
| CHEBI:2676 | | clavamox | 5 |
| CHEBI:28177 | | theophylline | 2 |
| CHEBI:28971 | | ampicillin | 1 |
| CHEBI:3011 | | benazapril | 2 |
| CHEBI:32003 | | pimobendan | 2 |
| CHEBI:3504 | | simplicef | 1 |
| CHEBI:3534 | | cephalexin | 1 |
| CHEBI:35720 | | enrofloxacin | 2 |
| CHEBI:364453 | | carprofen | 4 |
| CHEBI:4031 | | cyclosporine | 2 |
| CHEBI:42797 | | gabapentin | 2 |
| CHEBI:50270 | | pantoprazole | 2 |
| CHEBI:50845 | | doxycycline | 1 |
| CHEBI:6437 | | levetiracetam | 2 |
| CHEBI:64482 | | phosphatidylcholine | 4 |
| CHEBI:6775 | | 5-ASA | 8 |
| CHEBI:6909 | | metronidazole | 1 |
| CHEBI:7772 | | omperazole | 2 |
| CHEBI:7773 | | ondansetron | 6 |
| CHEBI:7959 | | d-penicillamine | 2 |
| CHEBI:8378 | | prednisolone | 4 |
| CHEBI:8382 | | prednisone | 6 |
| CHEBI:9144 | | silibinin | 4 |
| CHEBI:9241 | | spironolactone | 2 |
| CHEBI:9321 | | sulbactam | 1 |
| CHEBI:9654 | | trazadone | 2 |
| CHEBI:9907 | | ursodiol | 4 |
| ENVO:00000011 | garden | garden | 113 |
| ENVO:00000011 | garden | garden | 113 |
| ENVO:00000021 | freshwater lake | freshwater lake | 93 |
| ENVO:00000022 | river | river | 452 |
| ENVO:00000044 | peatland | | 83 |
| ENVO:00000083 | hill | hill | 4 |
| ENVO:00000086 | plain | plain | 4 |
| ENVO:00000094 | volcanic feature | | 4 |
| ENVO:00000100 | valley | valley | 12 |
| ENVO:00000108 | meadow ecosystem | | 53 |
| ENVO:00000114 | agricultural field | agricultural field | 12 |
| ENVO:00000119 | planted forest | | 4 |
| ENVO:00000120 | oil palm plantation | | 6 |
| ENVO:00000128 | dry valley | dry valley | 2 |
| ENVO:00000148 | riffle | riffle | 243 |
| ENVO:00000161 | banana plantation | | 3 |
| ENVO:00000170 | dune | | 1 |
| ENVO:00000182 | plateau | plateau | 4 |
| ENVO:00000291 | drainage basin | | 10 |
| ENVO:00000292 | watershed | | 53 |
| ENVO:00000305 | peninsula | peninsula | 4 |
| ENVO:00000305 | peninsula | peninsula | 4 |
| ENVO:00000376 | biosphere reserve | | 60 |
| ENVO:00000384 | river bed | | 33 |
| ENVO:00000444 | woodland clearing | woodland clearing | 4 |
| ENVO:00000446 | terrestrial biome | terrestrial biome | 5606 |
| ENVO:00000516 | hummock | hummock | 38 |
| ENVO:00000548 | gravel field | | 1 |
| ENVO:00000873 | freshwater biome | | 38 |
| ENVO:00001998 | soil | soil | 5294 |
| ENVO:00002000 | slope | slope | 14 |
| ENVO:00002003 | fecal material | fecal material | 151 |
| ENVO:00002003 | fecal material | feces material | 151 |
| ENVO:00002007 | sediment | sediment | 165 |
| ENVO:00002011 | fresh water | | 1 |
| ENVO:00002042 | surface water | surface water | 287 |
| ENVO:00002130 | hypolimnion | hypolimnion | 4 |
| ENVO:00002131 | epilimnion | epilimnion | 398 |
| ENVO:00002132 | metalimnion | metalimnion | 3 |
| ENVO:00002194 | oil field production water | | 17 |
| ENVO:00002204 | anthropogenic contamination feature | | 10 |
| ENVO:00002258 | loam | | 2 |
| ENVO:00002259 | agricultural soil | agricultural soil | 12 |
| ENVO:00002261 | forest soil | forest soil | 22 |
| ENVO:00002261 | forest soil | forest soil | 22 |
| ENVO:00002269 | thermocline | thermocline | 10 |
| ENVO:00003074 | manufactured product | | 1 |
| ENVO:00003082 | enriched soil | | 2 |
| ENVO:00005741 | alpine soil | | 6 |
| ENVO:00005750 | grassland soil | grassland soil | 64 |
| ENVO:00005760 | burned soil | burned soil | 2 |
| ENVO:00005761 | meadow soil | meadow soil | 10 |
| ENVO:00005773 | pasture soil | pasture soil | 2 |
| ENVO:00005774 | peat soil | | 118 |
| ENVO:00005778 | tropical soil | | 105 |
| ENVO:00005784 | spruce forest soil | | 17 |
| ENVO:00005800 | desert sand | | 1 |
| ENVO:00005801 | rhizosphere | rhizosphere | 118 |
| ENVO:00005802 | bulk soil | bulk soil | 172 |
| ENVO:01000017 | sand | | 52 |
| ENVO:01000018 | gravel | | 1 |
| ENVO:01000174 | forest biome | | 205 |
| ENVO:01000177 | grassland biome | | 60 |
| ENVO:01000179 | desert biome | | 6 |
| ENVO:01000183 | tropical desert biome | | 2 |
| ENVO:01000185 | montane desert biome | | 6 |
| ENVO:01000189 | temperate savanna biome | temperate savanna biome | 4 |
| ENVO:01000211 | temperate coniferous forest biome | | 6 |
| ENVO:01000215 | temperate shrubland biome | temperate shrubland biome | 8 |
| ENVO:01000216 | montane shrubland biome | montane shrubland biome | 6 |
| ENVO:01000219 | anthropogenic terrestrial biome | anthropogenic terrestrial biome | 107 |
| ENVO:01000221 | temperate woodland biome | temperate woodland biome | 29 |
| ENVO:01000221 | temperate woodland biome | temperate woodland biome | 29 |
| ENVO:01000245 | cropland biome | | 9 |
| ENVO:01000247 | rangeland biome | rangeland biome | 8 |
| ENVO:01000249 | urban biome | urban biome | 15 |
| ENVO:01000250 | subpolar coniferous forest biome | | 31 |
| ENVO:01000252 | freshwater lake biome | freshwater lake biome | 560 |
| ENVO:01000253 | freshwater river biome | freshwater river biome | 1040 |
| ENVO:01000297 | freshwater river | freshwater river | 13 |
| ENVO:01000349 | root matter | root matter | 101 |
| ENVO:01000352 | field | field | 136 |
| ENVO:01000355 | vegetation layer | | 18 |
| ENVO:01000409 | freshwater littoral zone | freshwater littoral zone | 34 |
| ENVO:01000599 | river water | river water | 13 |
| ENVO:01000621 | microcosm | | 17 |
| ENVO:01000687 | coast | | 2 |
| ENVO:01000816 | area of deciduous forest | area of deciduous forest | 1069 |
| ENVO:01000843 | area of evergreen forest | area of evergreen forest | 1268 |
| ENVO:01000855 | area of mixed forest | area of mixed forest | 199 |
| ENVO:01000861 | area of dwarf scrub | area of dwarf scrub | 295 |
| ENVO:01000869 | area of scrub | area of scrub | 614 |
| ENVO:01000887 | area of sedge- and forb-dominated herbaceous vegetation | area of sedge- and forb-dominated herbaceous vegetation | 123 |
| ENVO:01000888 | area of gramanoid or herbaceous vegetation | area of gramanoid or herbaceous vegetation | 777 |
| ENVO:01000891 | area of pastureland or hayfields | area of pastureland or hayfields | 169 |
| ENVO:01000892 | area of cropland | area of cropland | 207 |
| ENVO:01000893 | area of woody wetland | area of woody wetland | 245 |
| ENVO:01000894 | area of emergent herbaceous wetland | area of emergent herbaceous wetland | 45 |
| ENVO:01000941 | planetary subsurface zone | | 17 |
| ENVO:01001001 | plant-associated environment | plant-associated biome | 192 |
| ENVO:01001002 | animal-associated environment | Animal-associated environment | 156 |
| ENVO:01001002 | animal-associated environment | animal-associated environment | 156 |
| ENVO:01001057 | environment associated with a plant part or small plant | environment associated with a plant part or small plant | 199 |
| ENVO:01001191 | water surface | water surface | 18 |
| ENVO:01001209 | wetland ecosystem | | 2 |
| ENVO:01001370 | tundra ecosystem | tundra ecosystem | 12 |
| ENVO:01001442 | agriculture | agricultural biome | 384 |
| ENVO:01001442 | agriculture | phyllosphere biome | 384 |
| ENVO:01001616 | bare soil | bare soil | 10 |
| ENVO:01001803 | tropical forest | | 105 |
| ENVO:01001837 | subalpine biome | subalpine biome | 18 |
| ENVO:01001841 | volcanic soil | | 4 |
| ENVO:01001869 | fracking liquid | | 2 |
| ENVO:02000059 | surface soil | | 13 |
| ENVO:02500027 | anthropogenic environmental process | | 2 |
| ENVO:03600094 | stream pool | stream pool | 95 |
| ENVO:03600095 | stream run | stream run | 191 |
| ENVO:03600096 | step pool | step pool | 31 |
| ENVO:03605001 | epilithon | epilithon | 371 |
| ENVO:03605002 | epipelon | epipelon | 27 |
| ENVO:03605003 | epiphyton | epiphyton | 25 |
| ENVO:03605004 | epipsammon | epipsammon | 100 |
| ENVO:03605005 | epixylon | epixylon | 37 |
| ENVO:03605006 | stream water | stream water | 104 |
| ENVO:03605007 | freshwater stream | freshwater stream | 104 |
| ENVO:03605008 | freshwater stream biome | freshwater stream biome | 104 |
| ENVO:04000007 | lake water | lake water | 560 |
| FMA:14541 | | cecal contents | 99 |
| FMA:14541 | | mouse cecal contents | 99 |
| NCBITaxon:1118232 | | | 101 |
| NCBITaxon:1647806 | | | 471 |
| NCBITaxon:1861841 | | | 94 |
| NCBITaxon:256318 | | metagenome | 87 |
| NCBITaxon:2809082 | | | 1 |
| NCBITaxon:3689 | | | 306 |
| NCBITaxon:410658 | | soil metagenome | 363 |
| NCBITaxon:449393 | | | 245 |
| NCBITaxon:556182 | | | 60 |
| NCBITaxon:939928 | | | 108 |
| NCBITaxon:9925 | | | 95 |
| PO:0009007 | portion of plant tissue | | 10 |
| PO:0025025 | root system | | 10 |
| PO:0025034 | leaf | leaf | 98 |
| UBERON:0000059 | large intestine | large intestine | 28 |
| UBERON:0001153 | caecum | cecum | 33 |
| UBERON:0001555 | digestive tract | digestive tract | 95 |
| UBERON:0001988 | feces | Feces | 28 |
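
One way to pull changesheet candidates out of a listing like this is to flag rows where the asserted name disagrees with the ontology label, and rows whose term has no label at all. This is an illustrative sketch only: the `TermUsage` tuple and both helper functions are hypothetical, the sample rows are copied from the listing above, and ignoring case-only differences is an assumption about what counts as a real mismatch.

```python
from typing import NamedTuple, Optional


class TermUsage(NamedTuple):
    term: str
    label: Optional[str]  # rdfs:label from the ontology graph, if any
    name: Optional[str]   # nmdc:name asserted in the data, if any
    count: int


def name_disagrees(rows: list[TermUsage]) -> list[TermUsage]:
    """Rows where both label and name exist but differ beyond letter case."""
    return [
        r for r in rows
        if r.label and r.name and r.label.casefold() != r.name.casefold()
    ]


def unlabeled(rows: list[TermUsage]) -> list[TermUsage]:
    """Rows whose term has no label in the ontology graph."""
    return [r for r in rows if not r.label]


# A few rows copied from the listing above.
rows = [
    TermUsage("ENVO:00001998", "soil", "soil", 5294),
    TermUsage("ENVO:01001002", "animal-associated environment",
              "Animal-associated environment", 156),
    TermUsage("ENVO:01001442", "agriculture", "phyllosphere biome", 384),
    TermUsage("UBERON:0001153", "caecum", "cecum", 33),
    TermUsage("FMA:14541", None, "mouse cecal contents", 99),
]

print([r.term for r in name_disagrees(rows)])
# ['ENVO:01001442', 'UBERON:0001153']
print([r.term for r in unlabeled(rows)])
# ['FMA:14541']
```

Spelling variants such as caecum/cecum are still flagged, so the output needs a manual pass before anything goes into a changesheet.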
turbomam commented 4 months ago

A changesheet has been submitted to use Uberon terms for the env_local_scale and env_medium slots of the Biosamples relevant to this issue. The FMA host_body_product assertions have been removed.