Closed aclum closed 8 months ago
@turbomam and I had some discussions with Hugh today. In the amc_fieldSuperParent csv, the fields aquaticSiteType and namedLocation are the best ones to use to define the environmental terms. An aquaticSiteType value of lake would map to freshwater lake biome; an aquaticSiteType value of river would map to freshwater river biome.
Parsing the pattern in namedLocation would help determine the environmental local scale term. For example, in CRAM.AOS.buoy.c1: buoy -> area of open water
figure 8 from NEON.DOC.003044.vE AOS Protocol and Procedure: AMC – Aquatic Microbial Sampling
B.3 Lakes and River Collection
So between a namedLocation of CRAM.AOS.buoy.c1 and an aquaticSiteType of Lake, we know the sample is from NEON site CRAM, which is an Aquatic Observation System (AOS), and that it was collected in open water (buoy) from a stratified lake at a depth of 0.5 meters (c1).
convention is $SITE.$SYSTEM.$LOCATION_W/IN_RIVER_OR_LAKE.$WATER_DEPTH_CODE
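A minimal sketch of splitting a namedLocation along the convention above. The function and class names are hypothetical, and treating the depth code as optional is an assumption (the thread only shows a four-part example); this is not the ingest pipeline's actual code.

```python
from typing import NamedTuple, Optional


class NamedLocation(NamedTuple):
    """Parts of a NEON namedLocation (hypothetical container for illustration)."""
    site: str                  # e.g. CRAM
    system: str                # e.g. AOS
    location: str              # e.g. buoy
    depth_code: Optional[str]  # e.g. c1; assumed optional


def parse_named_location(named_location: str) -> NamedLocation:
    """Split $SITE.$SYSTEM.$LOCATION.$WATER_DEPTH_CODE on dots."""
    parts = named_location.strip().split(".")
    if len(parts) < 3:
        raise ValueError(f"unexpected namedLocation: {named_location!r}")
    site, system, location = parts[:3]
    # Assumption: names without a fourth part simply have no depth code.
    depth_code = parts[3] if len(parts) > 3 else None
    return NamedLocation(site, system, location, depth_code)


parsed = parse_named_location("CRAM.AOS.buoy.c1")
print(parsed.site, parsed.location, parsed.depth_code)
```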
Notes from Hugh:
Here is a start for aquatic samples.
For Broad scale, I believe it would be either “freshwater lake biome” [ENVO:01000252], or “freshwater river biome” [ENVO:01000253]. For surface samples, this would be derived from the “aquaticSiteType” field in the amc_fieldSuperParent table.
For surface water medium scale, I can’t find anything better than “lake water”, [ENVO:04000007] and “river water”, [ENVO:01000599].
For surface local scale, there is not much detail given as to habitat or part of river, so I suggest “freshwater river”, [ENVO:01000297] for river/stream local scale. For lakes, samples are collected either in littoral zone (near shore) or out in the deeper part of the lake. For these two the terms “freshwater littoral zone”, [ENVO:01000409] and “area of open water”, [ENVO:01000666] seem to fit. This distinction would have to be derived from the “namedLocation” field in the surface amc_fieldSuperParent table. Within the names of this field, it either contains ‘littoral’ or ‘buoy’, to designate where the sample was collected (for streams or rivers, this field is not useful).
fig 2 from NEON_cellCount_userGuide_vC
Appears to be active so will move to the next sprint.
Local terms for lake surface water samples:
c0 = surface (<0.5 m depth) = ‘water surface’ [ENVO:01001191]
c1 = ‘epilimnion’ [ENVO:00002131]
c2 = ‘thermocline’ [ENVO:00002269]
c3 = ‘hypolimnion’ [ENVO:00002130]
I suggest keeping littoral the same: “freshwater littoral zone”, [ENVO:01000409].
Waiting for Hugh to confirm what to do for stratified rivers.
waiting for final confirmation from Hugh.
Hugh confirmed “freshwater river”, [ENVO:01000297] for env_local_scale for rivers.
Proposed final set of rules below; we'll need to port this to the assets csv. Asked Hugh about using multiple terms vs a single term for env_local_scale. NMDC currently doesn't support multiple env context terms.
if (DP1.20281.001 amc_fieldSuperParent aquaticSiteType lake) then {
DP1.20281.001 amc_fieldSuperParent aquaticSiteType lake Biosample env_broad_scale “freshwater lake biome” [ENVO:01000252]
DP1.20281.001 amc_fieldSuperParent aquaticSiteType lake Biosample env_medium “lake water” [ENVO:04000007]
DP1.20281.001 amc_fieldSuperParent namedLocation buoy.c0 Biosample env_local_scale ‘water surface’ [ENVO:01001191]
DP1.20281.001 amc_fieldSuperParent namedLocation buoy.c1 Biosample env_local_scale ‘epilimnion’ [ENVO:00002131]
DP1.20281.001 amc_fieldSuperParent namedLocation buoy.c2 Biosample env_local_scale ‘thermocline’ [ENVO:00002269]
DP1.20281.001 amc_fieldSuperParent namedLocation buoy.c3 Biosample env_local_scale ‘hypolimnion’ [ENVO:00002130]
DP1.20281.001 amc_fieldSuperParent namedLocation littoral Biosample env_local_scale “freshwater littoral zone” [ENVO:01000409]
}
if (DP1.20281.001 amc_fieldSuperParent aquaticSiteType river) then {
DP1.20281.001 amc_fieldSuperParent aquaticSiteType river Biosample env_broad_scale “freshwater river biome” [ENVO:01000253]
DP1.20281.001 amc_fieldSuperParent aquaticSiteType river Biosample env_local_scale “freshwater river” [ENVO:01000297]
DP1.20281.001 amc_fieldSuperParent aquaticSiteType river Biosample env_medium “river water” [ENVO:01000599]
}
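The lake and river rules above could be expressed roughly as follows. This is only a sketch: the function name, the dictionary layout, and the substring matching on namedLocation are assumptions made for illustration, not the ingest code itself. Site types other than lake and river are rejected here because the rules do not cover them.

```python
from typing import Dict, Optional, Tuple

Term = Tuple[str, str]  # (label, ENVO id)

# namedLocation pattern -> env_local_scale term, for lakes only.
LAKE_LOCAL_SCALE: Dict[str, Term] = {
    "buoy.c0": ("water surface", "ENVO:01001191"),
    "buoy.c1": ("epilimnion", "ENVO:00002131"),
    "buoy.c2": ("thermocline", "ENVO:00002269"),
    "buoy.c3": ("hypolimnion", "ENVO:00002130"),
    "littoral": ("freshwater littoral zone", "ENVO:01000409"),
}


def env_triad(aquatic_site_type: str, named_location: str) -> Dict[str, Optional[Term]]:
    """Return the env triad for one amc_fieldSuperParent row (hypothetical helper)."""
    site_type = aquatic_site_type.strip().lower()
    if site_type == "lake":
        # Assumption: the pattern appears as a substring of namedLocation.
        local = next(
            (term for pattern, term in LAKE_LOCAL_SCALE.items() if pattern in named_location),
            None,  # no rule matched; left unset rather than guessed
        )
        return {
            "env_broad_scale": ("freshwater lake biome", "ENVO:01000252"),
            "env_local_scale": local,
            "env_medium": ("lake water", "ENVO:04000007"),
        }
    if site_type == "river":
        return {
            "env_broad_scale": ("freshwater river biome", "ENVO:01000253"),
            "env_local_scale": ("freshwater river", "ENVO:01000297"),
            "env_medium": ("river water", "ENVO:01000599"),
        }
    raise ValueError(f"no rule for aquaticSiteType {aquatic_site_type!r}")


print(env_triad("Lake", "CRAM.AOS.buoy.c1"))
```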
@aclum can this be closed now that Hugh has confirmed?
@aclum using this table to finish up the surface water ingest pipeline. I should have a JSON ready soon.
@aclum in the aquaticSiteType column you can see three types of values: lake, river and stream. What do you assign to the MIxS ENVO triad values when aquaticSiteType == “stream”?
The above table doesn't seem to have mappings for that case?
New ENVO terms may be required for the aquaticSiteType "stream".
Presumably @turbomam would be responsible for adding these terms to ENVO, so we should let him know as early as possible.
CC: @aclum
@turbomam is going to add a new freshwater stream term to ENVO. https://github.com/EnvironmentOntology/envo/issues/1476
We have an EnvO PR for this:
There's a violation of best practice in the resulting branch. I'm 99.9% sure I didn't add it. I personally wouldn't want it added to a repo that I manage. So I'm waiting for feedback from @cmungall or Pier.
Having said that, I don't see why the reserved ID would change, so it's probably safe to start using this:
ENVO:03605007, 'freshwater stream biome'
Chris made some comments on Mark's PR so moving to the next sprint as in review.
Mark's PR was merged in. For streams we'll use env_broad_scale “freshwater stream biome” [ENVO:03605007], env_local_scale "freshwater stream" [ENVO:03605006], and env_medium "stream water" [ENVO:03605006]
cc @sujaypatil96
Deliverable this task is associated with
_See Deliverables tab here: https://docs.google.com/spreadsheets/d/1jF1RU_TwQlJpqvHnKk-KieE6VlUv0je4eFddK9idPuE/edit?usp=sharing_
2
RACI
Tag people in their roles
Describe the task?
Information will be added to mongo via Sujay's ingest code
Criteria for completion
Estimate people time
Completion Date (Goal)
Target Sprint Start & End Dates
Tag Blocker/Contingent upon issues