Closed MicaelaPreda closed 4 years ago
@MicaelaPreda
Thanks for the list. It highlights a lot of issues that have crept into MERLIN over time. The GI team (@GSQ-AI, @LukeHauck) and I can help you get the necessary entries added to the vocabulary, and provide review and guidance. But we don't have the subject matter expertise to make a lot of the detailed decisions on the content.
We need someone with a thorough understanding of these site types to rationalise the list before we can make additions.
The existing vocab can be seen here, or if you want you can view it in ttl format.
See attached a new version of the site vocab Paul, Mike and I worked on based on Vance's requirements. Site types Vocabs MM PB MP.xlsx
Great to see someone leading this vocab review! I only had time for a quick read this afternoon but I noticed a few issues with the aggregation of some terms, so I will give it a more thorough review tomorrow morning.
@MicaelaPreda @mckillopm @BlakePaul thanks for your review and the spreadsheet provided. I think it's a major step forward. I've done a quick comparison with the current geological-sites vocabulary and done some further matching. I think I align with >90% of your work.
The vocab needs to add Auger Hole, Quarry, and Unknown (NULL). I think all the other site types fit into an existing category. This is broadly the same as your consolidation, with some minor rearranging of concepts between outcrop, alluvial, and colluvial sites. Can you please check the table below and comment as needed, or catch me on Teams to discuss.
Existing Vocab | MERLIN Equivalents | Notes |
---|---|---|
Base Station | to delete? | |
Auger | AUGER (split out from borehole) | addition |
Borehole | DRILL | |
Cutting | ROAD, RAILW, TRACK, GUTTE, DRAIN | |
Field Site | ||
. Alluvial Site | ALLUV, | |
.. Watercourse | STREA | |
.. Beach | BEACH, SHORE, | |
.. Waterbody | ||
. Colluvial Site | SUBCR, RUBBL, FLOAT, BOULD, GULLY, SCREE, ROOTS | |
. Outcrop | OUTCR, RIDGE, PLATF, CLIFF, HILL, FALLS, CAPPI, ESCAR, GORGE, COAST, WHALE, CAVE, HEADL, MESA, PAVEM, TOR, SPILL | |
. Soil Horizon | SOIL | |
. Tailings | DUMP, PLANT | |
Mine | MINE | |
. Open-Cut Mine | OPENC | |
. Underground Mine | ?? | addition |
. Quarry | QUARRY (split out from mine) | addition |
Mineral Deposit | Feature: GeoResource Accumulation | |
Mineral Occurrence | Site? Feature? Interpreted Geological property? | |
Pit | EXCAV, GRAVE, SCRAP, PIT, PITS, DAM | |
Trench | TRENC, COSTE | |
Project Site | PROSP | |
. Geophysical Survey Area | to delete? | |
. Seismic Survey Area | to delete? | |
. Seismic Survey Line | to delete? | |
Wellbore Interval | remove | |
Wellbore | reinstate | |
Unknown | NULL | addition |
Hi Vance,
I have looked through the list below and the groupings look OK to me. I was wondering about Base Station. Could that have been used with geophysical surveys such as Gravity Surveys? And if it was could it be needed again? Maybe we need to ask somebody with a geophysics background.
Regards Paul
From: KellyVance notifications@github.com Sent: Wednesday, 29 April 2020 1:55 PM To: geological-survey-of-queensland/vocabularies Cc: BLAKE Paul; Mention Subject: Re: [geological-survey-of-queensland/vocabularies] Update: Geological-Site (#166)
After talking with @CantRoger I believe Base Stations probably justify remaining as their own site type. Particularly to accommodate permanent Gravity base stations.
Grouping concepts like this is a race to the bottom and we are losing important information. I'm a field geologist and I need to specify narrower concepts than just field site or outcrop. Apart from helping to find old sites in the field, it makes a big difference to the confidence in any field data knowing something comes from a cliff, or just an outcrop on a hillside. Unless you want to recreate the lost concepts as results corresponding to an observation_types=site_exposure_type, site_geomorphic expression etc. While the existing exposure_type table is not great, imo just adding all the terms would be better than this excessively reductive path that has been proposed.
I agree that some aggregation is required, but I feel the grouping here has become excessive and occurred without reference to the basic definitions of these terms. For example, subcrop is not the same as float. Also, float is float whether it accumulates at the base of topography (i.e. colluvium) or not. Depending on where the tree is, roots could be alluvium, colluvium, float, in a gully... Spill is a dam spillway, so it might be better put with the other anthropogenic exposures. And putting a gutter (probably better grouped with subcrop) in the same class as a railway siding is nonsensical.
@CourteneyD care to weigh in, especially on the various minocc-ish exposure types?
While I take your point on granularity, there is an element of practicality here. I don't think we are losing important information; I think we are losing inconsistent and dubious information. Not everyone is as thorough or consistent in their descriptions as we would like, or as you might be. In fact most people are far from the perfectionists we might hope they would be. Yes, the definitions and groupings may need review to ensure the groups are as sensible as they can be given the uncertainties in MERLIN. But generalisation is absolutely necessary at the database level we are describing, where the primary aims are to find information, find resources, and provide confidence in consistent results.
If you look at the MERLIN counts, there are 15573 sites listed simply as outcrop. There are 13 sites on isolated hills. If you gave 10 geologists the task of looking through those 13 hill sites, I guarantee there would be an argument about what to consider 'isolated'. At least one of those geologists would look at a small vertical section in a hill and say one of the sample sites should be reclassified as a cliff. Of the 15573 sites listed as outcrop, I guarantee a bunch would fit the 'isolated hill' category if you gave them to those same geologists.
That is, of the specific site types there are probably a handful of false positives in each, and hundreds (if not thousands) of false negatives where sites are already upscaled simply to outcrop or mine. If the information were that important to the geologic community at large, we wouldn't have 15k generic outcrop sites (or 30k null values).
The information is only robust and useful if its defining parameters are consistently applied at the level of granularity it claims to have. Otherwise it is misleading, inaccurate, and invariably leads to the kind of terminology creep that has permeated MERLIN. Hence I strongly advocate for pushing this up to a level that is useful and at least has a chance of consistent application.
In principle I prefer better granularity, but only IF consistency can be guaranteed, so that the data is accurate and we can be confident it represents the actual information. To that end I suggest either someone volunteers their time to review the 31k sites and break the consolidated categories out into specific sub-types of mines, outcrops, etc., or we upscale to a sensible level as discussed (the former would be a laborious task that we realistically do not have time for right now).
Hi all,
After going through the assessment of existing site types (thanks Micaela, Paul and Mike) and applying them to our existing vocab I think generally things look ok.
A few things to point out though:
1) The 'PROSPECT' code is a terrible existing code and I don't think it should map to Project. Prospect is something that should be covered by whatever Mine and Exploration status goes to. It's a temporal thing - something that was recorded as a prospect 20 years ago isn't now? The issue is that these are not necessarily a site from an outcrop or a drill site, and I think they should be moved to UNK until the data is cleaned up.
2) A lot of codes with 'MINE' as the exposure type are actually 'ALLUVIAL' - this will need to get cleaned up as well.
3) I'd split Dam from Pit - it's more a water body than a pit for exploration.
4) Mineral Deposit is a feature for sure. A mineral occurrence is more like a traditional SG site and mostly an 'OUTCROP' exposure type or a 'DRILL' exposure type. It's important to capture both that it is a mineral occurrence and that it is outcrop, drill etc. Any ideas on how we deal with that?
Based on comments from @CourteneyD see the below remainder and edits (have attempted to indicate hierarchy levels with preceding full stops)
Existing Vocab | MERLIN Equivalents | Notes |
---|---|---|
Auger | AUGER | addition (split out from borehole) |
Base Station | ||
Borehole | DRILL | |
Cutting | ROAD, RAILW, TRACK, GUTTE, DRAIN | |
Field Site | ||
. Alluvial Site | ALLUV, | |
.. Watercourse | STREA | |
.. Beach | BEACH, SHORE, | |
.. Waterbody | DAM | |
. Colluvial Site | SUBCR, RUBBL, FLOAT, BOULD, SCREE, ROOTS, GULLY | |
. Outcrop | OUTCR, RIDGE, PLATF, CLIFF, HILL, FALLS, CAPPI, ESCAR, GORGE, COAST, WHALE, CAVE, HEADL, MESA, PAVEM, TOR, SPILL | To remediate and reclassify data after build and migration |
. Soil Horizon | SOIL | |
. Tailings | DUMP, PLANT | |
Mine | MINE | To remediate and reclassify data after build and migration |
. Open-Cut Mine | OPENC | |
Pit | EXCAV, GRAVE, SCRAP, PIT, PITS | |
Project Site | ||
Quarry | QUARRY | addition (split out from mine) |
Trench | TRENC, COSTE | |
Wellbore | reinstate | |
Unknown | NULL, PROSP | addition |
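For anyone scripting the migration, the revised table above could be expressed as a simple lookup. This is only a sketch: the names `CONCEPT_TO_CODES`, `CODE_TO_CONCEPT` and `site_type` are hypothetical, the codes are transcribed from the table, and two behaviours are assumptions rather than anything agreed in this thread: a missing (NULL) code maps to Unknown, and an unrecognised code raises an error instead of being silently bucketed.

```python
# Concept -> MERLIN exposure codes, transcribed from the revised table.
# Concepts with no MERLIN equivalent listed (Base Station, Field Site,
# Project Site, Wellbore) are left out of this lookup.
CONCEPT_TO_CODES = {
    "Auger": ["AUGER"],
    "Borehole": ["DRILL"],
    "Cutting": ["ROAD", "RAILW", "TRACK", "GUTTE", "DRAIN"],
    "Alluvial Site": ["ALLUV"],
    "Watercourse": ["STREA"],
    "Beach": ["BEACH", "SHORE"],
    "Waterbody": ["DAM"],
    "Colluvial Site": ["SUBCR", "RUBBL", "FLOAT", "BOULD", "SCREE", "ROOTS", "GULLY"],
    "Outcrop": ["OUTCR", "RIDGE", "PLATF", "CLIFF", "HILL", "FALLS", "CAPPI",
                "ESCAR", "GORGE", "COAST", "WHALE", "CAVE", "HEADL", "MESA",
                "PAVEM", "TOR", "SPILL"],
    "Soil Horizon": ["SOIL"],
    "Tailings": ["DUMP", "PLANT"],
    "Mine": ["MINE"],
    "Open-Cut Mine": ["OPENC"],
    "Pit": ["EXCAV", "GRAVE", "SCRAP", "PIT", "PITS"],
    "Quarry": ["QUARRY"],
    "Trench": ["TRENC", "COSTE"],
    "Unknown": ["PROSP"],
}

# Invert to code -> concept for row-by-row lookups.
CODE_TO_CONCEPT = {
    code: concept
    for concept, codes in CONCEPT_TO_CODES.items()
    for code in codes
}


def site_type(merlin_code):
    """Return the vocab concept for a MERLIN exposure code.

    Assumption: None or a blank string stands for a NULL code -> "Unknown".
    Unrecognised codes raise KeyError so they get reviewed, not hidden.
    """
    if merlin_code is None or not merlin_code.strip():
        return "Unknown"
    return CODE_TO_CONCEPT[merlin_code.strip().upper()]


if __name__ == "__main__":
    for code in ["CLIFF", "DAM", "PROSP", None]:
        print(code, "->", site_type(code))
```

Keeping the table as data (concept to codes) and inverting it means each MERLIN code can only land in one concept, which makes regrouping a term such as GULLY or SPILL a one-line change.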
Vance, as discussed, I checked MERLIN's 'site' entries with Paul's and Mike's help. There are 65569 entries under 'exposures', which we consider a type of 'site'. Unfortunately, almost half (31036) aren't associated with a code. The remainder (34533) are associated with a variety of exposures, but the vast majority (80%) are 'outcrop', 'mine' and 'stream' sites. Attached is the result and interpretation of a Discoverer search that may be useful in updating the 'site' vocabulary. MERLIN Site codes VK.xlsx
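As a quick sanity check on the counts quoted above, the arithmetic can be reproduced in a few lines. The variable names are hypothetical, and the 80% figure is treated as the approximate share stated in the comment, not an exact count.

```python
# Counts quoted in the comment above.
total_sites = 65569   # entries under 'exposures'
uncoded = 31036       # entries with no exposure code
coded = 34533         # entries with some exposure code

# The two groups should account for every entry.
assert uncoded + coded == total_sites

# Share of entries with no code ("almost half").
uncoded_share = uncoded / total_sites

# Approximate number of coded entries that are outcrop, mine or stream,
# taking the quoted 80% at face value (an approximation, not a count).
major_three_estimate = round(0.80 * coded)

print(f"uncoded share: {uncoded_share:.1%}")
print(f"outcrop/mine/stream (approx.): {major_three_estimate}")
```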