geological-survey-of-queensland / vocabularies

A collection of GSQ's vocabularies, formulated using SKOS, serialised as RDF (turtle) files.
Creative Commons Attribution 4.0 International

Update: Geological-Site #166

Closed MicaelaPreda closed 4 years ago

MicaelaPreda commented 4 years ago

Vance, as discussed, I checked Merlin's 'site' entries with Paul's and Mike's help. There are 65569 entries under 'exposures', which we consider a type of 'site'. Unfortunately, almost half (31036) aren't associated with a code. The remainder (34533) are associated with a variety of exposures, but the vast majority (80%) are 'outcrop', 'mine' and 'stream' sites. Attached is the result and interpretation of a Discoverer search that may be useful in updating the 'site' vocabulary. MERLIN Site codes VK.xlsx
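As a quick sanity check on the counts quoted above, a minimal sketch that only re-derives the remainder and the uncoded share from the two figures given in the comment (the variable names are illustrative, not MERLIN field names):

```python
# Counts quoted in the comment above (MERLIN 'exposures' entries).
total_entries = 65569
without_code = 31036

# Entries that do carry an exposure code, and the share that do not.
with_code = total_entries - without_code
share_without_code = without_code / total_entries

print(f"entries with a code: {with_code}")
print(f"share without a code: {share_without_code:.1%}")
```

The remainder comes out at 34533, matching the figure in the comment, and the uncoded share is a little under half.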

KellyVance commented 4 years ago

@MicaelaPreda

Thanks for the list. It highlights a lot of issues that have crept into MERLIN over time. The GI team (@GSQ-AI, @LukeHauck) and I can help you get the necessary entries added into the vocabulary, and provide review and guidance. But we don't have the subject matter expertise to make a lot of the detailed decisions on the content.

We need someone with a thorough understanding of these site types to rationalise the list before we can make additions.

The existing vocab can be seen here. Or, if you want, you can view it in ttl format.

MicaelaPreda commented 4 years ago

See attached a new version of the site vocab Paul, Mike and I worked on based on Vance's requirements. Site types Vocabs MM PB MP.xlsx

geoderekh commented 4 years ago

Great to see someone leading this vocab review! I only had time for a quick read this afternoon but I noticed a few issues with the aggregation of some terms, so I will give it a more thorough review tomorrow morning.

KellyVance commented 4 years ago

@MicaelaPreda @mckillopm @BlakePaul thanks for your review and the spreadsheet provided. I think it's a major step forward. I've done a quick comparison with the current geological-sites vocabulary and some further matching. I think I align with >90% of your work.

The vocab needs to add Auger Hole, Quarry, and Unknown (NULL). I think all the other site types fit into an existing category. This is broadly the same as your consolidation, with some minor rearranging of concepts between outcrop, alluvial, and colluvial sites. Can you please check the table below and comment as needed, or catch me on Teams to discuss.

| Existing Vocab | MERLIN Equivalents | Notes |
| --- | --- | --- |
| Base Station | | to delete? |
| Auger | AUGER (split out from borehole) | addition |
| Borehole | DRILL | |
| Cutting | ROAD, RAILW, TRACK, GUTTE, DRAIN | |
| Field Site | | |
| . Alluvial Site | ALLUV | |
| .. Watercourse | STREA | |
| .. Beach | BEACH, SHORE | |
| .. Waterbody | | |
| . Colluvial Site | SUBCR, RUBBL, FLOAT, BOULD, GULLY, SCREE, ROOTS | |
| . Outcrop | OUTCR, RIDGE, PLATF, CLIFF, HILL, FALLS, CAPPI, ESCAR, GORGE, COAST, WHALE, CAVE, HEADL, MESA, PAVEM, TOR, SPILL | |
| . Soil Horizon | SOIL | |
| . Tailings | DUMP, PLANT | |
| Mine | MINE | |
| . Open-Cut Mine | OPENC | |
| . Underground Mine | ?? | addition |
| . Quarry | QUARRY (split out from mine) | addition |
| Mineral Deposit | | Feature: GeoResource Accumulation |
| Mineral Occurrence | | Site? Feature? Interpreted Geological property? |
| Pit | EXCAV, GRAVE, SCRAP, PIT, PITS, DAM | |
| Trench | TRENC, COSTE | |
| Project Site | PROSP | |
| . Geophysical Survey Area | | to delete? |
| . Seismic Survey Area | | to delete? |
| . Seismic Survey Line | | to delete? |
| Wellbore Interval | | remove |
| Wellbore | | reinstate |
| Unknown | NULL | addition |

BlakePaul commented 4 years ago

Hi Vance,

I have looked through the list below and the groupings look OK to me. I was wondering about Base Station. Could that have been used with geophysical surveys such as Gravity Surveys? And if it was, could it be needed again? Maybe we need to ask somebody with a geophysics background.

Regards Paul

From: KellyVance notifications@github.com Sent: Wednesday, 29 April 2020 1:55 PM To: geological-survey-of-queensland/vocabularies Cc: BLAKE Paul; Mention Subject: Re: [geological-survey-of-queensland/vocabularies] Update: Geological-Site (#166)

KellyVance commented 4 years ago

After talking with @CantRoger, I believe Base Stations probably justify remaining as their own site type, particularly to accommodate permanent Gravity base stations.

geoderekh commented 4 years ago

Grouping concepts like this is a race to the bottom and we are losing important information. I'm a field geologist and I need to specify narrower concepts than just field site or outcrop. Apart from helping to find old sites in the field, it makes a big difference to the confidence in any field data knowing something comes from a cliff, or just an outcrop on a hillside. Unless you want to recreate the lost concepts as results corresponding to an observation_types=site_exposure_type, site_geomorphic expression etc. While the existing exposure_type table is not great, imo just adding all the terms would be better than this excessively reductive path that has been proposed.

I agree that some aggregation is required, but I feel the grouping here has become excessive and occurred without reference to the basic definitions of these terms. For example, subcrop is not the same as float. Also, float is float whether it accumulates at the base of topography (i.e. colluvium) or not. Depending on where the tree is, roots could be alluvium, colluvium, float, in a gully... Spill is a dam spillway, so it might be better put with the other anthropogenic exposures. And putting a gutter (probably better grouped with subcrop) in the same class as a railway siding is nonsensical.
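The alternative raised above, keeping a generalised site type while recording the narrower term as an observation result, could look roughly like the following. This is only a sketch under stated assumptions: the `Site` and `Observation` classes, their field names, and the example identifier are hypothetical and do not reflect the actual GSQ or MERLIN data model; only the observation type name `site_exposure_type` is taken from the comment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    # Hypothetical structure: an observation type and the recorded result.
    observation_type: str  # e.g. "site_exposure_type"
    result: str            # the narrower term a geologist recorded


@dataclass
class Site:
    # Hypothetical structure: a site typed with a generalised vocabulary concept.
    site_id: str
    site_type: str  # e.g. "Outcrop"
    observations: List[Observation] = field(default_factory=list)

    def results_for(self, observation_type: str) -> List[str]:
        """Return every recorded result for one observation type."""
        return [o.result for o in self.observations
                if o.observation_type == observation_type]


# The site is generalised to "Outcrop", but the detail "cliff" is not lost.
site = Site("example-001", "Outcrop")
site.observations.append(Observation("site_exposure_type", "cliff"))
print(site.site_type, site.results_for("site_exposure_type"))
```

With this shape, searches can use the consolidated site type while the narrower exposure term stays available to anyone who needs it in the field.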

@CourteneyD care to weigh in, especially on the various minocc-ish exposure types?

KellyVance commented 4 years ago

While I take your point on granularity, there is an element of practicality here. I don't think we are losing important information; I think we are losing inconsistent and dubious information. Not everyone is as thorough or consistent in their descriptions as we would like, or as you might be. In fact, most people are far from the perfectionists we might hope they would be. Yes, the definitions and groupings may need review to ensure the groups are as sensible as they can be given the uncertainties in MERLIN. But generalisation is absolutely necessary at the database level we are describing, which exists primarily to find information, find resources, and provide confidence in consistent results.

If you look at the MERLIN counts, there are 15573 sites listed simply as outcrop. There are 13 sites on isolated hills. If you gave 10 geologists the task of looking through those 13 hill sites, I guarantee there would be an argument about what to consider 'isolated'. At least one of those geologists would look at a small vertical section in a hill and say one of the sample sites should be reclassified as a cliff. Of the 15573 sites listed as outcrop, I guarantee a bunch would fit the 'isolated hill' category if you gave them to those same geologists.

i.e. of the specific site types there are probably a handful of false positives in each and hundreds (if not thousands) of false negatives where they are already upscaled simply to outcrop or mine. If the information were that important to the geologic community at large we wouldn't have 15k generic outcrop sites (or 30k null values).

The information is only robust and useful if its defining parameters are consistently applied at the level of granularity it claims to have. Otherwise it is misleading, inaccurate, and invariably leads to the kind of terminology creep that has permeated throughout MERLIN. Hence I strongly advocate for pushing this up to a level that is useful and at least has a chance of consistent application.

In principle I prefer better granularity, but only IF consistency can be guaranteed to make it accurate and provide confidence that it represents the actual information. To that end, I suggest either someone volunteers their time to review the 31k sites and break the consolidated categories out into specific sub-types of mines, outcrops, etc., or we upscale to a sensible level as discussed (the former would be a laborious task that we realistically do not have time for right now).

CourteneyD commented 4 years ago

Hi all,

After going through the assessment of existing site types (thanks Micaela, Paul and Mike) and applying them to our existing vocab I think generally things look ok.

A few things to point out though:

1) The 'PROSPECT' code is a terrible existing code and I don't think it should map to Project. Prospect is something that should be covered by whatever Mine and Exploration status goes to. It's a temporal thing: something that was recorded as a prospect 20 years ago isn't now? The issue is that these are not necessarily a site from an outcrop or a drill site, and I think they should be moved to UNK until the data is cleaned up.
2) A lot of codes with 'MINE' as the exposure type are actually 'ALLUVIAL'. This will need to get cleaned up as well.
3) I'd split Dam from Pit. It's more a water body than a pit for exploration.
4) Mineral Deposit is a feature for sure. A mineral occurrence is more like a traditional SG site and mostly an 'OUTCROP' exposure type or a 'DRILL' exposure type. It's important to capture that it is a mineral occurrence and is outcrop, drill etc. Any ideas on how we deal with that?

KellyVance commented 4 years ago

Based on comments from @CourteneyD see the below remainder and edits (have attempted to indicate hierarchy levels with preceding full stops)

| Existing Vocab | MERLIN Equivalents | Notes |
| --- | --- | --- |
| Auger | AUGER | addition (split out from borehole) |
| Base Station | | |
| Borehole | DRILL | |
| Cutting | ROAD, RAILW, TRACK, GUTTE, DRAIN | |
| Field Site | | |
| . Alluvial Site | ALLUV | |
| .. Watercourse | STREA | |
| .. Beach | BEACH, SHORE | |
| .. Waterbody | DAM | |
| . Colluvial Site | SUBCR, RUBBL, FLOAT, BOULD, SCREE, ROOTS, GULLY | |
| . Outcrop | OUTCR, RIDGE, PLATF, CLIFF, HILL, FALLS, CAPPI, ESCAR, GORGE, COAST, WHALE, CAVE, HEADL, MESA, PAVEM, TOR, SPILL | To remediate and reclassify data after build and migration |
| . Soil Horizon | SOIL | |
| . Tailings | DUMP, PLANT | |
| Mine | MINE | To remediate and reclassify data after build and migration |
| . Open-Cut Mine | OPENC | |
| Pit | EXCAV, GRAVE, SCRAP, PIT, PITS | |
| Project Site | | |
| Quarry | QUARRY | addition (split out from mine) |
| Trench | TRENC, COSTE | |
| Wellbore | | reinstate |
| Unknown | NULL, PROSP | addition |
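
For anyone scripting the migration, the table above can be read as a lookup from MERLIN exposure code to vocabulary concept plus a broader-concept hierarchy. The following is a minimal sketch, not the actual migration code: the names `BROADER`, `MERLIN_CODES`, `concept_for` and `broader_path` are made up for illustration, and treating an empty or missing code as NULL is an assumption.

```python
from typing import Dict, List, Optional

# Broader (parent) concept for each concept in the table; None marks a top-level concept.
BROADER: Dict[str, Optional[str]] = {
    "Auger": None, "Base Station": None, "Borehole": None, "Cutting": None,
    "Field Site": None,
    "Alluvial Site": "Field Site",
    "Watercourse": "Alluvial Site",
    "Beach": "Alluvial Site",
    "Waterbody": "Alluvial Site",
    "Colluvial Site": "Field Site",
    "Outcrop": "Field Site",
    "Soil Horizon": "Field Site",
    "Tailings": "Field Site",
    "Mine": None,
    "Open-Cut Mine": "Mine",
    "Pit": None, "Project Site": None, "Quarry": None, "Trench": None,
    "Wellbore": None, "Unknown": None,
}

# MERLIN exposure codes listed against each concept in the table.
MERLIN_CODES: Dict[str, List[str]] = {
    "Auger": ["AUGER"],
    "Borehole": ["DRILL"],
    "Cutting": ["ROAD", "RAILW", "TRACK", "GUTTE", "DRAIN"],
    "Alluvial Site": ["ALLUV"],
    "Watercourse": ["STREA"],
    "Beach": ["BEACH", "SHORE"],
    "Waterbody": ["DAM"],
    "Colluvial Site": ["SUBCR", "RUBBL", "FLOAT", "BOULD", "SCREE", "ROOTS", "GULLY"],
    "Outcrop": ["OUTCR", "RIDGE", "PLATF", "CLIFF", "HILL", "FALLS", "CAPPI",
                "ESCAR", "GORGE", "COAST", "WHALE", "CAVE", "HEADL", "MESA",
                "PAVEM", "TOR", "SPILL"],
    "Soil Horizon": ["SOIL"],
    "Tailings": ["DUMP", "PLANT"],
    "Mine": ["MINE"],
    "Open-Cut Mine": ["OPENC"],
    "Pit": ["EXCAV", "GRAVE", "SCRAP", "PIT", "PITS"],
    "Quarry": ["QUARRY"],
    "Trench": ["TRENC", "COSTE"],
    "Unknown": ["PROSP"],
}

# Invert the table: one entry per MERLIN code.
CODE_TO_CONCEPT: Dict[str, str] = {
    code: concept for concept, codes in MERLIN_CODES.items() for code in codes
}


def concept_for(code: Optional[str]) -> str:
    """Map a MERLIN exposure code to a concept; a missing code (NULL) maps to Unknown."""
    if code is None or not code.strip():
        return "Unknown"
    # Codes not listed in the table raise KeyError so they get reviewed, not guessed.
    return CODE_TO_CONCEPT[code.strip().upper()]


def broader_path(concept: str) -> List[str]:
    """Return the concept with its broader concepts, top level first."""
    path = [concept]
    while BROADER[path[-1]] is not None:
        path.append(BROADER[path[-1]])
    return list(reversed(path))


print(concept_for("GULLY"), broader_path(concept_for("STREA")))
```

Raising on unlisted codes rather than silently mapping them to Unknown is a design choice in this sketch, so that any code missing from the table surfaces during migration.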