Open KellyVance opened 5 years ago
QDEX | New | Review - VK | Review - AI |
---|---|---|---|
Report Title | Title | Makes sense | |
Report Type | Resources Industry Report Type | yes | So we're running with an entirely separate vocab for non-resource industry reports? I'm perfectly okay with keeping "Report Type" |
Author Name | dct:creator | yes | yes |
Lodger | Not required | yes | yes |
Submitter | System recorded from logged-in user | yes | yes |
Locality | dct:Location | yes | yes |
Map References | Not required - can be derived using spatial intersect | yes | yes |
Commodity | Not required | Infer from report content, but is essential info | Ideally derived from content, but heavily dependent on the capability of the final product. I'd be fine with it being a manually entered field. |
Keywords | Not used | yes... but there are use cases where keywords are valuable, so this is dependent on what level of detail within the data can be interrogated during search | I can see a lot of uses for this where there aren't implicit or inherited connections. This cannot be a manually populated field, and the majority of the important information should be collected explicitly in other fields, but, maybe as a result of machine learning we populate a list of keywords? Is there something fancy that replaces this functionality that I'm not remembering/aware of? |
Tenure | Queensland Mining Permit | yes | No. There's official language regarding this that we should toe the line on. |
Tectonic | Not required - can be derived using spatial intersect | yes | yes |
Stratigraphy | Not required - can be derived using spatial intersect | No. Not necessarily a spatial intersect. But should be captured in report data templates | Not necessarily spatial, at time of report submission they may in fact be defining new strat, or redefining old strat, with historical reports they refer to strat which has in fact been 'moved' since writing. |
Age | Not required - can be derived using spatial intersect | No. but can be linked to via stratigraphy | Inherited from strat/basin |
Date of Report | time:ProperInterval | yes | yes |
Date of Receipt | System recorded dct:created | yes | yes |
Project Names | Not recorded | Probably useful data. Should be derivable from report content | Highly relevant for minerals |
Mines/Prospect Names | Feature of Interest | yes? maybe? | I guess? We could probably roll up Project Names and Well Names into this tag, but it feels a bit grab-bag-y |
Well Names | Not recorded | link to GSQ Borehole Profile via Spatial intersect | Very relevant to non-industry reports, some industry reports |
Seismic Survey Names | GSQ Survey Profile | yes | yes |
QDEX | New | Review |
---|---|---|
Report Title | Title | Makes sense |
Report Type | Resources Industry Report Type | So we're running with an entirely separate vocab for non-resource industry reports? I'm perfectly okay with keeping "Report Type" |
Author Name | dct:creator | yes |
Lodger | Not required | yes |
Submitter | System recorded from logged-in user | yes |
Locality | dct:Location | yes |
Map References | Not required - can be derived using spatial intersect | yes |
Commodity | Not required | Ideally derived from content, but heavily dependent on the capability of the final product. I'd be fine with it being a manually entered field. |
Keywords | Not used | I can see a lot of uses for this where there aren't implicit or inherited connections. This cannot be a manually populated field, and the majority of the important information should be collected explicitly in other fields, but, maybe as a result of machine learning we populate a list of keywords? Is there something fancy that replaces this functionality that I'm not remembering/aware of? |
Tenure | Queensland Mining Permit | No. There's official language regarding this that we should toe the line on. |
Tectonic | Not required - can be derived using spatial intersect | yes |
Stratigraphy | Not required - can be derived using spatial intersect | Not necessarily spatial, at time of report submission they may in fact be defining new strat, or redefining old strat, with historical reports they refer to strat which has in fact been 'moved' since writing. |
Age | Not required - can be derived using spatial intersect | Inherited from strat/basin |
Date of Report | time:ProperInterval | yes |
Date of Receipt | System recorded dct:created | yes |
Project Names | Not recorded | Highly relevant for minerals |
Mines/Prospect Names | Feature of Interest | I guess? We could probably roll up Project Names and Well Names into this tag, but it feels a bit grab-bag-y |
Well Names | Not recorded | Very relevant to non-industry reports, some industry reports |
Seismic Survey Names | GSQ Survey Profile | yes |
Noted - will review and reply
@KellyVance @GSQ-AI @johnkirsten - please review my comments below.
Commodity - Ok to put back into model.
Keywords - We can create a controlled list of keywords, e.g. by starting with the keywords currently populated in QDEX Reports. When I had a quick look at the database records yesterday, there appeared to be a lot of keyword stuffing and some of the keywords were duplicative of the report type. Also, if we capture 'Earth Science Data Categories' on the submission form, this may obviate the need for some keywords. Let's do some better analysis of the QDEX Reports keywords and look at their usefulness. Let's decide on what is suitable as a keyword and what should be core metadata for the report.
Tenure - @GSQ-AI can you please provide the document that defines the "official language regarding this that we should toe the line on"? When I spoke with Jodie Hendey, she said that they tried to bring in "Resource Authority" but it never stuck. She said to use "permit". Years ago, they dropped "tenure" as it implied ownership of the land (at the time of contention regarding CSG and farm land access). To my knowledge, there is no collective term in the legislation; it refers to the specific permits, leases, authorities, etc.
Stratigraphy - so, will this be a vocabulary? or is there already a trusted source for this data (e.g. in DMEGeo?). What is the cardinality between report and stratigraphy? Does there need to be functionality for the user to create new stratigraphy? Or is this controlled?
Project Names - to my understanding, there is no controlled list of project names (the Coal Hub manages their own list of project names). So, will this be free text? How do we get data integrity? Ideally, we would have linkage between projects and their permits.
Mines/Prospect Names - is this then two separate metadata fields? Are these controlled lists? I would have thought so for mines, but new prospect names could be created?
Well Names - is this for P&G only? Will these well names already be in our borehole register? If not, the user would free-text the entry. But I would think we would want to have integrity of the well - entering in the core metadata in our borehole data model. For P&G reports, the report itself will contain the well names.
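The keyword analysis suggested above could start with something like the sketch below. It is only an illustration: the record layout (`report_type`, `keywords`) and the sample rows are made up, not the actual QDEX Reports schema. It counts how many reports use each keyword (collapsing repeats within a single report) and flags keywords that simply restate the report type.

```python
from collections import Counter


def audit_keywords(records):
    """Count keyword usage across reports and flag keywords that
    merely repeat the report type.

    `records` is a list of dicts with hypothetical keys
    'report_type' (str) and 'keywords' (list of str).
    """
    counts = Counter()
    duplicates_of_type = Counter()
    for rec in records:
        report_type = rec["report_type"].strip().lower()
        # Normalise and de-duplicate within one report so that
        # keyword stuffing is only counted once per report.
        seen = {kw.strip().lower() for kw in rec["keywords"] if kw.strip()}
        for kw in seen:
            counts[kw] += 1
            if kw == report_type:
                duplicates_of_type[kw] += 1
    return counts, duplicates_of_type


# Made-up sample records, for illustration only.
sample = [
    {"report_type": "Well Completion Report",
     "keywords": ["well completion report", "core", "Core"]},
    {"report_type": "Annual Report", "keywords": ["core", "gold"]},
]
counts, dupes = audit_keywords(sample)
```

Keywords with high counts in `dupes` are candidates to drop; the rest can be reviewed against what should be core metadata.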
@GSQ-AI @KellyVance We need to decide on the geometry that we are going to capture for industry reports. The current form captures both locality and map sheet. When I had a brief look at the data yesterday, the locality was very broad (there was even "Queensland" listed).
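For the reports now in QDEX Reports that we will migrate to the new system, how about:
1. If the report is a permit-based report, we create the geometry based on the permit shape at the date of lodgement of the permit.
2. If the report is not a permit-based report, then we:
   a. If a shape file has been submitted with the report, then we use that. @johnkirsten can you please check to see if any have been submitted.
   b. Else, we create a shape based on the map sheet(s). @johnkirsten can you please check if there is a 1:1 or 1:* report to map sheet captured.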
For new report lodgement, @KellyVance did you already have a plan to capture coverage of the report?
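To make the proposed fallback order for migrated reports concrete (permit shape first, then a submitted shape file, then map sheets), here is a rough sketch. The `LegacyReport` class and its field names are hypothetical placeholders, and geometries are shown as plain WKT strings rather than anything from the real system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LegacyReport:
    """Hypothetical stand-in for a migrated QDEX report record."""
    report_id: str
    permit_geometry: Optional[str] = None   # WKT of the permit shape, if permit-based
    submitted_shape: Optional[str] = None   # WKT from a lodged shape file, if any
    map_sheet_geometries: List[str] = field(default_factory=list)


def choose_geometry(report: LegacyReport) -> Tuple[str, List[str]]:
    """Return (source, geometries) following the proposed fallback order."""
    if report.permit_geometry:
        return ("permit", [report.permit_geometry])
    if report.submitted_shape:
        return ("submitted_shape", [report.submitted_shape])
    if report.map_sheet_geometries:
        # A report may reference more than one map sheet (1:*).
        return ("map_sheet", list(report.map_sheet_geometries))
    # Nothing usable: leave it for manual review rather than guessing.
    return ("none", [])
```

Recording the source alongside the geometry would let us see later how many reports only have the coarse map-sheet footprint.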
@ajtroup Can you please review this issue - particularly looking at comments:
and
@ajtroup This is in reference to the QDEX mapping table for the Industry Report Profile - see https://github.com/geological-survey-of-queensland/industry-report-profile
Here's some more thoughts to add to the headache...
QDEX | New | Review AT |
---|---|---|
Report Title | Title | Yes |
Report Type | Resources Industry Report Type | Report type is fine, not all the reports are resources industry reports. |
Author Name | dct:creator | See discussion in email regarding whether this should be author or company |
Lodger | Not required | Should be recorded, but doesn't need to be searchable or displayed. Has proven useful where there have been issues noted in the past |
Submitter | System recorded from logged-in user | So Lodger and Submitter are being merged? Currently submitter is the Company, where Lodger is the person physically (digitally) submitting the report |
Locality | dct:Location | yes |
Map References | Not required - can be derived using spatial intersect | Could be a useful QA for the intersect process |
Commodity | Not required | Can be inferred from tenure, but not completely, and depends on the granularity. Inferred from content would be great, but difficult for scanned reports. |
Keywords | Not used | Keywords are definitely used when searching |
Tenure | Queensland Mining Permit | Isn't Queensland Mining Permit only one type of tenure? Or have they tried to make it one type of tenure? Will need translations between EPP, A-P, exploration permit, mining permit, mining tenure, mining lease, EPM, MDL, et al. There is, or should already be, a vocab we can use adapted from the current classification system. |
Tectonic | Not required - can be derived using spatial intersect | Can be implied from spatial intersect, but spatial intersect doesn’t deal with depth relationship |
Stratigraphy | Not required - can be derived using spatial intersect | Stratigraphy can't be derived from spatial intersect, but could be gathered from other parts of a report going forward. |
Age | Not required - can be derived using spatial intersect | Can be derived from stratigraphy and should be associated with stratigraphy - keep in mind this could be a very wide range |
Date of Report | time:ProperInterval | yes |
Date of Receipt | System recorded dct:created | yes |
Project Names | Not recorded | Well names important, not sure about seismic name, should be using it for minerals and coal and potentially for some of the CSG fields where naming conventions have shifted over time (e.g. Fairview to FV) |
Mines/Prospect Names | Feature of Interest | |
Well Names | Not recorded | This is the major link point to the well at the moment (is used for the QDEX Data to QDEX reports link to the best of my knowledge). Currently useful as it is as close to searching by UWI as you can get in QDEX Reports, as the title is a string search and suffers from inconsistent naming. |
Seismic Survey Names | GSQ Survey Profile | What is the survey profile? |
Keywords - The keywords are an opportunity to tag the report with broad content categories that are more granular than the report type. For example, I don’t want to search for all well completion reports, I want to find the ones with core logs, I would use the keywords. Challenge is sorting out how to apply relevant keywords without having to read through > 100,000 reports…
Tenure - The reason that terminology didn’t stick is that the project didn’t go ahead (for a variety of reasons). P&G companies definitely still use tenure. Personally not sure if minerals and coal do, but I wouldn’t be surprised. Permit is ok, but you’ll need to be able to translate it for use. Also see ATP, EPP, A-P, ML, PL, MDL, EPC, EPM, … I don’t like the use of ‘mining permit’, as it is restrictive to certain sections of industry as well as really only referring to one (maybe two) types of tenure – the ML and the MDL.
Stratigraphy - "so, will this be a vocabulary? or is there already a trusted source for this data (e.g. in DMEGeo?). What is the cardinality between report and stratigraphy?" @dxwell – what do you mean by this? A report will contain the section of stratigraphy that it has intersected. Stratigraphic units should be as per the ASUD (which I think was being used for the vocab?), but there should probably be some aspect of a company able to propose new units, or report on subdivisions of a unit (for petroleum wells, a particular section of reservoir within a formation; for coal – I’m not sure if the coal seams are listed as official stratigraphic units). So, should be controlled, but with a function for a user to propose new stratigraphy.
Project Names - No reason why we can’t adopt the coal hub projects. Can see a use for this in grouping coal seam gas projects where well names have changed over time (e.g. Fairview to FV).
Well Names - "is this for P&G only?" Nope. Also strat, with potential for other commodities against sampling reports. They should already be in the borehole register. And yes, the report will contain the well name, but this is the report metadata and the well should be attributed to the report.
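As a rough illustration of the "translate it for use" point, the sketch below maps a permit identifier to its type and a collective term. The expansions listed are my reading of the common Queensland abbreviations and would need checking against the official vocabulary; codes I'm less sure of (EPP, A-P) are deliberately left out. The identifier format and the collective label "Permit" are assumptions for the example.

```python
import re

# Assumed expansions; to be verified against the official permit vocabulary.
PERMIT_TYPES = {
    "ATP": "Authority to Prospect",
    "EPC": "Exploration Permit for Coal",
    "EPM": "Exploration Permit for Minerals",
    "MDL": "Mineral Development Licence",
    "ML": "Mining Lease",
    "PL": "Petroleum Lease",
}

# Assumed identifier shape: letters (optionally hyphenated) then digits,
# e.g. "EPM 12345" or "ML1234".
_PERMIT_RE = re.compile(r"^\s*([A-Za-z]+(?:-[A-Za-z]+)?)\s*(\d+)\s*$")


def parse_permit(text):
    """Split a permit identifier into code, number and labels.

    Returns None when the text does not look like a permit identifier
    or the code is not in PERMIT_TYPES.
    """
    match = _PERMIT_RE.match(text)
    if not match:
        return None
    code = match.group(1).upper()
    label = PERMIT_TYPES.get(code)
    if label is None:
        return None
    return {
        "code": code,
        "number": int(match.group(2)),
        "label": label,
        "collective": "Permit",  # collective term suggested in this thread
    }
```

Unknown codes return `None` on purpose, so they surface for vocabulary review instead of being silently mislabelled.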
I was getting a bit lost with the thread. I've tabularised the comments so far and added a few thoughts @DavidCrosswellGSQ
QDEX | New | Review - VK | Review - AI | Reply - DC | Review AT | Reply - VK |
---|---|---|---|---|---|---|
Report Title | Title | Makes sense | | | Yes | YES |
Report Type | Resources Industry Report Type | yes | So we're running with an entirely separate vocab for non-resource industry reports? I'm perfectly okay with keeping "Report Type" | | Report type is fine, not all the reports are resources industry reports. | YES - but change back to Report Type |
Author Name | dct:creator | yes | yes | | See discussion in email regarding whether this should be author or company | Opinion - Who wrote or compiled Report |
Lodger | Not required | yes | yes | | Should be recorded, but doesn't need to be searchable or displayed. Has proven useful where there have been issues noted in the past | Opinion - Who (person) lodged the report |
Submitter | System recorded from logged-in user | yes | yes | | So Lodger and Submitter are being merged? Currently submitter is the Company, where Lodger is the person physically (digitally) submitting the report | Opinion - Company who submits report |
Locality | dct:Location | yes | yes | We need to decide on the geometry that we are going to capture for industry reports. The current form captures both locality and map sheet. When I had a brief look at the data yesterday, the locality was very broad (there was even "Queensland" listed). For the reports now in QDEX Reports that we will migrate to the new system, how about: If the report is a permit-based report, we create the geometry based on the permit shape at the date of lodgement of the permit. If the report is not a permit-based report, then we: a. If a shape file has been submitted with the report, then we use that. @johnkirsten can you please check to see if any have been submitted. b. Else, we create a shape based on the map sheet(s). @johnkirsten can you please check if there is a 1:1 or 1:* report to map sheet captured. For new report lodgement, @KellyVance did you already have a plan to capture coverage of the report? | yes | Reports by industry are always going to be associated with a specific activity or a group of activities on a permit. As far as their location represented at surface, i.e.: wells - point; 2D seismic - set of lines; 3D seismic - polygon; tenure-based report - polygon of specific permit at time of report. For non-industry reports we should have some kind of polygon or maximum extent of the activity or study being done. For GA/CSIRO/Academic reports they may cross state boundaries. If we have nothing we should do our best to assign either Queensland or Australia... but this should be a final resort. I'm not sure of anything other than the actual mapsheet itself that should reference the Mapsheet extent as its primary spatial representation; it should really just be an intersect. |
Map References | Not required - can be derived using spatial intersect | yes | yes | | Could be a useful QA for the intersect process | See above. Just use an intersect. |
Commodity | Not required | Infer from report content, but is essential info | Ideally derived from content, but heavily dependent on the capability of the final product. I'd be fine with it being a manually entered field. | Ok to put back into model. | Can be inferred from tenure, but not completely, and depends on the granularity. Inferred from content would be great, but difficult for scanned reports. | Reintroduce |
Keywords | Not used | yes... but there are use cases where keywords are valuable, so this is dependent on what level of detail within the data can be interrogated during search | I can see a lot of uses for this where there aren't implicit or inherited connections. This cannot be a manually populated field, and the majority of the important information should be collected explicitly in other fields, but, maybe as a result of machine learning we populate a list of keywords? Is there something fancy that replaces this functionality that I'm not remembering/aware of? | We can create a controlled list of keywords, e.g. by starting with the keywords currently populated in QDEX Reports. When I had a quick look at the database records yesterday, there appeared to be a lot of keyword stuffing and some of the keywords were duplicative of the report type. Also, if we capture 'Earth Science Data Categories' on the submission form, this may obviate the need for some keywords. Let's do some better analysis of the QDEX Reports keywords and look at their usefulness. Let's decide on what is suitable as a keyword and what should be core metadata for the report. | Keywords are definitely used when searching The keywords are an opportunity to tag the report with broad content categories that are more granular than the report type. For example, I don’t want to search for all well completion reports, I want to find the ones with core logs, I would use the keywords. Challenge is sorting out how to apply relevant keywords without having to read through > 100,000 reports… | Requires more discussion. Keywords may describe occurrences of material NOT captured as the commodity, so there is some use. But searching for reports with core (as an example) should directly look up a register of core rather than via a keyword. And within that register there should be a flag for whether that core has been geologically logged. |
Tenure | Queensland Mining Permit | yes | No. There's official language regarding this that we should toe the line on. | Can you please provide the document that defines the "official language regarding this that we should toe the line on"? When I spoke with Jodie Hendey, she said that they tried to bring in "Resource Authority" but it never stuck. She said to use "permit". Years ago, they dropped "tenure" as it implied ownership of the land (at the time of contention regarding CSG and farm land access). To my knowledge, there is no collective term in the legislation; it refers to the specific permits, leases, authorities, etc. | Isn't Queensland Mining Permit only one type of tenure? Or have they tried to make it one type of tenure? Will need translations between EPP, A-P, exploration permit, mining permit, mining tenure, mining lease, EPM, MDL, et al. There is, or should already be, a vocab we can use adapted from the current classification system. The reason that terminology didn’t stick is that the project didn’t go ahead (for a variety of reasons). P&G companies definitely still use tenure. Personally not sure if minerals and coal do, but I wouldn’t be surprised. Permit is ok, but you’ll need to be able to translate it for use. Also see ATP, EPP, A-P, ML, PL, MDL, EPC, EPM, … I don’t like the use of ‘mining permit’, as it is restrictive to certain sections of industry as well as really only referring to one (maybe two) types of tenure – the ML and the MDL. | Opinion - change to just 'Permit' |
Tectonic | Not required - can be derived using spatial intersect | yes | yes | | Can be implied from spatial intersect, but spatial intersect doesn’t deal with depth relationship | Spatial x (Age or Strat) should provide Tectonic in most cases |
Stratigraphy | Not required - can be derived using spatial intersect | No. Not necessarily a spatial intersect. But should be captured in report data templates | Not necessarily spatial, at time of report submission they may in fact be defining new strat, or redefining old strat, with historical reports they refer to strat which has in fact been 'moved' since writing. | So, will this be a vocabulary? or is there already a trusted source for this data (e.g. in DMEGeo?). What is the cardinality between report and stratigraphy? Does there need to be functionality for the user to create new stratigraphy? Or is this controlled? | Stratigraphy can't be derived from spatial intersect, but could be gathered from other parts of a report going forward. Stratigraphy - so, will this be a vocabulary? or is there already a trusted source for this data (e.g. in DMEGeo?). A report will contain the section of stratigraphy that it has intersected. Stratigraphic units should be as per the ASUD (which I think was being used for the vocab?), but there should probably be some aspect of a company able to propose new units, or report on subdivisions of a unit (for petroleum wells, a particular section of reservoir within a formation; for coal – I’m not sure if the coal seams are listed as official stratigraphic units). So, should be controlled, but with a function for a user to propose new stratigraphy. | Stratigraphy is a one-to-many. One report may have as many stratigraphic units as necessary. I would keep stratigraphy as only ASUD formal units. Another layer down may include reservoir/target/marker units that may be informal or locality/project specific names. |
Age | Not required - can be derived using spatial intersect | No. but can be linked to via stratigraphy | Inherited from strat/basin | | Can be derived from stratigraphy and should be associated with stratigraphy - keep in mind this could be a very wide range | Derivable from strat, geochron etc. |
Date of Report | time:ProperInterval | yes | yes | | yes | YES |
Date of Receipt | System recorded dct:created | yes | yes | | yes | YES |
Project Names | Not recorded | Probably useful data. Should be derivable from report content | Highly relevant for minerals | to my understanding, there is no controlled list of project names (the Coal Hub manages their own list of project names). So, will this be free text? How do we get data integrity? Ideally, we would have linkage between projects and their permits. | Well names important, not sure about seismic name, should be using it for minerals and coal and potentially for some of the CSG fields where naming conventions have shifted over time (e.g. Fairview to FV). No reason why we can’t adopt the coal hub projects. Can see a use for this in grouping coal seam gas projects where well names have changed over time (e.g. Fairview to FV) | Not sure how minerals works but my impression is that it is important information. Coal Hub appears to have a list; can we help them vocabularise this list? For petroleum I would derive it from something like field/prospect/lead name, as that is typically close to the intent here, e.g. Fairview, Arcadia, Spring Gully, Combabula, etc. |
Mines/Prospect Names | Feature of Interest | yes? maybe? | I guess? We could probably roll up Project Names and Well Names into this tag, but it feels a bit grab-bag-y | is this then two separate metadata fields? Are these controlled lists? I would have thought so for mines, but new prospect names could be created? | | By Nick's explanation this is definitely not a Feature of Interest by our definition. FOI would be a basin, or rock unit, etc. I'm not sure of the intent of this field and what is trying to be described. |
Well Names | Not recorded | link to GSQ Borehole Profile via Spatial intersect | Very relevant to non-industry reports, some industry reports | is this for P&G only? Will these well names already be in our borehole register? If not, the user would free-text the entry. But I would think we would want to have integrity of the well - entering in the core metadata in our borehole data model. For P&G reports, the report itself will contain the well names. | This is the major link point to the well at the moment (is used for the QDEX Data to QDEX reports link to the best of my knowledge). Currently useful as it is as close to searching by UWI as you can get in QDEX Reports, as the title is a string search and suffers from inconsistent naming. Nope. Also strat, with potential for other commodities against sampling reports. They should already be in the borehole register. And yes, the report will contain the well name, but this is the report metadata and the well should be attributed to the report. | Reports should be linked directly with the well and borehole entities relevant to them. If this IS to be included, well names should be selectable from a list and NEVER input as free text. Free texting well names never ends well. |
Seismic Survey Names | GSQ Survey Profile | yes | yes | | What is the survey profile? | See above. If a seismic survey is relevant to a report it should have a direct and explicit link to the actual survey. |