mwengren closed this issue 3 years ago
@mwengren does that mean IOOS Metadata Profile v1.2 is official?
@Bobfrat we're still working out the details of the QARTOD and GTS ingest attributes, but otherwise, yes, everything else is finalized. 'infoUrl' is something we changed recently to match ERDDAP's existing infoUrl attribute.
If you want to see the current working version of the 1.2 profile, it is in my own fork here: https://mwengren.github.io/ioos-metadata/ioos-metadata-profile-v1-2.html.
These are the still pending sections:
Hopefully, we'll end up using the attributes in those first two tables (building off of existing CF ones for quality control via 'status_flag'), but we're going to reach out to the CF group first to make sure it's in line. Essentially we're adding 'flag_method' to go alongside 'flag_values' and 'flag_meanings'. 'flag_method' will have a vocabulary of QARTOD test names.
Unfortunately not quite ready to add to CC, but close. Please compare against Glider DAC rules when you can.
cc @jessicaaustin @kwilcox
@benjwadams This one has been idling for a while, but we should pick it back up again.
As a reminder, the purpose here is to query an ERDDAP dataset for particular attributes (global only, I believe) as part of the harvest process with the purpose of adding the values as 'extras' inside CKAN's package.extras table (thereby exposing them as part of the API for clients).
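A minimal sketch of the attribute lookup described above, assuming the harvester reads ERDDAP's per-dataset info response in its JSON form (a table with "Row Type", "Variable Name", "Attribute Name", "Data Type" and "Value" columns, where global attributes are listed under NC_GLOBAL). The function names and the short attribute list are illustrative only; the authoritative list is the one at the top of this issue.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Illustrative subset of attribute names mentioned in this thread (assumption:
# the real harvester would use the full list from the issue description).
IOOS_ATTRIBUTES = ("gts_ingest", "ioos_ingest", "wmo_platform_code", "platform", "infoUrl")


def info_url(server, dataset_id):
    """Return the ERDDAP info URL (JSON form) for one dataset."""
    return f"{server.rstrip('/')}/info/{quote(dataset_id)}/index.json"


def global_attributes(info, wanted=IOOS_ATTRIBUTES):
    """Pick the wanted global attributes out of a parsed ERDDAP info response."""
    table = info["table"]
    cols = {name: i for i, name in enumerate(table["columnNames"])}
    found = {}
    for row in table["rows"]:
        # Skip variable rows and variable-level attributes; keep globals only.
        if row[cols["Row Type"]] != "attribute":
            continue
        if row[cols["Variable Name"]] != "NC_GLOBAL":
            continue
        name = row[cols["Attribute Name"]]
        if name in wanted:
            found[name] = row[cols["Value"]]
    return found


def fetch_global_attributes(server, dataset_id):
    """Hypothetical network wrapper around the two helpers above."""
    with urlopen(info_url(server, dataset_id), timeout=30) as resp:
        return global_attributes(json.load(resp))
```

Keeping the parsing separate from the HTTP call makes it easy to test against a saved info response without hitting a live ERDDAP server.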
The main use case to keep in mind is:
ioos_ingest to determine whether to try to harvest a dataset or not). Other attributes, such as the global platform attribute, might be useful for labeling as well, but are not included in the ISO XML produced by ERDDAP.
The only relevant change to the IOOS Metadata Profile since we first created this is the addition of the ioos_ingest attribute, which I added to the list at the top of the issue.
@benjwadams
Let's discuss picking this idea back up again at our next meeting. I updated the first entry in this issue with the appropriate list of IOOS attributes to read.
We're going to try to integrate Catalog with the harvesting workflow envisioned for Sensor Map and ERDDAP, and reading these attributes directly from ERDDAP as part of the harvest workflow will be necessary.
Related to issue #227 in that we'll need to properly identify datasets with ERDDAP endpoints.
I've tested this using both the Catalog UI and CKAN API and filtering on custom IOOS attributes works great!
For example, to filter by both gts_ingest=true and ioos_ingest=true:
Catalog UI: https://data.ioos.us/dataset?q=gts_ingest%3Atrue+ioos_ingest%3Atrue
CKAN API: https://data.ioos.us/api/3/action/package_search?fq=gts_ingest:true%20ioos_ingest:true&start=0
The actual attribute values for each dataset are all available in the CKAN API JSON results under the extras field.
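For clients reading those results, here is a rough sketch of pulling the flags back out. It assumes the standard CKAN package_search response shape (result.results, each package carrying an extras list of key/value pairs); the helper names are made up for illustration.

```python
def extras_as_dict(package):
    """Flatten a CKAN package's extras list ([{"key": ..., "value": ...}]) into a dict."""
    return {item["key"]: item["value"] for item in package.get("extras", [])}


def ingest_flags(search_response):
    """Map dataset name -> (gts_ingest, ioos_ingest) for each search result.

    A flag is None when the dataset has no such extra.
    """
    flags = {}
    for package in search_response["result"]["results"]:
        extras = extras_as_dict(package)
        flags[package["name"]] = (extras.get("gts_ingest"), extras.get("ioos_ingest"))
    return flags
```

Note that the values come back as strings ("true", "false"), so any comparison on the client side should be against strings rather than booleans.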
Also, the IOOS Metadata Profile 1.2 custom attributes tab at the bottom of the dataset detail page looks quite nice as well. Thanks @benjwadams!
Pinging @jessicaaustin as we should discuss having Axiom test using the CKAN API to filter for ERDDAP datasets to ingest into Sensor Map. We may need to re-address guidance for RAs on the ioos_ingest default value, however, as part of that.
Closing the issue as functionality is implemented, but we can continue discussion here if necessary.
Also, note that you can issue 'not equal to' queries as well, so the example above can be adapted to find cases where gts_ingest = true and ioos_ingest != false, which matches our Profile 1.2 guidance for any datasets RAs would like to have included in both the GTS and IOOS products:
UI: https://data.ioos.us/dataset?q=gts_ingest%3Atrue+-ioos_ingest%3Afalse
CKAN API: https://data.ioos.us/api/3/action/package_search?fq=gts_ingest:true%20-ioos_ingest:false&start=0
937 datasets at present.
Noting another query possibility: datasets with gts_ingest=true that lack a global wmo_platform_code value per the IOOS Metadata Profile 1.2 guidance:
UI: https://data.ioos.us/dataset?q=gts_ingest%3Atrue+-wmo_platform_code%3A[*+TO+*]
CKAN API: https://data.ioos.us/api/3/action/package_search?fq=gts_ingest%3Atrue+-wmo_platform_code%3A[*+TO+*]
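The three query styles used in this thread (equal, not equal, and missing value) can be assembled with a small helper. This is only a sketch: build_fq and search_url are hypothetical names, and the percent-encoding produced by urlencode differs slightly from the hand-written URLs above (brackets and asterisks are escaped), which should decode to the same filter string.

```python
from urllib.parse import urlencode

API = "https://data.ioos.us/api/3/action/package_search"


def build_fq(equal=None, not_equal=None, missing=()):
    """Build a filter string from the three clause types used in this thread."""
    # field:value            -> dataset has this value
    clauses = [f"{k}:{v}" for k, v in (equal or {}).items()]
    # -field:value           -> dataset does not have this value
    clauses += [f"-{k}:{v}" for k, v in (not_equal or {}).items()]
    # -field:[* TO *]        -> dataset has no value at all for this field
    clauses += [f"-{k}:[* TO *]" for k in missing]
    return " ".join(clauses)


def search_url(fq, start=0):
    """Return a package_search URL with the filter string percent-encoded."""
    return f"{API}?{urlencode({'fq': fq, 'start': start})}"
```

For example, `search_url(build_fq(equal={"gts_ingest": "true"}, missing=["wmo_platform_code"]))` builds the query for GTS datasets that still need a WMO platform code.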
@benjwadams Per meeting this morning, here's a list of important global dataset attributes from the IOOS Metadata Profile 1.2 to parse for use in CKAN (filtering, API access, and encoding in Schema.org JSON LD).
Some of these may already be added to the ISO XML by ERDDAP (and ncISO/THREDDS). I didn't cross-reference against that when making this list, so we may end up with some repeated information, but we can ignore that for now.
For Sensor Map/NDBC ingest (only the global attributes, not variable-level equivalents):
Other attributes (all global)
Each of these global attributes can just be stored as an individual 'extra' value in the CKAN database, and therefore should be exposed via the CKAN API package_search or other endpoints. PacIOOS results, for example: https://data.ioos.us/api/3/action/package_search?q=organization:pacioos%20and%20res_format:ERDDAP-TableDAP%20and%20cf_standard_names:sea_water_turbidity%20and%20gcmd_keywords:%22EARTH%20SCIENCE%20%3E%20OCEANS%20%3E%20OCEAN%20CHEMISTRY%20%3E%20OXYGEN%22&start=0
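To illustrate the "one extra per attribute" idea, here is a sketch of how harvested attributes might be folded into a package's extras list. It assumes extras are a list of key/value dicts with string values and that each key should appear only once; merge_extras is a hypothetical helper, not something taken from the harvester code.

```python
def merge_extras(existing, attributes):
    """Merge harvested global attributes into an existing CKAN extras list.

    Harvested values win over existing ones with the same key, and every
    value is stored as a string. The result is sorted by key so repeated
    harvests produce a stable list.
    """
    merged = {item["key"]: item["value"] for item in existing}
    merged.update({key: str(value) for key, value in attributes.items()})
    return [{"key": key, "value": value} for key, value in sorted(merged.items())]
```

Overwriting by key, rather than appending, avoids ending up with two entries for the same attribute when a dataset is re-harvested.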