Closed mwengren closed 3 years ago
Differentiation between resource types for ERDDAP appears to be done using the URL for most endpoints. There is an ERDDAP-griddap branch in the logic, but it doesn't look like it's being reached.
Is there some good ISO metadata that could be used to test the griddap case?
https://data.ioos.us/dataset/wavewatch-iii-ww3-mariana-regional-wave-model
If you click the link above and navigate to the ERDDAP resource, it actually is an ERDDAP griddap endpoint. Looking at the metadata, there's "ERDDAP-griddap" in there as well. I'll have to do a one-off harvest run of this dataset and see what's happening in the code.
@benjwadams Yes, PacIOOS has a number of these. But they all appear as 'res_format=ERDDAP' in API queries.
Here's another example from PacIOOS, but I think they're pretty consistent: https://data.ioos.us/dataset/surface-currents-from-a-diagnostic-model-scud-pacific
Source record: https://registry.ioos.us/waf/PacIOOS/a9c2ee3bb3da2bfef898f295b29b6386966a81bd.xml
@benjwadams Both the PacIOOS and NERACOOS datasets have similar values for griddap services in the <gmd:protocol> tags. So we should be able to match on that, similarly to the way the code currently looks for ERDDAP:tabledap.
NERACOOS (https://data.ioos.us/dataset/bio-ww-iii-latest-forecasts-east-coast0e806) uses the default ERDDAP ISO metadata, whereas PacIOOS (https://data.ioos.us/dataset/surface-currents-from-a-diagnostic-model-scud-pacific) generates its own, but both have identical elements like the following, which make it possible to distinguish griddap services:
<gmd:protocol>
<gco:CharacterString>ERDDAP:griddap</gco:CharacterString>
</gmd:protocol>
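For testing against records like these, the protocol values can be pulled out of an ISO record with the standard library. This is only a sketch: the function name `online_resource_protocols` and the sample URL are made up for illustration, and the harvester itself may read these fields differently.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ISO 19139 metadata records.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def online_resource_protocols(xml_text):
    """Return (url, protocol) pairs for each CI_OnlineResource in the record.

    Hypothetical helper for inspecting records; not part of the catalog code.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for res in root.iter("{%s}CI_OnlineResource" % NS["gmd"]):
        url = res.findtext("gmd:linkage/gmd:URL", default="", namespaces=NS)
        protocol = res.findtext(
            "gmd:protocol/gco:CharacterString", default="", namespaces=NS
        )
        pairs.append((url.strip(), protocol.strip()))
    return pairs


# Minimal made-up fragment; the URL is a placeholder, not a real endpoint.
SAMPLE = """<gmd:CI_OnlineResource
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:linkage><gmd:URL>https://example.org/erddap/griddap/sample</gmd:URL></gmd:linkage>
  <gmd:protocol><gco:CharacterString>ERDDAP:griddap</gco:CharacterString></gmd:protocol>
</gmd:CI_OnlineResource>"""

pairs = online_resource_protocols(SAMPLE)
```

Running it over one of the WAF records linked in this thread should list every OnlineResource with its protocol, which makes it easy to spot griddap entries.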
I think we can simplify the whole block of code you linked to above to the following:
if resource['resource_locator_protocol'] == 'OPeNDAP:OPeNDAP':
    resource['format'] = 'OPeNDAP'
elif resource['resource_locator_protocol'] == 'ERDDAP:tabledap':
    resource['format'] = 'ERDDAP-TableDAP'
elif resource['resource_locator_protocol'] == 'ERDDAP:griddap':
    resource['format'] = 'ERDDAP-GridDAP'
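The same mapping could also be written as a dictionary lookup, which keeps the protocol strings in one place and leaves other resources untouched. A minimal sketch, assuming `resource` is a plain dict as in the snippet above; `PROTOCOL_TO_FORMAT` and `set_format_from_protocol` are illustrative names, not existing catalog code.

```python
# Exact-match mapping from gmd:protocol values to catalog format labels.
PROTOCOL_TO_FORMAT = {
    "OPeNDAP:OPeNDAP": "OPeNDAP",
    "ERDDAP:tabledap": "ERDDAP-TableDAP",
    "ERDDAP:griddap": "ERDDAP-GridDAP",
}


def set_format_from_protocol(resource):
    """Set resource['format'] only when the protocol is a known exact match."""
    protocol = resource.get("resource_locator_protocol")
    fmt = PROTOCOL_TO_FORMAT.get(protocol)
    if fmt is not None:
        resource["format"] = fmt
    return resource


griddap = set_format_from_protocol({"resource_locator_protocol": "ERDDAP:griddap"})
other = set_format_from_protocol({"resource_locator_protocol": "OGC:WMS", "format": "WMS"})
```

Because the lookup is an exact match, a resource whose protocol is not in the table keeps whatever format it already had.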
As it is currently, it's too greedy in classifying ERDDAP OPeNDAP:OPeNDAP resources as format: ERDDAP rather than OPeNDAP.
I think it should work better this way, but we'll need to test on the dev Catalog instance first to view the changes.
Implemented in https://github.com/ioos/catalog-ckan/commit/ae3f87910b07432ba6dd961b8b52c100c510d18e . Going to see how this works on staging and then deploy to production.
@benjwadams I think we will want to wipe all of these lines out as well:
These are too greedy about labeling things as ERDDAP or ERDDAP-TableDAP when in fact, for ERDDAP-generated ISO records, they should actually just be labeled OPeNDAP instead.
This NERACOOS record is a good example:
https://registry.ioos.us/waf/NERACOOS/WW3_EastCoast_latest_iso19115.xml
Both of the relevant OnlineResource elements actually point to the exact same URL, but one is labeled OPeNDAP:OPeNDAP and the other is ERDDAP:griddap in the <gmd:protocol> elements. Your code will label the first ERDDAP and the second ERDDAP-GridDAP. Can you take a look?
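To make the problem concrete, here is a small sketch of that situation: two resources share one URL and differ only in protocol. The URL is a placeholder, and `format_from_url` is an assumed stand-in for any URL-based check, not the actual catalog code.

```python
# Placeholder URL for illustration only; both entries share it.
URL = "https://example.org/erddap/griddap/WW3_EastCoast_latest"

resources = [
    {"url": URL, "resource_locator_protocol": "OPeNDAP:OPeNDAP"},
    {"url": URL, "resource_locator_protocol": "ERDDAP:griddap"},
]


def format_from_url(resource):
    """Assumed stand-in for a URL-substring check; it cannot tell the two apart."""
    return "ERDDAP" if "erddap" in resource["url"].lower() else None


def format_from_protocol(resource):
    """Exact match on the protocol value from <gmd:protocol>."""
    return {
        "OPeNDAP:OPeNDAP": "OPeNDAP",
        "ERDDAP:griddap": "ERDDAP-GridDAP",
    }.get(resource["resource_locator_protocol"])


url_labels = [format_from_url(r) for r in resources]
protocol_labels = [format_from_protocol(r) for r in resources]
```

Any check that looks at the URL gives both resources the same label, while matching on the protocol separates them into OPeNDAP and ERDDAP-GridDAP.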
Added in feaef0364f8598ff3c8b549a14a47b369d0f1ad0
Implemented
IOOS Catalog includes filters for formats of type ERDDAP-TableDAP and ERDDAP currently. We should add the ability to parse the equivalent ERDDAP-GridDAP format.
ERDDAP datasets include metadata with CI_OnlineResource elements of gmd:protocol=ERDDAP:tabledap, for example from PacIOOS: https://data.ioos.us/dataset/aloha-cabled-observatory-aco-acoustic-doppler-current-profiler-adcp-velocity, corresponding to ISO XML file: https://registry.ioos.us/waf/PacIOOS/da9a05ec60da11fc782909557ff5f926a73a14d6.xml
Similarly, this NERACOOS record has the equivalent gmd:protocol=ERDDAP:griddap: https://data.ioos.us/dataset/bio-ww-iii-latest-forecasts-east-coast0e806 and XML: https://registry.ioos.us/waf/NERACOOS/WW3_EastCoast_latest_iso19115.xml
Let's figure out where the code differs in treating each type and add the same parsing for the griddap type.