Ecotrust / wc-data-registry

Data Registry for the West Coast Governors Alliance

WAF attribute displays bad #83

Closed fishytodd closed 10 years ago

fishytodd commented 10 years ago

The registered WAFs do not seem to be pulling any fields from the metadata other than title and abstract. This includes the Federal data. The metadata appears to be complete in the Geoportal admin.

tchaddad commented 10 years ago

More info on this problem from @emiliom (Dec 18, 2013):

"I've done some digging. Using the Geoportal interface (http://wcgardf.sdsc.edu/geoportal/), I confirmed that the WAF-based metadata records are in fact stored in GeoPortal. I did a spot check on ~ 4 records. For the WCGA Marine Debris database, here's the styled metadata document as presented by Geoportal:

http://wcgardf.sdsc.edu/geoportal/catalog/search/resource/details.page?uuid={C261595B-BB26-421D-A078-9809DC418569}

And the metadata XML file, also served by Geoportal: http://wcgardf.sdsc.edu/geoportal/rest/document?id={C261595B-BB26-421D-A078-9809DC418569}

In contrast, when you go to the same dataset in the WCODP: http://portal.westcoastoceans.org/discover/#?text=marine%20debris

the JSON and Metadata XML links look like this (here's the XML link): http://portal.westcoastoceans.org/geoportal/rest/document?id=http://apps.ecotrust.org/marine_debris/metadata/marine_debris_metadata.xml

which doesn't follow the pattern used for all non-WAF records; here's an example from a different, non-WAF record: http://portal.westcoastoceans.org/geoportal/rest/document?id={AFF22EF6-49D8-4867-B6C6-CD80020AB36B}

Note the form of the request in that link: the id parameter is followed by a globally unique ID string, not the URL for the original WAF metadata record.

It's possible GeoPortal WAF harvesting generates two types of identifiers, and that leads to confusion and errors by the portal code. I've seen some instances where more than one identifier is associated with a catalog record, and don't understand how that happens. But it's also possible that the portal code is doing something different and wrong for WAF records. My guess is that it's the former, but I can't really sort that out myself. Either way, I suspect the fix won't be too onerous."
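To make the difference between the two link forms described above concrete, here is a minimal Python sketch that classifies the `id` parameter of a `rest/document` link as either a braced UUID or a raw URL. The function name `classify_document_link` and the return labels are hypothetical and are not taken from the portal code; it only illustrates the check, not how the portal actually builds its links.

```python
import re
from urllib.parse import urlparse, parse_qs

# Braced UUID in the form used by the non-WAF links, e.g. {AFF22EF6-...}
UUID_ID = re.compile(r"^\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$")


def classify_document_link(url):
    """Return 'uuid', 'url', or 'unknown' for the id parameter of a link.

    Hypothetical helper for illustration only.
    """
    query = parse_qs(urlparse(url).query)
    doc_id = query.get("id", [""])[0]
    if UUID_ID.match(doc_id):
        return "uuid"
    if doc_id.lower().startswith(("http://", "https://")):
        return "url"
    return "unknown"


non_waf = ("http://portal.westcoastoceans.org/geoportal/rest/document"
           "?id={AFF22EF6-49D8-4867-B6C6-CD80020AB36B}")
waf = ("http://portal.westcoastoceans.org/geoportal/rest/document"
       "?id=http://apps.ecotrust.org/marine_debris/metadata/marine_debris_metadata.xml")

print(classify_document_link(non_waf))
print(classify_document_link(waf))
```

Running a check like this over the links the portal generates would be one quick way to list every record whose link carries the WAF file URL instead of a UUID.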

emiliom commented 10 years ago

To add to this: here's the UUID for a record that looks and behaves "normal": {BED99236-D51D-443A-AFD6-6164D94A896C}. From a CSW request, one identifier (dc:identifier element) is returned, labelled as having scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:DocID".

For WAF-derived records that behave strangely (like the WCGA Marine Debris Database in the previous comments, or others), a CSW request returns two identifiers, of two types: one like the one above, with a UUID in that form; and one like this: scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:FileID". The latter is almost always an http string. For the Marine Debris Database, the portal Metadata XML link leads nowhere (basically an error). For the record "CMSP/Shipping_Lanes", the CSW response also has two identifiers, but the portal Metadata XML link does lead to an XML document (not an error); except it's an RDF/Dublin Core XML, not an FGDC or ISO, plus the record content text is pretty much crap.
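As a rough illustration of the two-identifier situation, the Python sketch below reads the dc:identifier elements from a CSW record and always picks the DocID value when building the Metadata XML link. The sample record is hand-written for illustration from the scheme names and values quoted in this thread; it is not an actual CSW response. The function names and the `base_url` argument are hypothetical, and this is not a claim about what the WCODP code currently does.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DOC_ID_SCHEME = "urn:x-esri:specification:ServiceType:ArcIMS:Metadata:DocID"
FILE_ID_SCHEME = "urn:x-esri:specification:ServiceType:ArcIMS:Metadata:FileID"

# Hand-written sample (an assumption), not a captured CSW response.
SAMPLE_RECORD = """<csw:Record
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:FileID">http://apps.ecotrust.org/marine_debris/metadata/marine_debris_metadata.xml</dc:identifier>
  <dc:identifier scheme="urn:x-esri:specification:ServiceType:ArcIMS:Metadata:DocID">{C261595B-BB26-421D-A078-9809DC418569}</dc:identifier>
</csw:Record>"""


def identifiers_by_scheme(record_xml):
    """Map each dc:identifier scheme attribute to its text value."""
    root = ET.fromstring(record_xml)
    return {
        el.get("scheme"): (el.text or "").strip()
        for el in root.iter(f"{{{DC_NS}}}identifier")
    }


def document_link(base_url, record_xml):
    """Build a rest/document link from the DocID, ignoring any FileID."""
    doc_id = identifiers_by_scheme(record_xml).get(DOC_ID_SCHEME)
    if doc_id is None:
        return None  # no DocID present; caller decides how to handle it
    return f"{base_url}/rest/document?id={doc_id}"


link = document_link("http://portal.westcoastoceans.org/geoportal", SAMPLE_RECORD)
print(link)
```

Selecting by scheme rather than by position means the order in which the two identifiers appear in the response does not matter.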

Tanya has some additional diagnostics that should be helpful. But basically, it appears that in many of the WAF cases, an appropriate metadata XML (FGDC or ISO) does exist and is accessible from the GeoPortal GUI, but the WCODP portal is scrambling it due to the existence of two identifiers. We're hoping this is a bug in the WCODP code that can be fixed somewhat easily and would give a big payoff right away.

(BTW, I realize the CSW response is not relevant per se, but I use it as an indicator of what's going on, since I have access to it).

Thanks!