florianm closed this issue 8 years ago
python bin/pycsw-admin.py -c post_xml -u http://localhost:8000/pycsw/csw.py -x /usr/lib/ckan/default/src/pycsw/post-slip-classic-rp.xml
post-slip-classic-rp.xml https://gist.github.com/keithmoss/888dcfef3a53ffead18e
fixed "clear harvest source" bug and submitted as https://github.com/ckan/ckanext-harvest/pull/152
Would need to dedupe SLIP Classic WMS/WFS because they're separate services.
See https://www2.landgate.wa.gov.au/web/guest/57;jsessionid=39A845177E1155A4CAA09A7D0A33E450 and https://www2.landgate.wa.gov.au/web/guest/subscription1.
A couple of days' work for Florian to create a small custom harvester if there's no nice way of doing it via PyCSW.
incited datacats to provide their own docker image for pycsw for simplicity. https://github.com/datacats/datacats/issues/301
This was the code I commented out to hack around the validation issues (just to get something working)
https://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/harvesters/base.py#L474-L482
The simple fix for a preview of course is to show SLIP as one single dataset with a working preview. This requires Keith's magic reverse proxy which provides authentication using an existing account to access the public SLIP WMS.
e.g. on the home page (layout 1): http://catalogue-beta.data.wa.gov.au/ as "featured resource"
or as resource http://catalogue-beta.data.wa.gov.au/dataset/slip-classic/resource/fa1652e0-9782-40d1-b74e-342c367cc8a7
SLIP Classic harvesting works (Python script). Some fields need fine tuning. Will add WMS url as resource.
SLIP Classic harvesting now adds WMS URLs as resources. @keithm should this be the proxied URL or this one: https://www2.landgate.wa.gov.au/ows/wmspublic ?
SLIP Future, ArcGIS REST endpoint harvester: https://github.com/GSA/ckanext-geodatagov
How much work would be required to harvest from multiple SLIP endpoints and dedupe so datasets present with multiple resources?
e.g. LGATE-001 - Cadastre (No Attributes) will be present in wmspublic, wmsCsCadastre, and wfsCsCadastre
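One way to think about the dedupe step, as a minimal sketch only: group harvested layers by their SLIP ID so each dataset ends up with one resource per endpoint. The record shape `(endpoint_url, layer_id, title)` and the function name `merge_layers` are assumptions for illustration, not what the harvesting script actually uses.

```python
def merge_layers(harvested):
    """Group (endpoint_url, layer_id, title) records into one dataset per layer_id.

    Hypothetical record shape; the first title seen for a layer_id wins.
    """
    datasets = {}
    for endpoint_url, layer_id, title in harvested:
        dataset = datasets.setdefault(
            layer_id, {"id": layer_id, "title": title, "resources": []}
        )
        # Skip an endpoint already attached to this dataset.
        if endpoint_url not in dataset["resources"]:
            dataset["resources"].append(endpoint_url)
    return datasets


harvested = [
    ("https://www2.landgate.wa.gov.au/ows/wmspublic", "LGATE-001", "Cadastre (No Attributes)"),
    ("https://www2.landgate.wa.gov.au/ows/wmsCsCadastre", "LGATE-001", "Cadastre (No Attributes)"),
    ("https://www2.landgate.wa.gov.au/ows/wfsCsCadastre_4283", "LGATE-001", "Cadastre (No Attributes)"),
]
merged = merge_layers(harvested)
```

With the LGATE-001 example above, `merged` would hold a single dataset carrying three resource URLs.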
Consider how we handle the user experience of accessing a WMS link and getting prompted for auth.
Consider how WMS/WFS preview are done given auth requirement. Is preview desirable or required?
We'll run it once and never again. After that we can edit to our heart's content.
Otherwise it looks like Landgate is the custodian.
Keith to provide words that briefly explain SLIP and talk about access, having to sign up, et cetera.
Remove extraneous information from SLIP harvested layer names.
e.g.
Public Transport Authority Services (Pta-007) (24-08-2015 12:50:09)
should be
Public Transport Authority Services
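A rough sketch of how that cleanup could look, assuming every layer name follows the pattern `<title> (<Agency-NNN>) (<DD-MM-YYYY HH:MM:SS>)` as in the example above. The function name `clean_layer_name` and the upper-casing of the ID are assumptions; names that don't fit the pattern are returned unchanged.

```python
import re

# Assumed suffix: " (Pta-007) (24-08-2015 12:50:09)" at the end of the name.
_SUFFIX = re.compile(
    r"\s*\((?P<layer_id>[A-Za-z]+-\d+)\)"
    r"\s*\(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\)\s*$"
)


def clean_layer_name(raw):
    """Return (title, layer_id); layer_id is None when the suffix is absent."""
    match = _SUFFIX.search(raw)
    if match is None:
        return raw.strip(), None
    title = raw[:match.start()].strip()
    return title, match.group("layer_id").upper()


title, layer_id = clean_layer_name(
    "Public Transport Authority Services (Pta-007) (24-08-2015 12:50:09)"
)
```

Returning the ID alongside the title keeps it available for the dataset description or for deduping, even though it is dropped from the displayed name.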
Additionally:
Keith to supply agency acronym to full name mapping.
May need to be smart and default back to the acronym if not all mappings are available.
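The fallback could be as simple as a dictionary lookup that returns the acronym itself when no mapping exists. This is a sketch only: the two entries in `AGENCY_NAMES` are illustrative guesses based on names in this thread, and the real mapping is still to be supplied.

```python
# Hypothetical, incomplete mapping; placeholder until the real table arrives.
AGENCY_NAMES = {
    "LGATE": "Landgate",
    "PTA": "Public Transport Authority",
}


def agency_name(layer_id, names=AGENCY_NAMES):
    """Map the acronym prefix of an ID like 'PTA-007' to a full agency name.

    Falls back to the acronym when the mapping has no entry for it.
    """
    acronym = layer_id.split("-", 1)[0].upper()
    return names.get(acronym, acronym)
```

For example, `agency_name("PTA-007")` gives the full name, while an unmapped prefix such as `"XYZ-001"` comes back as `"XYZ"`.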
http://slip.landgate.wa.gov.au/Pages/Data-Dictionary.aspx
If not: Do by hand?
List of data dictionaries
Description template:
This dataset has been sourced from Landgate's Shared Location Information Platform (SLIP) - the home for Western Australian government geospatial data. Many of the datasets in SLIP are free and publicly available to users who simply [sign up for a SLIP account](https://www2.landgate.wa.gov.au/web/guest/request-registration-type).
Find out more about SLIP at [http://slip.landgate.wa.gov.au/](http://slip.landgate.wa.gov.au/).
{UNIQUE ID HERE}
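Filling the template per dataset might look like the sketch below. The blurb is shortened here, and the names `SLIP_BLURB` and `build_description` are assumptions for illustration.

```python
# Shortened version of the template text above; the unique ID is appended last.
SLIP_BLURB = (
    "This dataset has been sourced from Landgate's Shared Location "
    "Information Platform (SLIP) - the home for Western Australian "
    "government geospatial data.\n\n"
    "Find out more about SLIP at "
    "[http://slip.landgate.wa.gov.au/](http://slip.landgate.wa.gov.au/)."
)


def build_description(unique_id, blurb=SLIP_BLURB):
    """Return the shared SLIP blurb followed by the dataset's unique ID."""
    return "{0}\n\n{1}".format(blurb, unique_id)


description = build_description("LGATE-001")
```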
Suggest no landing page in the case of SLIP, as we already link to it.
TODO
https://www2.landgate.wa.gov.au/ows/wmspublic
https://www2.landgate.wa.gov.au/ows/wfspublic_4326
https://www2.landgate.wa.gov.au/ows/wfsCsAdmin_4283/wfs
https://www2.landgate.wa.gov.au/ows/wmsCsCadastre
https://www2.landgate.wa.gov.au/ows/wfsCsCadastre_4283
https://www2.landgate.wa.gov.au/ows/wmsCsMosaic
https://www2.landgate.wa.gov.au/ows/wfsCsTopo_4283
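For reference, a harvester would typically start by requesting each endpoint's capabilities document. The sketch below only builds the standard OGC `GetCapabilities` URL and does not fetch it, since the endpoints need SLIP credentials. Guessing the service type from `/wfs` in the path is an assumption based on the naming of the URLs listed above.

```python
from urllib.parse import urlencode


def guess_service(endpoint):
    """Guess 'WFS' or 'WMS' from the endpoint path (naming convention assumed)."""
    return "WFS" if "/wfs" in endpoint.lower() else "WMS"


def capabilities_url(endpoint, service=None):
    """Build an OGC GetCapabilities request URL for a WMS or WFS endpoint."""
    service = service or guess_service(endpoint)
    query = urlencode({"service": service, "request": "GetCapabilities"})
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + query


wms_url = capabilities_url("https://www2.landgate.wa.gov.au/ows/wmspublic")
wfs_url = capabilities_url("https://www2.landgate.wa.gov.au/ows/wfsCsAdmin_4283/wfs")
```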
Current sandbox: http://waitwaitboom.alpha.data.wa.gov.au/
Current "clean demo": http://catalogue.alpha.data.wa.gov.au/
resource title: OGC Web Map Service
Learn how to access this resource URL in a GIS (e.g. QGIS or ArcGIS) with your SLIP credentials.
wms/wfs public/cadastre are harvested into alpha, nmap preview on by default, zooms to wms layers and loads them automatically
harvesting to do: virtual mosaic (one layer), wfsCsAdmin (no corresponding WMS), wfsCsTopo (URL broken)
documenting to do: once harvesting is finished, clean up the IPython notebook and publish to alpha and GitHub
SLIP classic is harvested and resources are deduped. Harvesting script is now ready to accept SLIP Future layers.
👍
SLIP classic harvesting is at 90% - there are some layers with exceptional names that are not picked up by the harvesting script. As discussed, let's leave SLIP classic at that stage until further funding is secured for the last 10%.
Suggesting to close, feel free to re-open.
Implement WMS harvesting of SLIP endpoints following Keith's example