TerriaJS / nationalmap

Australia's NationalMap
https://nationalmap.gov.au

Surfacing more data from catalogue.data.wa.gov.au #848

Open keithmoss opened 5 years ago

keithmoss commented 5 years ago

Hi team,

We're beginning to harvest more third-party web GIS services into Western Australia's CKAN and would like to surface them in NationalMap. Historically, you've only been surfacing data from the central SLIP (slip.wa.gov.au) geospatial data services platform.

We've recently harvested a range of spatial services from Main Roads, who are hosting data on their own ArcGIS Server instances. There are several other agencies in a similar situation that we're planning to harvest this year, so we'd like to make sure these datasets will be able to show up in NationalMap.

To start with, can you remind us how much control we have over the config of the package_search query NatMap sends us[1]?

[1] https://nationalmap.gov.au/proxy/_1d/http://catalogue.beta.data.wa.gov.au/api/3/action/package_search?rows=100000&sort=metadata_created%20asc&start=0&q=data_homepage%3A*_Public_Services*&fq=res_format%3AWMS
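For readers following along, here is a minimal sketch that unpacks the query string of the URL in [1] into its individual package_search parameters. It only decodes the URL quoted above and makes no network calls; the helper name `package_search_params` is hypothetical and not part of Terria or CKAN.

```python
from urllib.parse import urlsplit, parse_qs

# The proxied package_search URL quoted in footnote [1].
PROXIED_URL = (
    "https://nationalmap.gov.au/proxy/_1d/"
    "http://catalogue.beta.data.wa.gov.au/api/3/action/package_search"
    "?rows=100000&sort=metadata_created%20asc&start=0"
    "&q=data_homepage%3A*_Public_Services*&fq=res_format%3AWMS"
)


def package_search_params(url):
    """Return the decoded query parameters of a package_search URL.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = package_search_params(PROXIED_URL)
for name, value in params.items():
    print(f"{name} = {value}")
```

Decoded this way, `q` restricts datasets by their `data_homepage` field and `fq` restricts them to those with at least one resource whose format is WMS; both operate on the dataset as a whole.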

Ping @vduong2

kring commented 5 years ago

@keithmoss the query is fairly configurable. Here's the list of properties that can be configured: https://docs.terria.io/guide/connecting-to-data/catalog-type-details/ckan/

And here is how the WA group is configured: https://github.com/TerriaJS/NationalMap-Catalog/blob/master/datasources/includes/WA.ejs

AnaBelgun commented 5 years ago

To investigate: there’s an ‘access_level’ property on our CKAN resources that can be used to filter to only show truly public data, but everything on https://docs.terria.io/guide/connecting-to-data/catalog-type-details/ckan/ indicates that Terria operates at the package/dataset level, not the resource level. Is that accurate?

steve9164 commented 5 years ago

@keithmoss We're using CKAN's package_search, which allows querying, filtering and sorting datasets on dataset attributes. From there we add each dataset and its resources if they match the resource format settings you configured (includeWms, wmsResourceFormat, etc.)
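As a rough illustration of that second step, the sketch below keeps only the resources of a dataset whose format matches a configured pattern. It is not Terria's code: the function name is hypothetical, the sample dataset is made up, and treating the format setting as a case-insensitive regular expression is an assumption made for this example.

```python
import re


def matching_resources(dataset, include_wms=True, wms_resource_format=r"^wms$"):
    """Return the resources of a CKAN dataset whose format matches the pattern.

    Hypothetical helper; parameter names loosely mirror the includeWms and
    wmsResourceFormat settings mentioned above, but the matching rule
    (case-insensitive regex on the resource's "format" field) is an assumption.
    """
    if not include_wms:
        return []
    pattern = re.compile(wms_resource_format, re.IGNORECASE)
    return [
        resource
        for resource in dataset.get("resources", [])
        if pattern.search(resource.get("format") or "")
    ]


# Made-up dataset shaped like one entry of a package_search result.
sample_dataset = {
    "name": "example-dataset",
    "resources": [
        {"name": "Service A", "format": "WMS"},
        {"name": "Service B", "format": "wms"},
        {"name": "Download", "format": "CSV"},
    ],
}

selected = matching_resources(sample_dataset)
print([resource["name"] for resource in selected])
```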

keithamoss commented 5 years ago

@steve9164 Am I right in saying there's nothing in that mix to let us further filter the resources that go into NatMap? That is, once NatMap gets a package in the package_search response it assumes that all of its resources are going to be accessible to the user and should be added?

e.g. https://catalogue.data.wa.gov.au/dataset/bush-forever-areas-2000-dop-071 has 7 resources

So if we could (in theory) further filter the package_search response down to the 2 public resources that are actually relevant to NatMap, by looking for resources[*]["access_level"] == "open", we'd be able to avoid presenting the user with what look like duplicates that ask them to log in.
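A minimal sketch of that idea, applied client-side to a package from a package_search response. The function name and the sample package are hypothetical (the real dataset's resources are not reproduced here); the only thing taken from this thread is the `access_level` key and its `"open"` value.

```python
def open_resources(package):
    """Keep only the resources whose access_level is "open".

    Hypothetical helper; resources without an access_level key are dropped,
    which is an assumption about the desired behaviour.
    """
    return [
        resource
        for resource in package.get("resources", [])
        if resource.get("access_level") == "open"
    ]


# Made-up package: the same service listed once as public and once behind a login.
sample_package = {
    "name": "example-package",
    "resources": [
        {"name": "WMS (public)", "format": "WMS", "access_level": "open"},
        {"name": "WMS (login required)", "format": "WMS", "access_level": "restricted"},
        {"name": "WFS (public)", "format": "WFS", "access_level": "open"},
        {"name": "Legacy link", "format": "WMS"},
    ],
}

public = open_resources(sample_package)
print([resource["name"] for resource in public])
```

With a filter like this, only the two public entries in the sample would be listed, and the login-gated look-alike would be skipped.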

steve9164 commented 5 years ago

@keithamoss That is correct. Terria does a bit of resource filtering to try to list all and only the resources it can display, but it doesn't have a way to filter by "access_level".

AnaBelgun commented 5 years ago

It might be possible with a q= or fq= parameter, but we're not sure. @keithamoss, if you can help us figure it out, it's easy to make Terria do it.