bcgov / bcdata

An R package for searching & retrieving data from the B.C. Data Catalogue
https://bcgov.github.io/bcdata
Apache License 2.0

bcdc_get_data() & .zip resources? #84

Closed: stephhazlitt closed this issue 5 years ago

stephhazlitt commented 5 years ago

Do we want to add functionality for .zip files? Or have a more informative error message?

library(bcdata)
#> 
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#> 
#>     filter

#community-health-service-areas-chsa
bcdc_get_record("68f2f577-28a7-46b4-bca9-7e9770f2f357")
#> B.C. Data Catalogue Record:
#>     Community Health Service Areas - CHSA 
#> 
#> Name: community-health-service-areas-chsa (ID: 68f2f577-28a7-46b4-bca9-7e9770f2f357 )
#> Permalink: https://catalogue.data.gov.bc.ca/dataset/68f2f577-28a7-46b4-bca9-7e9770f2f357
#> Sector: Health and Safety
#> Licence: Open Government Licence - British Columbia
#> Type: Geographic
#> Last Updated: 2019-04-01 
#> 
#> Description:
#>     Community Health Service Area (CHSA) boundaries; 2018 boundary configuration. On
#>     April 1, 2019, the Ministry of Health released a new geography classification that
#>     introduces community-level geographies nested within the Local Health Areas. The
#>     CHSAs are a mutually exclusive and exhaustive classification of the land area in BC.
#>     CHSAs are contiguous (land area is geographically adjacent) and fit within the
#>     existing geographical hierarchy (cannot violate higher-level geography boundaries
#>     such as the Local Health Area). 
#> 
#> Resources: ( 1 )
#> 1) Community Health Service Areas (zipped shp)
#>      format: zip 
#>      url: https://catalogue.data.gov.bc.ca/dataset/68f2f577-28a7-46b4-bca9-7e9770f2f357/resource/f89f99b0-ca68-41e2-afc4-63fdc0edb666/download/chsa_2018.zip 
#>      resource: f89f99b0-ca68-41e2-afc4-63fdc0edb666 
#>      available in R via bcdata:  FALSE

bcdc_get_data("68f2f577-28a7-46b4-bca9-7e9770f2f357")
#> Error: There are no resources that bcdata can download from this record

Created on 2019-06-11 by the reprex package (v0.3.0)
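
One way to make the error more informative would be to list the resources and formats that were found in the record. The following is a minimal sketch only: `no_downloadable_resources_msg()` is a hypothetical helper, and the `resources` data frame with `name` and `format` columns is an assumed shape, not bcdata's actual internal representation.

```r
# Hypothetical helper, not part of bcdata: build an error message that names
# the resource formats found in a record, so users can see why nothing loaded.
no_downloadable_resources_msg <- function(resources) {
  # `resources` is assumed to be a data frame with `name` and `format` columns
  found <- paste0(
    "  ", seq_len(nrow(resources)), ") ", resources$name,
    " (format: ", resources$format, ")",
    collapse = "\n"
  )
  paste0(
    "There are no resources that bcdata can download from this record.\n",
    "Resources found:\n", found, "\n",
    "Zipped resources are not read directly; ",
    "consider downloading from the resource url and unzipping manually."
  )
}

# Example input mirroring the single zipped resource in the record above
resources <- data.frame(
  name = "Community Health Service Areas (zipped shp)",
  format = "zip",
  stringsAsFactors = FALSE
)
msg <- no_downloadable_resources_msg(resources)
cat(msg, "\n")
```

The message could then be passed to `stop(msg, call. = FALSE)` wherever the current error is raised.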

stephhazlitt commented 5 years ago

OK, I see we don't 🤦‍♀. Great print method. There are a number of .zip resources, though. Is this worth leaving open as an enhancement, @boshek @ateucher?

library(bcdata)
#> 
#> Attaching package: 'bcdata'
#> The following object is masked from 'package:stats':
#> 
#>     filter
bcdc_get_record("north-cowichan-parks")
#> Warning: It is advised to use the permanent id ('c9f0be75-81d1-4a8b-b463-09afe46e03b2') rather than the name of the record ('north-cowichan-parks') to guard against future name changes.
#> B.C. Data Catalogue Record:
#>     North Cowichan Parks 
#> 
#> Name: north-cowichan-parks (ID: c9f0be75-81d1-4a8b-b463-09afe46e03b2 )
#> Permalink: https://catalogue.data.gov.bc.ca/dataset/c9f0be75-81d1-4a8b-b463-09afe46e03b2
#> Sector: Service
#> Licence: Open Government Licence – Municipality of North Cowichan
#> Type: Dataset
#> Last Updated: 2018-11-26 
#> 
#> Description:
#>     Parks contains Municipal forestry recreation areas, North Cowichan parks, water
#>     access points, and Non-North Cowichan recreation within municipal boundary. A full
#>     list of areas and facilities can be found at the Parks and Recreation Department on
#>     the North Cowichan website
#>     (https://www.northcowichan.ca/EN/main/departments/parks-and-recreation.html).
#> 
#>     This data may only be used under the terms of the [Open Government License -
#>     Municipality of North Cowichan](http://www.northcowichan.ca/ogl).  You are
#>     encouraged to contact the data custodian if you have any questions regarding fitness
#>     for use.
#> 
#>     Additional datasets, including this one, are also available via the
#>     [North Cowichan Open Data
#>     Package](https://catalogue.data.gov.bc.ca/dataset/eee9e339-97ac-4550-ba96-0dc8fd5117b9).
#> 
#>     Data is refreshed daily Monday to Friday. 
#> 
#> Resources: ( 6 )
#> 1) Recreation KMZ
#>      format: kmz 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation.kmz 
#>      resource: 69c98803-1be2-4bb9-9aba-a13564b9177a 
#>      available in R via bcdata:  FALSE 
#> 
#> 2) Recreation CSV
#>      format: zip 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation_CSV.zip 
#>      resource: 7eb9587f-3fa5-4098-a80a-9818cc180e10 
#>      available in R via bcdata:  FALSE 
#> 
#> 3) Recreation DWG
#>      format: zip 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation_DWG.zip 
#>      resource: 9ea704bf-cd71-4a19-a542-9bbdf3f1cf13 
#>      available in R via bcdata:  FALSE 
#> 
#> 4) Recreation FGDB
#>      format: zip 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation_FGDB.zip 
#>      resource: 20447c0c-2f76-4b01-9907-6c9c0cd05e5b 
#>      available in R via bcdata:  FALSE 
#> 
#> 5) Recreation GeoPackage
#>      format: zip 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation_GEO.zip 
#>      resource: d6e6bb6e-be3b-44d4-9d2b-8b9bfe4b52dc 
#>      available in R via bcdata:  FALSE 
#> 
#> 6) Recreation SHP
#>      format: zip 
#>      url: https://s3-us-west-2.amazonaws.com/openfiles.northcowichan.ca/GIS/Parks/Recreation_SHP.zip 
#>      resource: 706281a2-ea3e-49ff-8304-b49c771b3914 
#>      available in R via bcdata:  FALSE

Created on 2019-06-11 by the reprex package (v0.3.0)

stephhazlitt commented 5 years ago

I was thinking maybe R.utils::gunzip() might be useful.
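
A note on tool choice: `R.utils::gunzip()` decompresses gzip files (`.gz`), whereas `.zip` archives can be listed and extracted with `utils::unzip()`, which ships with R. A minimal sketch, assuming a hypothetical helper `extract_zip()` and an archive that has already been downloaded to `zip_path`:

```r
# Hypothetical helper, not part of bcdata: extract a .zip archive and return
# the paths of the extracted files.
extract_zip <- function(zip_path, exdir = tempfile("unzipped_")) {
  if (!file.exists(zip_path)) {
    stop("No file found at: ", zip_path, call. = FALSE)
  }
  # list = TRUE returns a data frame of archive contents without extracting
  listing <- utils::unzip(zip_path, list = TRUE)
  # exdir is created by unzip() if it does not already exist
  utils::unzip(zip_path, exdir = exdir)
  file.path(exdir, listing$Name)
}
```

In use, a resource url from the record could be fetched with `download.file(url, zip_path, mode = "wb")` and then passed to `extract_zip(zip_path)`.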

ateucher commented 5 years ago

I think it's definitely worth keeping as an enhancement. We should support some common cases where we can reliably detect the file format (especially .shp, as they will always be zipped due to being made up of multiple files). But there will definitely be cases where it will fail (e.g., strange file types we don't support, multiple files zipped up in one .zip file, etc.)
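
The detection idea above could be sketched roughly as follows. This is an illustration only: `guess_zip_format()` is a hypothetical helper, and the `supported` vector is an assumed list of extensions, not the set of formats bcdata actually reads. It takes the file names inside an archive (for example from `utils::unzip(path, list = TRUE)$Name`) and fails in the two cases mentioned: no supported file type, or more than one candidate file.

```r
# Hypothetical helper, not part of bcdata: guess a readable format from the
# file names inside a .zip archive.
guess_zip_format <- function(files, supported = c("shp", "csv", "gpkg", "geojson", "kml")) {
  ext <- tolower(tools::file_ext(files))
  # Shapefile sidecar files (.dbf, .shx, .prj, ...) are not in `supported`,
  # so a zipped shapefile yields exactly one hit: "shp"
  hits <- ext[ext %in% supported]
  if (length(hits) == 0) {
    stop("No supported file type found in archive", call. = FALSE)
  }
  if (length(hits) > 1) {
    stop("Multiple candidate files found in archive: ",
         paste(hits, collapse = ", "), call. = FALSE)
  }
  hits
}

# A zipped shapefile: several files, but only one of them is a data file
guess_zip_format(c("chsa_2018.shp", "chsa_2018.dbf", "chsa_2018.shx", "chsa_2018.prj"))
#> [1] "shp"
```

Failing loudly on ambiguous archives, rather than picking the first match, keeps the behaviour predictable for archives that bundle several datasets.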

stephhazlitt commented 5 years ago

I was/am surprised by how many zipped files I encountered poking around for examples for the vignette.

stephhazlitt commented 5 years ago

And in the meantime, we should probably add more documentation noting that .zip resources are not supported. I'll add it to the README and the Get started vignette.