VLucet / rgovcan

Easy access to the Canadian Open Government Portal
https://vlucet.github.io/rgovcan/

New govcan_dl_resources() methods #10

Closed KevCaz closed 3 years ago

KevCaz commented 3 years ago

Hi @VLucet ,

I took some time to rewrite `govcan_dl_resources()`. In a nutshell:

Note that documents such as html or wms are not downloaded (and for the moment I honestly think that is for the best). Also, because of the difficulties with "session" in `ckan_fetch()`, I think it makes more sense to just store files locally, so I always use `store = "disk"`, but this can always be changed in a future release.
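A minimal sketch of the skip-or-store-on-disk behaviour described above, assuming base R only. The helper `dl_one_resource()`, the `unsupported` vector, and the file-naming rule (`basename(url)`) are hypothetical illustrations, not the code in this PR; the status strings simply mirror the messages in the example output.

```r
# Hypothetical sketch: NOT the actual rgovcan implementation.
# Formats treated as "not supported" here are an assumption for illustration.
unsupported <- c("html", "wms", "esri rest")

dl_one_resource <- function(url, fmt, path,
                            downloader = utils::download.file) {
  # Documents such as html or wms are never downloaded.
  if (tolower(fmt) %in% unsupported) {
    return(list(path = NA_character_, status = "skipped (not supported)"))
  }
  # FTP links are skipped as well.
  if (grepl("^ftp://", url)) {
    return(list(path = NA_character_,
                status = "skipped (ftp not supported yet)"))
  }
  # Files are always stored on disk (the equivalent of store = "disk").
  dir.create(path, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(path, basename(url))  # assumed naming rule
  # Do not download the same file twice.
  if (file.exists(dest)) {
    return(list(path = dest, status = "skipped (already downloaded)"))
  }
  downloader(url, dest, mode = "wb", quiet = TRUE)
  list(path = dest, status = "downloaded")
}
```

The `downloader` argument is only there so the logic can be exercised without network access; by default it falls back to `utils::download.file()`.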

Let me know what you think, but I think that with a little more work on this you will be close to a nice first release on CRAN. I can (of course) give you a hand with the docs and with adding more tests.

Example with landuse as key word.

```R
R> govcan_dl_resources(govcan_search("landuse"), path = "tmp")
ℹ Searching the Open Portal for records matching: landuse
ℹ CKAN query: 79 records found for keywords: landuse
ℹ 79 matching records were found, 10 records were returned
Searching for dataset with id: 85b6ef22-d013-52e4-87fd-26bb57899499
ℹ Record found: "Ottawa and Toronto"
ℹ Download the English JPG through HTTP (jpg) ⚠ skipped (already downloaded).
ℹ Download the English PDF through HTTP (pdf) ⚠ skipped (already downloaded).
ℹ Download the French JPG through HTTP (jpg) ⚠ skipped (already downloaded).
ℹ Download the French PDF through HTTP (pdf) ⚠ skipped (already downloaded).
Searching for dataset with id: 4934cd4e-088f-51ce-84bc-0dba8551248a
ℹ Record found: "Quebec City and Montreal"
ℹ Download the English JPG through HTTP (jpg) ✔
ℹ Download the English PDF through HTTP (pdf) ✔
ℹ Download the French JPG through HTTP (jpg) ⚠ skipped (already downloaded).
ℹ Download the French PDF through HTTP (pdf) ⚠ skipped (already downloaded).
Searching for dataset with id: 43675fb6-6510-4d90-80d0-71f7e96b9604
ℹ Record found: "Annual Decay Rates - Prince Edward Island"
ℹ Annual Decay Rates - Prince Edward Island (csv) ⚠ skipped (not supported).
ℹ Annual Decay Rates - Prince Edward Island - Data Dictionary (csv) ⚠ skipped (not supported).
Searching for dataset with id: 60260b59-b81d-47b0-bb80-5ca9d6b5131f
ℹ Record found: "Land-use Framework Planning Regions"
ℹ Land-use Framework Planning Regions (esri rest) ⚠ skipped (not supported).
ℹ Land-use Framework Planning Regions (esri rest) ⚠ skipped (not supported).
ℹ Alberta Geoportal (html) ⚠ skipped (not supported).
Searching for dataset with id: 2012a482-fd0d-47c3-ba33-35bbb33201dc
ℹ Record found: "2M Base Map plus Land-use Framework Planning Regions, Treaty Boundary - Provincial Base Map Series"
ℹ Alberta Geoportal (html) ⚠ skipped (not supported).
ℹ 2MLUFRegTreatyBdy.zip (other) ✔
Searching for dataset with id: cf7f0363-b899-4d8d-a973-f5b66f4e1fe8
ℹ Record found: "2M Base Map plus Land-use Framework Planning Regions, Municipalities, Green/White Area - Provincial Base Map Series"
ℹ Alberta Geoportal (html) ⚠ skipped (not supported).
ℹ 2MLUFRegMunicGreenArea.zip (other) ✔
Searching for dataset with id: 9d672147-0584-43b1-b30d-f60db1762a67
ℹ Record found: "750K Base Map plus Land-use Framework Regions / Green and White Areas - Provincial Base Map Series"
ℹ Alberta Geoportal (html) ⚠ skipped (not supported).
ℹ 750kLUFRegGreenArea.zip (other) ✔
Searching for dataset with id: 39c973f4-808f-4fa5-b555-4a97ce039050
ℹ Record found: "2M Base Map plus Land-use Framework Planning Regions, Green/White - Provincial Base Map Series"
ℹ Alberta Geoportal (html) ⚠ skipped (not supported).
ℹ 2MLUFRegGreenArea.zip (other) ✔
Searching for dataset with id: b7ca71fa-6265-46e7-a73c-344ded9212b0
ℹ Record found: "Legal Planning Objectives - Current - Point"
ℹ KML Network Link (kml) ✔
ℹ Legal Planning Objectives - Current - Point (wms) ⚠ skipped (not supported).
ℹ Legal Planning Objectives - Current - Point (wms) ⚠ skipped (not supported).
ℹ Data Dictionaries for Strategic Land and Resource Plans (other) ⚠ skipped (ftp not supported yet).
ℹ BC Geographic Warehouse Custom Download (other) ⚠ skipped (not supported).
ℹ British Columbia Geoportal (html) ⚠ skipped (not supported).
Searching for dataset with id: 5d859a89-f173-4006-82f9-16254de2c1fc
ℹ Record found: "Non Legal Planning Features - Current - Polygon"
ℹ KML Network Link (kml) ✔
ℹ Non Legal Planning Features - Current - Polygon (wms) ⚠ skipped (not supported).
ℹ Non Legal Planning Features - Current - Polygon (wms) ⚠ skipped (not supported).
ℹ Data Dictionaries for Strategic Land and Resource Plans (other) ⚠ skipped (ftp not supported yet).
ℹ BC Geographic Warehouse Custom Download (other) ⚠ skipped (not supported).
ℹ British Columbia Geoportal (html) ⚠ skipped (not supported).
# A tibble: 33 x 7
   id             package_id         url                        path   fmt   store data
 1 11fb314a-8d59… 85b6ef22-d013-52e… http://ftp.geogratis.gc.c… tmp/1… jpg   disk  NA
 2 5de82de5-6490… 85b6ef22-d013-52e… http://ftp.geogratis.gc.c… tmp/1… pdf   disk  NA
 3 e92d5f8f-b7bc… 85b6ef22-d013-52e… http://ftp.geogratis.gc.c… tmp/1… jpg   disk  NA
 4 40dbfff0-ca19… 85b6ef22-d013-52e… http://ftp.geogratis.gc.c… tmp/1… pdf   disk  NA
 5 a24951eb-bb56… 4934cd4e-088f-51c… http://ftp.geogratis.gc.c… tmp/1… jpg   disk  NA
 6 7ea56d7f-4996… 4934cd4e-088f-51c… http://ftp.geogratis.gc.c… tmp/1… pdf   disk  NA
 7 4f6cfb27-a5ca… 4934cd4e-088f-51c… http://ftp.geogratis.gc.c… tmp/1… jpg   disk  NA
 8 abb1013b-ee08… 4934cd4e-088f-51c… http://ftp.geogratis.gc.c… tmp/1… pdf   disk  NA
 9 0c1cc3d1-29c8… 43675fb6-6510-4d9… https://124gc.sharepoint.… NA     csv   NA    NA
10 689c67bf-e3dc… 43675fb6-6510-4d9… https://124gc.sharepoint.… NA     csv   NA    NA
# … with 23 more rows
```

KevCaz commented 3 years ago

There is an issue on Windows with a specific test. I'm investigating (kind of harder without a Windows machine, though).

KevCaz commented 3 years ago

I had a lot of trouble with Windows; it turns out part of the problem was with the virtual environment. See https://github.com/actions/virtual-environments/issues/712: the second solution solves this!

campersau commented 3 years ago

@KevCaz To avoid all these `set-env` warnings, you can use the new syntax: `echo "TEMP=$env:USERPROFILE\AppData\Local\Temp" >> $env:GITHUB_ENV`

KevCaz commented 3 years ago

Yep, that's what I ended up doing!

VLucet commented 3 years ago

This looks absolutely awesome. Will review this on Monday.

VLucet commented 3 years ago

Also, I agree with what you said about downloads and html files.