jacobdale211 / raster_import

Practice #1

Open jacobdale211 opened 2 years ago

jacobdale211 commented 2 years ago

Hey David,

I'm a bit confused with the logic for reading in the 'data/dat.csv' for the puffin and razorbill data. Do you think you could look at my code in 'practice' and let me know what I need to fix?

Thanks!

david-beauchesne commented 2 years ago

Hey Jacob!

The problem you are experiencing comes from the step where you download the resources from open government using the rgovcan package. You need to specify a path to a folder, not a file, because it will download all the files that are available on the resource website.

If I comment the code a bit:

if (!file.exists('data')) dir.create('data') checks whether the folder data exists in your current working directory and creates it if it does not.

id <- "f612e2b4-5c67-46dc-9a84-1154c649ab4e" is the unique identifier of a dataset available on the federal open government portal

govcan_dl_resources(id, path = 'data/dat.csv') uses the rgovcan package to download the content of the aforementioned dataset from open government and saves the files in the folder specified in the path argument. You should replace 'data/dat.csv' with 'data/'.

dat <- read.csv(path = "data/dat.csv", exdir = 'data') is meant to read the file dat.csv in the folder data/. There is no need to specify exdir (it is not an argument of read.csv()) since you provide the whole path in the first argument. Also, the first argument should be file, not path. Type ?read.csv in the RStudio console if you want to see the function's documentation.
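
With those corrections applied, the download step could look roughly like the sketch below. download_dataset() is a hypothetical helper name used only for illustration, and the sketch assumes the rgovcan package is installed and that you have internet access when you call it.

```r
# Hypothetical helper: download every resource of an open government
# dataset into a folder (not a file).
download_dataset <- function(id, folder = "data") {
  # Create the destination folder if it does not exist yet
  if (!dir.exists(folder)) dir.create(folder, recursive = TRUE)

  # `path` must point to a folder, because several files may be downloaded
  rgovcan::govcan_dl_resources(id, path = folder)

  # Return the names of the files now in the folder, so you can see what arrived
  list.files(folder, recursive = TRUE)
}

# Usage (requires internet access), not run here:
# download_dataset("f612e2b4-5c67-46dc-9a84-1154c649ab4e")
```

Looking at the returned file names is a quick way to find out which file you actually need to import afterwards.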

jacobdale211 commented 2 years ago

Okay, I think I'm just a bit confused by 'data/dat.csv'. I assumed that this was the specific file in the database that I was trying to load.

Essentially, I am supposed to unzip the atlas, and search through for the correct datasets?

david-beauchesne commented 2 years ago

It depends which line of code you are referring to. The function govcan_dl_resources() downloads data from the federal open government portal. The path needed for that function is a folder indicating where the data should be downloaded to.

The function read.csv() imports a csv in R, and that one requires you to indicate the path to the file you wish to import, and that file needs to exist; with the code you sent me, it does not exist.
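
Here is a small self-contained sketch of that second point. The csv and its contents are made up for illustration (they are not the atlas data); the only things it is meant to show are that read.csv() takes the path through its file argument and that checking file.exists() first gives a clearer error message.

```r
# Write a tiny made-up csv in a temporary folder (illustration only)
folder <- file.path(tempdir(), "data")
if (!dir.exists(folder)) dir.create(folder)
csv_path <- file.path(folder, "dat.csv")
write.csv(
  data.frame(species = c("puffin", "razorbill"), count = c(3, 5)),
  csv_path,
  row.names = FALSE
)

# Check that the file exists before trying to import it
if (!file.exists(csv_path)) stop("File not found: ", csv_path)

# The first argument of read.csv() is `file`, not `path`
dat <- read.csv(file = csv_path)
str(dat)
```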

In the case of the data you are trying to download, there is indeed a .zip file that needs to be unzipped and it likely contains a shapefile. You can do this manually, or you can use the function unzip to do this. Type ?unzip for the documentation on how to use it.
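
If you go the unzip() route, a minimal sketch could look like this. unzip_downloads() is a hypothetical helper name, and it assumes the .zip file was downloaded into the folder you pass to it.

```r
# Hypothetical helper: unzip every .zip file found in a folder
unzip_downloads <- function(folder = "data") {
  zips <- list.files(folder, pattern = "\\.zip$", full.names = TRUE)
  for (z in zips) {
    # exdir is the folder the archive is extracted into
    utils::unzip(z, exdir = folder)
  }
  # List everything in the folder afterwards to see what was extracted
  list.files(folder, recursive = TRUE)
}

# Usage, once the download has been done:
# unzip_downloads("data")
```

Note that exdir is an argument of unzip(), which is where it belongs, rather than of read.csv().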

On that point, you should start becoming familiar with R documentation, as it will help you in the long run. Package documentation, at least for the most used packages, is usually very good and well designed. There is usually an example that is fully reproducible, meaning that you can usually simply paste the code of the example in your R console and it should work, so it's often a good way to explore the behaviour of a function to better understand how it works.
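
Besides ?read.csv, you can also inspect a function's arguments directly from the console. The short sketch below uses only base R and shows where the two problems in your read.csv() call came from.

```r
# Argument names of read.csv(): the path goes in `file`
read_csv_args <- names(formals(read.csv))
print(read_csv_args)

# `path` is not one of them, which is why read.csv(path = ...) fails
has_path <- "path" %in% read_csv_args
print(has_path)

# `exdir` is an argument of unzip(), not of read.csv()
unzip_has_exdir <- "exdir" %in% names(formals(unzip))
print(unzip_has_exdir)
```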

jacobdale211 commented 2 years ago

@david-beauchesne Looking at the atlas itself, I'm a bit confused as to what I should be looking for. For example, when I manually look through the unzipped atlas folder on my computer, I don't see anything regarding puffins or razorbill, or any .shp files. I think I'm mostly stuck/confused at the "accessing the data" part of the lesson.

I'm pretty sure it's a rather simple thing that I'm confused with.

david-beauchesne commented 2 years ago

@jacobdale211 the atlas is what's called a geodatabase, i.e. a sort of spatial object that can contain multiple data layers. You can look at the data layers available using sf::st_layers("output/AtlasGrid-GrilleAtlas.gdb") (assuming that's the path to the gdb file, of course). Then you can import layers individually in R using sf::st_read(), so something like this:

atlas <- sf::st_read('output/AtlasGrid-GrilleAtlas.gdb',
                     layer = 'AtlasGrid_GrilleAtlas')

This spatial object is the spatial grid that constitutes the Atlas. If you look at the documentation of the atlas you will get more information. For instance, that spatial object only contains polygons. The data on bird densities and effort are found in the DensityData-DonneesDeDensite.xlsx file that is downloaded along with the geodatabase. If you want to look at those data spatially, you need to join them to the spatial grid with a function like st_join or left_join, which is covered a bit later in the presentation.
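
To give a rough idea of what such a join looks like, here is a toy sketch using left_join. Everything in it is made up for illustration: the 2 x 2 grid, the cell_id column, and the density values are assumptions, not the real structure of the atlas or of the xlsx file (importing the xlsx file is not shown). It assumes the sf and dplyr packages are installed.

```r
library(sf)
library(dplyr)

# Toy 2 x 2 grid standing in for the atlas grid (made-up geometry)
square <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
grid <- st_sf(geometry = st_make_grid(st_sfc(square), n = c(2, 2)))
grid$cell_id <- seq_len(nrow(grid))

# Toy density table standing in for the xlsx data (made-up values)
dens <- data.frame(cell_id = c(1L, 3L), density = c(0.4, 1.2))

# Keep every grid cell and attach the density where one is available;
# cells without a match get NA
grid_density <- left_join(grid, dens, by = "cell_id")
print(grid_density)
```

With the real data you would replace cell_id with whatever identifier column the grid and the density table have in common; the atlas documentation should tell you which one that is.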