dickoa / rhdx

R package to interact with the Humanitarian Data Exchange portal - http://dickoa.gitlab.io/rhdx/

Accessing .zip file #7

Open anthony0nguyen opened 4 years ago

anthony0nguyen commented 4 years ago

Hi Ahmadou,

I'm currently trying to access Facebook's mobility data but am having trouble.

dat <- search_datasets("movement-range-maps", rows = 1) %>%
  pluck(1) %>% # select first result from search
  get_resource(1) %>%
  ???

It seems like their data is currently zipped. Is there a way to access the file? Once I have it, I know how to unzip in R and get the data (which is .txt formatted) but I'm not sure how to get the .zip file itself. Thanks!

dickoa commented 4 years ago

Hi @anthony0nguyen

I will see if we can improve the metadata and add a zipped TSV format to HDX so that the package can parse it automatically. In the meantime, you can download the file and read it with R. Here is a quick example:

library(rhdx)
library(tidyverse)

set_rhdx_config()

path <- pull_dataset("movement-range-maps") %>%
  get_resource(1) %>%
  download_resource(folder = "~/Downloads/")

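# list = TRUE only lists the archive contents; nothing is extracted yet
files <- unzip(path, list = TRUE)
files
##                            Name    Length                Date
## 1                    README.txt       961 2020-07-31 13:30:00
## 2 movement-range-2020-07-30.txt 227579114 2020-07-31 13:36:00

# extract the data file, then read it as tab-separated values
unzip(path, files = files$Name[2], exdir = "~/Downloads")
dat <- read_tsv(file.path("~/Downloads", files$Name[2]))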
glimpse(dat)
## Rows: 2,605,478
## Columns: 9
## $ ds                                         <date> 2020-03-01…
## $ country                                    <chr> "AGO", "AGO…
## $ polygon_source                             <chr> "GADM", "GA…
## $ polygon_id                                 <chr> "AGO.10.10_…
## $ polygon_name                               <chr> "Lubango", …
## $ all_day_bing_tiles_visited_relative_change <dbl> -0.02992, 0…
## $ all_day_ratio_single_tile_users            <dbl> 0.18751, 0.…
## $ baseline_name                              <chr> "full_febru…
## $ baseline_type                              <chr> "DAY_OF_WEE…
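
The example above picks the data file by position (`files$Name[2]`), which could break if the archive layout changes. As a rough sketch only, the file could instead be selected by name and read straight from the zip with base R. The helper names `pick_data_file()` and `read_zipped_tsv()` are hypothetical (they are not part of rhdx), and the `movement-range-<date>.txt` naming pattern is an assumption based on the listing shown above.

```r
# Hypothetical helper (not part of rhdx): choose the data file from an
# unzip(..., list = TRUE) listing by name instead of by position.
# Assumes the data file is named like "movement-range-<date>.txt".
pick_data_file <- function(listing, pattern = "^movement-range-.*\\.txt$") {
  hits <- listing$Name[grepl(pattern, listing$Name)]
  if (length(hits) == 0) {
    stop("no file matching '", pattern, "' found in the archive")
  }
  # ISO-style dates sort lexically, so the last name is the most recent file
  utils::tail(sort(hits), 1)
}

# Hypothetical helper: read the selected file directly from the zip,
# without extracting it to disk first.
read_zipped_tsv <- function(zip_path, pattern = "^movement-range-.*\\.txt$") {
  zip_path <- path.expand(zip_path)
  listing <- utils::unzip(zip_path, list = TRUE)
  name <- pick_data_file(listing, pattern)
  # unz() opens a read connection to a single file inside the archive
  utils::read.delim(unz(zip_path, name), stringsAsFactors = FALSE)
}

# Small demonstration on a listing shaped like the one shown above
listing <- data.frame(
  Name = c("README.txt", "movement-range-2020-07-30.txt"),
  Length = c(961, 227579114),
  stringsAsFactors = FALSE
)
data_file <- pick_data_file(listing)
```

With the `path` returned by `download_resource()`, this would be called as `dat <- read_zipped_tsv(path)`. Note that `read.delim()` returns a plain data frame and does not parse the `ds` column as a date the way `read_tsv()` does.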