tidyverse / googledrive

Google Drive R API
https://googledrive.tidyverse.org/

Provide workflows for the "missing functions" #123

Open jennybc opened 7 years ago

jennybc commented 7 years ago

There are certain things we don't do right now. And might not ever. These are things that Drive does not directly support but that are possible by composing several operations. We should reveal the workflow, using the functions googledrive has, in an article:

Update: many of these are less interesting now that drive_ls() supports recursion. But will leave this open as a place to collect workflows to incorporate into the website.

jennybc commented 7 years ago

Fodder for a permissions workflow.

This snippet produces a list with one tibble per file, with one row per permission. The variables are role, type, emailAddress, displayName, domain, and id.

It seems like a good start for a workflow (and ... one day function?) that helps people delve more deeply into permissions.

library(googledrive)  # also provides the %>% pipe used below

drive_auth("jenny-at-rstudio-noncaching-token.rds")
x <- drive_find(corpus = "domain", n_max = 25)
x <- x %>% drive_reveal("permissions")
make_permissions_tibble <- function(pr) {
  if (is.null(pr)) return(NULL)
  perms <- pr$permissions
  tibble::tibble(
    role = purrr::map_chr(perms, "role"),
    type = purrr::map_chr(perms, "type"),
    emailAddress = purrr::map_chr(perms, "emailAddress", .default = NA_character_),
    displayName = purrr::map_chr(perms, "displayName", .default = NA_character_),
    domain = purrr::map_chr(perms, "domain", .default = NA_character_),
    id = purrr::map_chr(perms, "id", .default = NA_character_)
  )
}
purrr::map(x$permissions_resource, make_permissions_tibble)

When I do this with my RStudio account, I can see good examples of files where I'm not allowed to get permissions (the NULL case above) and others with a nice mix of user, group, and domain grants.

MilesMcBain commented 6 years ago

I have a personal implementation of drive_download_dir() that would address the first bullet point. The main issue is it uses unchecked recursion which could theoretically cause stack overflow for incredibly nested folder structures. :weary:

https://github.com/MilesMcBain/mmmisc/blob/master/R/uitls.R#L210

jennybc commented 6 years ago

googledrive also gained the ability to recursively list a folder via drive_ls() since we last wrote anything here. That is helpful, but doesn't completely get any of these jobs done, of course.

stapial commented 5 years ago

@MilesMcBain I have been looking for a way to "drive_download" an entire Google Drive folder and haven't been able to. The link you posted looks like it would solve this, but it is broken. I was able to identify the files using drive_ls, but I'm not sure how to download them after identifying them. Any leads @jennybc?

MilesMcBain commented 5 years ago

Quite strange re the link. Try this one: https://github.com/MilesMcBain/mmmisc/blob/master/R/utils.R

stapial commented 5 years ago

that one worked @MilesMcBain, thank you so much! I will look into it after new year.

Happy holidays!

benmarwick commented 5 years ago

Here's a pattern for downloading multiple files from one folder on Google Drive that might be useful to include in https://googledrive.tidyverse.org/articles/articles/multiple-files.html

library(googledrive)
library(purrr)
library(here)  # needed for here() below

# here is the URL (or ID) of our g-drive data folder that contains all the files we want to download
data_folder_on_googl_drv_url <- "xxxx"

# get the IDs for each file in our g-drive folder
data_file_ids_on_googl_drv <- 
  drive_ls(as_id(data_folder_on_googl_drv_url))

# download them to our local folder
pwalk(data_file_ids_on_googl_drv,
    ~drive_download(as_id(..2), # ..2 refers to column 2 for the ID
                    # puts the files in /data/raw-data using the same file names
                    # that we see on g-drive,  ..1 refers to column 1 of our dribble
                    # where the file name is stored
                    path = here("data", "raw-data", ..1), 
                    overwrite = TRUE))

What do you think?

There must be another way to iterate using id and name instead of the more cryptic ..1 and ..2
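
One possible alternative, as a sketch only: pwalk() passes the columns of a data frame to the function by name, so a function whose arguments are called name and id (the first two columns of a dribble) avoids ..1 and ..2. The helper download_one() and its dest_dir argument are hypothetical names introduced here, and `files` is assumed to be the dribble returned by drive_ls().

```r
# Hypothetical helper: the argument names match the dribble columns `name`
# and `id`; `...` absorbs the remaining column (`drive_resource`).
download_one <- function(name, id, ..., dest_dir = ".") {
  googledrive::drive_download(
    googledrive::as_id(id),
    path = file.path(dest_dir, name),
    overwrite = TRUE
  )
}

# Usage (needs network access and, for private folders, auth):
# files <- googledrive::drive_ls(googledrive::as_id("xxxx"))
# purrr::pwalk(files, function(name, id, ...) {
#   download_one(name, id, dest_dir = file.path("data", "raw-data"))
# })
```

The destination directory is assumed to exist already.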

jennybc commented 5 years ago

I just did it this way:

library(googledrive)
library(tidyverse)

url <- "YOUR_FOLDER_URL_GOES_HERE"
x <- url %>% 
  as_id() %>% 
  drive_get()
x

ids <- x %>% 
  drive_ls()

walk2(ids$id, ids$name, ~ drive_download(as_id(.x), .y))

jennybc commented 4 years ago

Another snippet for looking at who has what role on a set of files.

library(tidyverse)
library(googledrive)
library(googlesheets4)
#> 
#> Attaching package: 'googlesheets4'
#> The following objects are masked from 'package:googledrive':
#> 
#>     request_generate, request_make

googlesheets4:::sheets_auth_docs()
#> [1] "googlesheets4-docs@gargle-169921.iam.gserviceaccount.com"
#> Logged in as:
#>   *  displayName: googlesheets4-docs@gargle-169921.iam.gserviceaccount.com
#>   * emailAddress: googlesheets4-docs@gargle-169921.iam.gserviceaccount.com

x <- sheets_examples() %>%
  drive_reveal("permissions")

x %>% 
  select(name, file_id = id, permissions_resource) %>% 
  hoist(permissions_resource, permissions = "permissions") %>% 
  select(-permissions_resource) %>% 
  unnest_longer(permissions) %>% 
  unnest_wider(permissions)
#> # A tibble: 18 x 11
#>    name  file_id kind  id    type  emailAddress role  displayName photoLink
#>    <chr> <chr>   <chr> <chr> <chr> <chr>        <chr> <chr>       <chr>    
#>  1 gapm… 1U6Cf_… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#>  2 gapm… 1U6Cf_… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#>  3 gapm… 1U6Cf_… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#>  4 mini… 1k94ZV… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#>  5 mini… 1k94ZV… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#>  6 mini… 1k94ZV… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#>  7 form… 1wPLrW… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#>  8 form… 1wPLrW… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#>  9 form… 1wPLrW… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#> 10 cell… 1peJXE… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#> 11 cell… 1peJXE… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#> 12 cell… 1peJXE… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#> 13 deat… 1tuYKz… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#> 14 deat… 1tuYKz… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#> 15 deat… 1tuYKz… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#> 16 chic… 1ct9t1… driv… 0868… user  jenny@rstud… writ… Jenny Bryan https://…
#> 17 chic… 1ct9t1… driv… anyo… anyo… <NA>         read… <NA>        <NA>     
#> 18 chic… 1ct9t1… driv… 0581… user  googlesheet… owner googleshee… <NA>     
#> # … with 2 more variables: deleted <lgl>, allowFileDiscovery <lgl>

Created on 2019-10-10 by the reprex package (v0.3.0.9000)

jennybc commented 4 years ago

Here's how I downloaded a folder's worth of files for an R advent calendar mentioned in this tweet:

https://kiirstio.wixsite.com/kowen/post/the-25-days-of-christmas-an-r-advent-calendar

library(googledrive)
library(fs)
library(purrr)

# this folder is world-readable so you don't have to auth (but you can)
drive_deauth()

# adjust to where YOU want this to go
local_dir <- "~/rrr/2019-advent-calendar"
dir_create(local_dir)

url <- "https://drive.google.com/drive/folders/1eTu5QFSSGUeBjYrnyUnuXqRz8OXyFLy2"
(x <- url %>% 
    as_id() %>% 
    drive_get())

(files <- drive_ls(x))

walk2(files$id, files$name, ~ drive_download(as_id(.x), path(local_dir, .y)))

# optional and has nothing to do with googledrive
usethis::create_project(local_dir)

Known deficiency: doesn't deal with eventualities, such as subdirectories.
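
A rough sketch of how subdirectories might be handled, assuming a plain folder tree: walk the tree with an explicit queue (so there is no deep recursion), recreate each Drive folder locally, and download everything else. The function name download_folder() is hypothetical. Shortcuts, duplicate file names, and export formats for Google-native files are not dealt with; drive_download() is left to its defaults for those.

```r
# Hypothetical helper, not part of googledrive.
# `folder` is a Drive folder URL or id; `local_dir` is where the files go.
download_folder <- function(folder, local_dir) {
  # queue of (Drive folder id, local directory) pairs still to visit
  queue <- list(list(id = googledrive::as_id(folder), dir = local_dir))

  while (length(queue) > 0) {
    current <- queue[[1]]
    queue <- queue[-1]
    dir.create(current$dir, recursive = TRUE, showWarnings = FALSE)

    items <- googledrive::drive_ls(current$id)
    folders <- googledrive::is_folder(items)

    for (i in seq_len(nrow(items))) {
      item_id <- googledrive::as_id(items$id[[i]])
      target <- file.path(current$dir, items$name[[i]])
      if (folders[[i]]) {
        # visit this subfolder later, mirroring its name locally
        queue[[length(queue) + 1]] <- list(id = item_id, dir = target)
      } else {
        googledrive::drive_download(item_id, path = target, overwrite = TRUE)
      }
    }
  }
  invisible(local_dir)
}

# Usage (needs network access):
# googledrive::drive_deauth()
# download_folder(url, "~/rrr/2019-advent-calendar")
```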

jennybc commented 4 years ago

Yet another rectangling example for looking at permissions.

library(tidyverse)
library(googledrive)
# hidden chunk here with auth and a spreadsheet id

ssid %>%
  drive_get() %>%
  drive_reveal("permissions") %>%
  hoist(permissions_resource, "permissions") %>%
  select(!ends_with("_resource")) %>%
  unnest_longer(permissions) %>%
  unnest_wider(permissions, names_sep = "_") %>%
  select(name, id, permissions_type, permissions_emailAddress, permissions_role,
         permissions_displayName)
#> # A tibble: 3 x 6
#>   name  id    permissions_type permissions_ema… permissions_role
#>   <chr> <chr> <chr>            <chr>            <chr>           
#> 1 whol… 1EB-… user             hadley@rstudio.… writer          
#> 2 whol… 1EB-… user             joe@rstudio.com  writer          
#> 3 whol… 1EB-… user             jenny@rstudio.c… owner           
#> # … with 1 more variable: permissions_displayName <chr>

Created on 2020-08-27 by the reprex package (v0.3.0.9001)

kendavidn commented 2 years ago

Thank you all for these wonderful workflows! Might anyone have an implementation for uploading an entire folder?
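
One possible approach, sketched under assumptions rather than taken from the package: mirror the local tree with drive_mkdir() and send each file with drive_upload(). The function name upload_folder() and its drive_parent argument are hypothetical. Hidden files are skipped (the list.files() default), and nothing here handles name clashes with existing Drive files or retries after a failed upload.

```r
# Hypothetical helper, not part of googledrive.
# `local_dir` is a local directory; `drive_parent` is an optional Drive
# folder (path, id, or dribble) in which the copy is created.
upload_folder <- function(local_dir, drive_parent = NULL) {
  # create the top-level Drive folder, named after the local directory
  top <- googledrive::drive_mkdir(
    basename(normalizePath(local_dir)),
    path = drive_parent
  )
  # queue of (local directory, Drive folder dribble) pairs still to visit
  queue <- list(list(dir = local_dir, folder = top))

  while (length(queue) > 0) {
    current <- queue[[1]]
    queue <- queue[-1]

    for (entry in list.files(current$dir, full.names = TRUE)) {
      if (dir.exists(entry)) {
        # mirror the subdirectory on Drive and visit it later
        sub <- googledrive::drive_mkdir(basename(entry), path = current$folder)
        queue[[length(queue) + 1]] <- list(dir = entry, folder = sub)
      } else {
        googledrive::drive_upload(entry, path = current$folder)
      }
    }
  }
  invisible(top)
}

# Usage (needs auth with write access):
# upload_folder("data/raw-data")
```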

h-a-graham commented 1 year ago

Just in case this is useful, I've written an alternative option for downloading the contents of a Drive folder recursively, maintaining the file structure from Google Drive: https://gist.github.com/h-a-graham/27f3fceca4616cd54809dd3c28b8689b Thanks for the great package!