Closed gorkang closed 10 months ago
Thanks for these reports @gorkang! And apologies that I did not pay enough attention to the details of dribbles when originally setting up this Google Drive support.
I am very open to you taking a look at that PR, or installing via devtools::install_github("rstudio/pins-r@fix-gdrive-dribble") and putting it through its paces.
Thanks for the prompt response @juliasilge !
A quick question. While trying to make pins work in my Google Drive, I am finding a few more unrelated bugs. Would you prefer a single issue covering the various gdrive bugs, one issue per bug, or for me to chill and stop reporting bugs? :)
OK, so, using pins-r@fix-gdrive-dribble, uploading multiple pins with the same name now works! 🥇
But trying to download those pins does not work. When using pin_download(), with or without the version parameter, R gets stuck with no feedback. If you hit Escape after a while, there is an error about pin_meta():
devtools::install_github("rstudio/pins-r@fix-gdrive-dribble")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'pins' from a github remote, the SHA1 (b1199e30) has not changed since last install.
#> Use `force = TRUE` to force installation
library(pins)
library(googledrive)
# Create board ------------------------------------------------------------
DRIBBLE = googledrive::as_dribble("https://drive.google.com/drive/u/0/folders/1MStG1e73DoRO8rxGG93uRBSQIBfUS6ai")
board_drive = board_gdrive(DRIBBLE, versioned = TRUE, cache = NULL)
# Upload ------------------------------------------------------------------
board_drive |> pins::pin_upload(paths = "~/Downloads/UPLOAD/999.zip", name = "pid_999")
#> Creating new version '20230907T060705Z-db9b5'
board_drive |> pins::pin_upload(paths = "~/Downloads/UPLOAD/999.zip", name = "pid_999")
#> Creating new version '20230907T060714Z-db9b5'
# Check -------------------------------------------------------------------
board_drive |> pin_list()
#> [1] "pid_999"
board_drive |> pin_versions("pid_999")
#> # A tibble: 2 × 3
#> version created hash
#> <chr> <dttm> <chr>
#> 1 20230907T060705Z-db9b5 2023-09-07 07:07:05 db9b5
#> 2 20230907T060714Z-db9b5 2023-09-07 07:07:14 db9b5
# Download ----------------------------------------------------------------
# pin_download hangs forever.
# board_drive |> pins::pin_download(name = "pid_999", version = "20230907T060714Z-db9b5")
# If I hit Escape after a while, this error appears:
#> Error in `pin_meta()`:
#> ! Can't find version "20230907T060141Z-db9b5"
#> Run `rlang::last_trace()` to see where the error occurred.
#> > rlang::last_trace()
#> <error/pins_pin_version_missing>
#> Error in `pin_meta()`:
#> ! Can't find version "20230907T060141Z-db9b5"
#> ---
#> Backtrace:
#> ▆
#> 1. └─pins::pin_download(board_drive, name = "pid_999", version = "20230907T060141Z-db9b5")
#> 2. ├─pins::pin_fetch(board, name, version = version, ...)
#> 3. └─pins:::pin_fetch.pins_board_gdrive(...)
#> 4. ├─pins::pin_meta(board, name, version = version)
#> 5. └─pins:::pin_meta.pins_board_gdrive(board, name, version = version)
#> Run rlang::last_trace(drop = FALSE) to see 3 hidden frames.
Created on 2023-09-07 with reprex v2.0.2
Digging a bit, it seems it fails here:
if (!pins:::gdrive_file_exists(board, metadata_key)) {
pins:::abort_pin_version_missing(version)
}
Specifically, gdrive_file_exists() creates a path
path <- fs::path(board$dribble$name, fs::path_dir(name))
that it then uses to search for the file in Google Drive:
all_names <- pins:::possibly_drive_ls(path)
But that path seems to be wrong. Below is a reprex with an alternative that works:
DRIBBLE = googledrive::as_dribble("https://drive.google.com/drive/u/0/folders/1MStG1e73DoRO8rxGG93uRBSQIBfUS6ai")
board = pins::board_gdrive(DRIBBLE, versioned = TRUE, cache = NULL)
name = "pid_999"
# Code in gdrive_file_exists()
path <- fs::path(board$dribble$name, fs::path_dir(name)); path
#> pins-testing/.
all_names <- pins:::possibly_drive_ls(path); all_names
#> NULL
# Alternative
path2 = fs::path(board$dribble$name, name); path2
#> pins-testing/pid_999
all_names <- pins:::possibly_drive_ls(path2); all_names
#> # A dribble: 2 × 3
#> name id drive_resource
#> <chr> <drv_id> <list>
#> 1 20230907T060714Z-db9b5 1aw_6DxW1LAx46Bg3-ouQYWDabNZvFf1x <named list [33]>
#> 2 20230907T060705Z-db9b5 1XiZpV-ArWh7h25cTcx4e3eCMuV9Un2-H <named list [33]>
Created on 2023-09-07 with reprex v2.0.2
I definitely appreciate your reports here! Especially as I'm not able to reproduce the problem:
library(pins)
my_dribble <- googledrive::as_dribble("https://drive.google.com/drive/u/1/folders/1GQ-JuG4pT1AK9VLz9UZptnO73H7zDaUe")
#> ! Using an auto-discovered, cached token.
#> To suppress this message, modify your code or options to clearly consent to
#> the use of a cached token.
#> See gargle's "Non-interactive auth" vignette for more details:
#> <https://gargle.r-lib.org/articles/non-interactive-auth.html>
#> ℹ The googledrive package is using a cached token for 'julia.silge@gmail.com'.
board <- board_gdrive(my_dribble)
path <- fs::path_temp("some-letters.txt")
readr::write_lines(sample(LETTERS, size = 20), path)
pin_upload(board, paths = path, name = "really-great-letters")
#> Creating new version '20230907T170517Z-b22cb'
readr::write_lines(sample(LETTERS, size = 20), path)
pin_upload(board, paths = path, name = "really-great-letters")
#> Creating new version '20230907T170525Z-2277a'
all_versions <- pin_versions(board, "really-great-letters")
pin_download(board, "really-great-letters", version = dplyr::last(all_versions$version))
#> [1] "~/Library/Caches/pins/gdrive-0481afde68245f08233b69b85e8c4d11/really-great-letters/20230907T170525Z-2277a/some-letters.txt"
Created on 2023-09-07 with reprex v2.0.2
Let me see what I can do with what you have reported.
If you are getting the abort_pin_version_missing() error, then it looks like this is what is hanging, where metadata_key is something like "really-great-letters/20230907T180012Z-06114/data.txt":
I'm pretty sure this must be working correctly for you, i.e. you get something like "pins-testing/really-great-letters/20230907T180012Z-06114" as output:
fs::path(board$dribble$name, fs::path_dir(metadata_key))
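The difference between this call and the earlier reprex (which produced pins-testing/.) comes down to what is passed in: a bare pin name versus a full metadata key. Here is a minimal sketch of that path construction, using base R's file.path() and dirname() as stand-ins for fs::path() and fs::path_dir() (an assumption that holds for these simple relative keys). The board name and keys are copied from the examples in this thread, and nothing here touches Google Drive.

```r
# Base R stand-ins for fs::path() and fs::path_dir(); assumed to behave the
# same way for these simple relative keys.
board_name <- "pins-testing"

# What the earlier reprex passed: a bare pin name, whose directory part is "."
pin_name <- "pid_999"
from_pin_name <- file.path(board_name, dirname(pin_name))

# What pin_meta() passes: a metadata key with version folder and file name
metadata_key <- "really-great-letters/20230907T180012Z-06114/data.txt"
from_metadata_key <- file.path(board_name, dirname(metadata_key))

print(from_pin_name)
print(from_metadata_key)
```

With a metadata key the directory part is the version folder, so the path construction itself looks sound; the "." only appears when a bare pin name is supplied.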
I think the problem is like what you ran into before: an operation that works on a path rather than a dribble just hangs because your Google Drive is exceptionally full.
OK @gorkang in d724f8b5127c64c8a1b426927ba41b6dde028c72 I changed yet another spot where items were looked up by path, to now use a dribble. Would you mind installing again from the same PR and trying your problematic workflow?
Hi @juliasilge . Thanks again for checking this out.
So, we move on to the next boss. This time it is pin_meta.pins_board_gdrive(). It gets stuck when trying to download the metadata file data.txt with gdrive_download(board, metadata_key).
Inside gdrive_download(), the specific call getting stuck is googledrive::drive_download(key, path).
The path is OK. The key is pid_10/20230908T071603Z-db9b5/data.txt, so I tried to build a more complete key, pins-testing/pid_10/20230908T071603Z-db9b5/data.txt, adding the pin board name and "root folder" in my Google Drive. But that did not work either.
Then I tried to create a dribble using the complete key, without luck. That gets stuck too. In the end, I reported this bug in {googledrive}: https://github.com/tidyverse/googledrive/issues/446
It seems I can create dribbles for folders such as "Level1/Level2/", but when I try "Level1/Level2/Level3", it gets stuck, even if Level3 is a file.
The only way I could manage to make it work was sticking to the "2-levels limit" for dribbles and replacing googledrive::drive_download(key, path) with:
DRIBBLE = googledrive::as_dribble(dirname(key))
ID = googledrive::drive_ls(DRIBBLE) |> dplyr::filter(name == basename(key)) |> dplyr::pull(id)
googledrive::drive_download(ID, path)
But that seems a lot just to download the metadata file... hopefully there is a way to get n-level folders/files in {googledrive}.
I am not opening a pull request for this, because it changes the gdrive_download() function:
gdrive_download <- function(board, key) {
  path <- fs::path(board$cache, key)
  if (!fs::file_exists(path)) {
    # Resolve the parent folder, then find the file id by name in its listing
    DRIBBLE = googledrive::as_dribble(dirname(key))
    ID = googledrive::drive_ls(DRIBBLE) |> dplyr::filter(name == basename(key)) |> dplyr::pull(id)
    googledrive::drive_download(ID, path)
    fs::file_chmod(path, "u=r")
  }
  path
}
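The lookup step in this workaround (list the parent folder, then pick the entry whose name matches the last component of the key) can be tried out offline. The sketch below is only an illustration: find_id_by_key() is a hypothetical helper, not part of pins or googledrive, and the data frame is a mock standing in for the result of googledrive::drive_ls(), with made-up ids.

```r
# Hypothetical helper (not part of pins or googledrive): return the id of the
# entry in `listing` whose name equals the last component of `key`.
find_id_by_key <- function(listing, key) {
  matches <- listing$id[listing$name == basename(key)]
  if (length(matches) != 1) {
    stop(
      "Expected exactly one entry named '", basename(key),
      "', found ", length(matches),
      call. = FALSE
    )
  }
  matches
}

# Mock listing standing in for what googledrive::drive_ls() would return;
# the ids are invented for illustration.
listing <- data.frame(
  name = c("data.txt", "999.zip"),
  id = c("id-data-txt", "id-999-zip"),
  stringsAsFactors = FALSE
)

key <- "pid_10/20230908T071603Z-db9b5/data.txt"
metadata_id <- find_id_by_key(listing, key)
print(metadata_id)
```

The explicit check for exactly one match is deliberate: Google Drive allows several files with the same name in one folder, so filtering by name alone could hand more than one id to the download call.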
For the sake of completeness, with the above gdrive_download(), this works:
library(pins)
library(googledrive)
tictoc::tic()
# Create board ------------------------------------------------------------
DRIBBLE = googledrive::as_dribble("https://drive.google.com/drive/u/0/folders/1MStG1e73DoRO8rxGG93uRBSQIBfUS6ai")
board = board_gdrive(DRIBBLE, versioned = TRUE, cache = NULL)
# Upload ------------------------------------------------------------------
board |> pins::pin_upload(paths = "~/Downloads/UPLOAD/999.zip", name = "pid_14")
#> Creating new version '20230908T074906Z-db9b5'
# Check -------------------------------------------------------------------
board |> pin_list()
#> [1] "pid_14" "pid_12" "pid_10" "pid_3" "pid_X" "pid_999x" "pid_999"
board |> pin_versions("pid_14")
#> # A tibble: 1 × 3
#> version created hash
#> <chr> <dttm> <chr>
#> 1 20230908T074906Z-db9b5 2023-09-08 08:49:06 db9b5
# Download ----------------------------------------------------------------
# Previously got stuck here; works with the modified gdrive_download()
board |> pins::pin_download(name = "pid_14")
#> [1] "~/.cache/pins/gdrive-b9b939f55bed2fbd2f1a1552eac6f9c0/pid_14/20230908T074906Z-db9b5/999.zip"
tictoc::toc()
#> 18.367 sec elapsed
Created on 2023-09-08 with reprex v2.0.2
Thank you again @gorkang! I updated the function for downloading in 05209834f5d47458098110805ef07eab74cfdd5c, and I checked once more for other uses of a path instead of a dribble. I think that may have been the last? 🤞
Are you able to install again from that same PR to try out the new version?
Thank you! It now seems to work fine! I tried a few things and all is well.
Wonderful! Again, I appreciate your reports of these problems specific to a "well-loved" Google Drive. 😆
Trying to upload a pin a second time or to download a pin both fail with the same error:
After digging a bit, it seems the pin_versions method tries to access board$dribble$path, but the path column does not exist in the board.
Created on 2023-09-06 with reprex v2.0.2
Session info
``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Atlantic/Canary
#>  date     2023-09-06
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  askpass       1.2.0      2023-09-03 [1] RSPM (R 4.3.0)
#>  cli           3.6.1      2023-03-23 [1] RSPM
#>  curl          5.0.2      2023-08-14 [1] RSPM (R 4.3.0)
#>  digest        0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
#>  dplyr         1.1.3      2023-09-03 [1] RSPM (R 4.3.0)
#>  ellipsis      0.3.2      2021-04-29 [1] RSPM
#>  evaluate      0.21       2023-05-05 [1] CRAN (R 4.3.0)
#>  fansi         1.0.4      2023-01-22 [1] RSPM
#>  fastmap       1.1.1      2023-02-24 [1] RSPM
#>  fs            1.6.3      2023-07-20 [1] RSPM (R 4.3.0)
#>  gargle        1.5.2      2023-07-20 [1] RSPM (R 4.3.0)
#>  generics      0.1.3      2022-07-05 [1] RSPM
#>  glue          1.6.2      2022-02-24 [1] RSPM
#>  googledrive * 2.1.1      2023-06-11 [1] CRAN (R 4.3.0)
#>  htmltools     0.5.6      2023-08-10 [1] RSPM (R 4.3.0)
#>  httr          1.4.7      2023-08-15 [1] RSPM (R 4.3.0)
#>  jsonlite      1.8.7      2023-06-29 [1] RSPM (R 4.3.0)
#>  knitr         1.43       2023-05-25 [1] RSPM (R 4.3.0)
#>  lifecycle     1.0.3      2022-10-07 [1] RSPM
#>  magrittr      2.0.3      2022-03-30 [1] RSPM
#>  mime          0.12       2021-09-28 [1] RSPM
#>  openssl       2.1.0      2023-07-15 [1] RSPM (R 4.3.0)
#>  pillar        1.9.0      2023-03-22 [1] RSPM
#>  pins        * 1.2.1.9000 2023-09-06 [1] Github (rstudio/pins@e70b02e)
#>  pkgconfig     2.0.3      2019-09-22 [1] RSPM
#>  purrr         1.0.2      2023-08-10 [1] RSPM (R 4.3.0)
#>  R.cache       0.16.0     2022-07-21 [1] RSPM
#>  R.methodsS3   1.8.2      2022-06-13 [1] RSPM
#>  R.oo          1.25.0     2022-06-12 [1] RSPM
#>  R.utils       2.12.2     2022-11-11 [1] RSPM
#>  R6            2.5.1      2021-08-19 [1] RSPM
#>  rappdirs      0.3.3      2021-01-31 [1] RSPM
#>  reprex        2.0.2      2022-08-17 [1] RSPM
#>  rlang         1.1.1      2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown     2.24       2023-08-14 [1] RSPM (R 4.3.0)
#>  rstudioapi    0.15.0     2023-07-07 [1] CRAN (R 4.3.1)
#>  sessioninfo   1.2.2      2021-12-06 [1] RSPM
#>  styler        1.10.2     2023-08-29 [1] RSPM (R 4.3.0)
#>  tibble        3.2.1      2023-03-20 [1] RSPM
#>  tidyselect    1.2.0      2022-10-10 [1] RSPM
#>  utf8          1.2.3      2023-01-31 [1] RSPM
#>  vctrs         0.6.3      2023-06-14 [1] RSPM (R 4.3.0)
#>  withr         2.5.0      2022-03-03 [1] RSPM
#>  xfun          0.40       2023-08-09 [1] RSPM (R 4.3.0)
#>  yaml          2.3.7      2023-01-23 [1] RSPM
#>
#>  [1] /home/emrys/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```
It seems pin_versions.pins_board_gdrive tries to access board$dribble$path, but path does not exist.
Created on 2023-09-06 with reprex v2.0.2
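A quick way to confirm this kind of diagnosis is to check the column names before relying on one. The sketch below uses a plain data frame as a stand-in for the dribble, with only the name and id columns that dribbles print elsewhere in this thread (drive_resource is left out for brevity, and the values are made up); it is an illustration, not the check pins performs.

```r
# Mock table standing in for board$dribble; values are invented.
mock_dribble <- data.frame(
  name = "pins-testing",
  id = "made-up-id",
  stringsAsFactors = FALSE
)

# Test for the column explicitly instead of assuming it exists
has_path <- "path" %in% names(mock_dribble)

# Accessing a column that is not there yields NULL rather than a value
missing_value <- mock_dribble$path

print(has_path)
print(is.null(missing_value))
```

Because a missing column comes back as NULL instead of raising an error at the point of access, any code that builds a path from it can fail later and further away from the real cause.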