tidyverse / googledrive

Google Drive R API
https://googledrive.tidyverse.org/

Drive_put not updating CSV in Google Drive - Instead creating duplicate csv's #411

Closed gill2bourke closed 1 year ago

gill2bourke commented 1 year ago

Hi all,

I'm fairly new to R but am having an issue updating a csv file stored in Google Drive and could really do with some help. I am learning so appreciate that I might have a few things incorrect.

I have a file in Dropbox that merges CSVs when they come into the folder and updates a merged_data.csv file.
I'm using the cronR scheduler to run the script regularly, and all of this works fine.

The merged_data.csv file should then be updated in a Google Drive folder.

When the script runs, duplicate files are created in the Google Drive folder instead of the existing file there being updated. I've tried to find a solution but haven't had any success.

My script is below; any help or advice would be brilliant. Thank you.


library(tidyverse)
library(googledrive)

# install cronR 
# install.packages("googledrive")
library(cronR)
library(googledrive)

# force use of a token associated with a specific email
drive_auth(email = "************@gmail.com",use_oob = TRUE)

# Schedule using cronR
#cron_rstudioaddin()

# set working directory
setwd(path.expand("~/Dropbox/Test Folder"))
getwd()

list_of_data <- map(list.files(pattern = "TeamName"), 
                    read_csv, col_types = cols(.default = col_character()))

data <- bind_rows(list_of_data)

write.csv(data, "merged_data.csv", row.names = F, na = "")

# set drive folder
td <- drive_get("https://drive.google.com/drive/u/1/folders/********************************")

x <- drive_upload(media = "merged_data.csv", name = "merged_data_1", type = "spreadsheet", path = as_id(td))

# update the local file
cat("end", file = "merged_data.csv", sep = "\n", append = TRUE, overwrite = TRUE)

# PUT again --> drive_put() delegates to drive_update()
file <- drive_put("merged_data.csv")

jennybc commented 1 year ago

x <- drive_upload(media = "merged_data.csv", name = "merged_data_1", type = "spreadsheet", path = as_id(td))

# ..

# PUT again --> drive_put() delegates to drive_update()
file <- drive_put("merged_data.csv")

What are you intending with these 2 calls (drive_upload(), drive_put())?

FWIW you don't need to upload the csv (with drive_upload()) and then do drive_put(). You can just do the drive_put() directly. Your drive_upload() call is going to create a new file ID every time you execute it.

You should also take a look at the arguments and make sure you're using them as you wish, in terms of which provides media vs. which provides the target filepath. drive_put() determines whether there's an existing file to update based on the filepath. The drive_upload() call and the drive_put() call are writing to different filepaths.

I imagine you're seeing lots of messages go by with more detail about what is happening.

Also, the use_oob = TRUE argument is only relevant when you first obtain a token interactively. In a script that runs non-interactively, such as via a cron job, that is presumably finding and using an existing token, you can omit that from the drive_auth() call.
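
A minimal sketch of the simplified upload step described above, assuming a token for the given email is already cached and that the Drive file should keep the name merged_data_1 inside the target folder. The helper name put_merged_csv and its arguments are hypothetical; drive_put(), drive_get(), drive_auth() and as_id() are the googledrive functions already used in this thread.

```r
# Hypothetical helper: PUT `local_csv` into `folder`. If a Drive file called
# `name` already exists there, drive_put() updates it; otherwise it uploads
# a new file.
put_merged_csv <- function(local_csv, folder, name = "merged_data_1") {
  stopifnot(file.exists(local_csv))
  googledrive::drive_put(
    media = local_csv,     # local file that supplies the content
    path  = folder,        # target Drive folder (a dribble or an id)
    name  = name,          # target file name inside that folder
    type  = "spreadsheet"  # convert the CSV to a Google Sheet, as in the original call
  )
}

# Usage in the cron script (not run here; needs a cached token and network access):
# library(googledrive)
# drive_auth(email = "************@gmail.com")  # use_oob omitted for non-interactive runs
# td <- drive_get("https://drive.google.com/drive/u/1/folders/********************************")
# put_merged_csv("merged_data.csv", as_id(td))
```

Because the folder and name are the same on every run, each run targets the same Drive filepath, which is what lets drive_put() find the existing file rather than create another one.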

gill2bourke commented 1 year ago

Hi Jenny

Thank you that was very helpful and has solved the issue.

I am having another problem now, if you could maybe assist also please?

For the files I want to merge, I want to set the working directory to the online Dropbox account, instead of the local drive.

I have authorised the dropbox using


install.packages('rdrop2')
library(rdrop2)
drop_auth()
token <- drop_auth()

Set the working directory in the online account as follows:

# set working directory - if Dropbox online is source of files
outputDir <- "Test Folder" # folder name in Dropbox

I used this to list the CSVs with the matching word in their names:


list_on_dropbox <- drop_dir(outputDir) %>% 
  filter(str_detect(name, 'TeamName'))

I know I am missing a step here. I have tried to use drop_download() to download the files locally to merge them, but it isn't working and I am getting this error: Error in drop_download: Conflict (HTTP 409)

When I tried to bind the above files, it only binds the file listing (names and metadata) instead of the files' contents.

dropbox_data <- bind_rows(list_on_dropbox)

write.csv(dropbox_data, "merged_data_dropbox.csv", row.names = F, na = "")

Any help on the missing step would be brilliant.

jennybc commented 1 year ago

I don't completely follow all of the above, especially the rdrop2 parts.

But as for the bind_rows() call, you seem to be passing a data frame that lists files on Dropbox, whereas it expects the actual data frames you want to bind. You have to download the files, read them into R data frames, and pass those to bind_rows().
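
The download, read, and bind steps might look roughly like this. It is a sketch, not tested against a real Dropbox account: it assumes the listing returned by drop_dir() has name and path_display columns and that rdrop2's drop_download() accepts local_path and overwrite arguments. The helper merge_listed_csvs and its download and dest_dir arguments are hypothetical; download is a parameter only so the read-and-bind logic can be tried with local files.

```r
# Hypothetical helper: download every file in `listing`, read each one into a
# data frame, and bind those data frames (not the listing itself).
merge_listed_csvs <- function(listing,
                              download = rdrop2::drop_download,
                              dest_dir = tempdir()) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  local_paths <- file.path(dest_dir, listing$name)

  # 1. Download each remote file to a local path.
  for (i in seq_along(local_paths)) {
    download(listing$path_display[i], local_path = local_paths[i], overwrite = TRUE)
  }

  # 2. Read every local CSV with all columns as character, as in the original script.
  frames <- lapply(
    local_paths,
    readr::read_csv,
    col_types = readr::cols(.default = readr::col_character())
  )

  # 3. Bind the data frames row-wise.
  dplyr::bind_rows(frames)
}

# Usage (not run here; needs Dropbox authorisation):
# list_on_dropbox <- drop_dir(outputDir) %>% filter(str_detect(name, "TeamName"))
# dropbox_data <- merge_listed_csvs(list_on_dropbox)
```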

I think we have answered your questions that relate to googledrive, so I'm closing this.

A good place to ask questions about general R usage is Posit Community: https://community.rstudio.com/