ben519 opened this issue 4 years ago
It looks like you are using it fine; it's just that this function is hard for me to test, since it only kicks in when intermittent errors affect your upload. It has an auto-retry method to help with larger uploads in the TB range, so it's odd you are running into issues at much smaller sizes. Perhaps you have a weak connection or a proxy or something that means you see it more often? 10 minutes to upload 2.4 MB is very long; it takes me seconds to upload similar sizes on roughly a 20 Mbit connection (which makes sense). I guess there is something special about your connection?
Thanks for the prompt reply. I figured it'd be a hard one to resolve since you can't exactly debug it, but I wanted to log the issue in case others are experiencing the same trouble.
AFAIK there's nothing special about my connection. I've experienced the same issues on many different networks. Perhaps it has something to do with the files. I'll keep tinkering with this. Thanks.
Hi Mark,
I'm having a similar issue with `gcs_upload()`.
```r
upload_try <- gcs_upload(file = parsed_download, name = "ft_bucket/trainTestUpload2.csv")

2020-03-25 10:52:35 -- File size detected as 64.6 Mb
2020-03-25 10:52:35 -- Found resumeable upload URL: https://www.googleapis.com/upload/storage/v1/b/my-bucket/o/?uploadType=resumable&name=ft_bucket%2FtrainTestUpload2.csv&predefinedAcl=private&upload_id=AEnB2UqidgMfd84OiQNPmjEaddPFJFIP_OBvMoA2-lneWdmAd4T3z9LydRDHNBuWgdUoFKCrJVepXBlnwTPwcgpDwO2sO7zEnw

Warning messages:
1: No JSON content detected
2: In doHttrRequest(req_url, shiny_access_token = shiny_access_token, :
   API checks failed, returning request without JSON parsing
```
This uploads and I can see the file via the web interface. If I download it through the web interface, I get a CSV that I can read into R.
However if I try to use gcs_get_object...
```r
parsed_download2 <- gcs_get_object("ft_bucket/trainTestUpload2.csv")

Downloaded ft_bucket%2FtrainTestUpload2.csv
Object parsed to class: raw
```
I'm unable to use gcs_parse_download() on this object.
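For what it's worth, when the object comes back as class `raw` like this, something along these lines has recovered the data for me (a minimal sketch, assuming the raw vector simply holds the CSV bytes; the object name is just the one from the example above):

```r
library(googleCloudStorageR)

# download the object; in this situation it comes back as a raw vector
raw_obj <- gcs_get_object("ft_bucket/trainTestUpload2.csv")

# convert the raw bytes back to text and read the CSV from that string
csv_text <- rawToChar(raw_obj)
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
```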
I think trying various values of the upload limit will get one past this issue: `options(googleCloudStorageR.upload_limit = 1000000000L)`
For the case where it is a large file whose upload needs to be resumed this makes sense, but my understanding is that sometimes the file is corrupted and you really just want it to not resume, but instead overwrite the upload.
Happy to learn about a better way to resolve this as well.
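In case it's useful, what I've been doing when I want to start over rather than resume is to delete the partial object and upload again with the limit raised. A rough sketch (bucket and object names are placeholders, and I'm not certain this is the intended way to force an overwrite):

```r
library(googleCloudStorageR)

# raise the threshold below which a simple (non-resumable) upload is used
options(googleCloudStorageR.upload_limit = 1000000000L)

# remove the partial/corrupted object so there is nothing to resume
gcs_delete_object("ft_bucket/trainTestUpload2.csv", bucket = "my-bucket")

# upload again from scratch
gcs_upload(file = "trainTestUpload2.csv",
           bucket = "my-bucket",
           name = "ft_bucket/trainTestUpload2.csv")
```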
I keep running into errors when attempting to upload some .rds datasets to Google Cloud Storage. For example, here's one part of a big data pipeline:
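(The exact code belongs to a larger pipeline; the relevant step is essentially a call of this shape, with file and bucket names as placeholders rather than the real ones.)

```r
library(googleCloudStorageR)

# upload a saved .rds file to a bucket (all names are placeholders)
gcs_upload(file = "data/model_input.rds",
           bucket = "my-project-bucket",
           name = "pipeline/model_input.rds")
```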
Oddly, running this a second time without changing anything works (albeit with warnings).
Another example:
In this case, the upload actually works even though the function errors (although it took about 10 minutes to upload this 2.4 MB file).
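Because of that, I've started double-checking whether the object actually landed despite the error. A quick sketch (bucket and object names are the same placeholders as above):

```r
library(googleCloudStorageR)

# list the bucket's contents and check whether the object that just
# "errored" is actually there
objs <- gcs_list_objects(bucket = "my-project-bucket")
"pipeline/model_input.rds" %in% objs$name
```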
Am I using `gcs_upload()` properly? Any advice on how I can make it run smoother? Note that I'm using googleCloudStorageR v0.5.1. Much appreciated!