cloudyr / googleCloudStorageR

Google Cloud Storage API to R
https://code.markedmondson.me/googleCloudStorageR

Lexical Error: Invalid Char in JSON Text #4

Closed shariharan99 closed 8 years ago

shariharan99 commented 8 years ago

I downloaded the most recent version of this package yesterday and auto-authenticated, but I am getting the following error when trying to get an object from Google Cloud Storage using gcs_get_object:

Request Status Code: 404
Error: lexical error: invalid char in json text.
                                       Not Found
                     (right here) ------^

Note that I am able to use the gcs_list_buckets and gcs_list_objects functions fine and they do give me correct information, which means my auth is fine and it's able to enter the bucket to find what I need.

Here is my session Info:

sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.3 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bit64_0.9-5                  bit_1.1-12                   googleCloudStorageR_0.0.9000 googleAuthR_0.3.1           

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5     dplyr_0.5.0     digest_0.6.9    crayon_1.3.1    assertthat_0.1  R6_2.1.2        bigrquery_0.3.0 jsonlite_1.0   
 [9] DBI_0.4-1       magrittr_1.5    httr_1.2.1      curl_0.9.7      testthat_1.0.2  tools_3.3.0     memoise_1.0.0   openssl_0.9.4  
[17] tibble_1.0 

Here is the code I am running to re-produce the error:

## Automatic Authorization ##
endpoint = httr::oauth_endpoints("google")
secrets = jsonlite::fromJSON("~/workspace/zodiac-metrics.json") #change if necessary
scope = "https://www.googleapis.com/auth/devstorage.full_control"

token = httr::oauth_service_token(
  endpoint = endpoint,
  secrets = secrets,
  scope = scope)

googleAuthR::gar_auth(token)

#Set Project and Bucket
proj <- "my-bucket"
bucket <- "modeling-task"

# These functions work fine
buckets = gcs_list_buckets(proj)
objects = gcs_list_objects(bucket)

#This gets the error
gcs_get_object(proj, objects$name[[973]])

I know you obviously can't get the exact same object, but the file I am trying to get is a .csv, so objects$name[[973]] looks like "model-group/my_data.csv". However, it doesn't really matter what type of file I try to get; I get the same error.

Also I am auto-authorizing because I will be attempting to run this code in a container at some point, so I can't do the browser thing.

MarkEdmondson1234 commented 8 years ago

One problem may be the mixing up of httr auth and googleAuthR auth. You don't need to construct the httr token, as gar_auth_service() will do that for you. CSV files should be fine, as that's pretty much the reason the library was made in the first place (for BigQuery files).

Please try with this code:

library(googleAuthR)
library(googleCloudStorageR)

## set here as well just in case
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")

gar_auth_service("location_of_json_file.json", 
                 scope = "https://www.googleapis.com/auth/devstorage.full_control")

#Set Project and Bucket
proj <- "my-bucket"
bucket <- "modeling-task"

# These functions work fine
buckets = gcs_list_buckets(proj)
objects = gcs_list_objects(bucket)

#This gets the error
gcs_get_object(proj, objects$name[[973]])

Also try with normal OAuth2 (gar_auth()).

Remember that the service account email needs to be added to the appropriate Google Cloud project; that could also be a reason.

If it's still an error, you can use options(googleAuthR.verbose = 1) to get more feedback. Please post that.

shariharan99 commented 8 years ago

Thanks for this info. I ensured that my service account is added to the Google Cloud project. Trying your above code unfortunately produced the same error. Running verbose = 1 gave me the following feedback on the error function:

Token exists.
Valid local token
Request: https://www.googleapis.com/storage/v1/b/zodiac-metrics/o/b08942b8-d7c9-4580-9c42-ec1e76809c0a/raw_tlog.csv?alt=media
-> GET /storage/v1/b/zodiac-metrics/o/b08942b8-d7c9-4580-9c42-ec1e76809c0a/raw_tlog.csv?alt=media HTTP/1.1
-> Host: www.googleapis.com
-> User-Agent: libcurl/7.43.0 r-curl/0.9.3 httr/1.0.0 googleAuthR/0.1.2 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Accept-Encoding: gzip
-> Authorization: Bearer ya29.CjAmA-ckH2tjAz0LqLWFIPE2aYu2ZiWispdDXDWmx9gRTqXC9fvXcTELJere76mLEcg
-> 
<- HTTP/1.1 404 Not Found
<- Cache-Control: no-cache, no-store, max-age=0, must-revalidate
<- Pragma: no-cache
<- Expires: Mon, 01 Jan 1990 00:00:00 GMT
<- Date: Wed, 20 Jul 2016 21:48:08 GMT
<- Vary: Origin
<- Vary: X-Origin
<- Content-Type: text/html; charset=UTF-8
<- Content-Encoding: gzip
<- X-Content-Type-Options: nosniff
<- X-Frame-Options: SAMEORIGIN
<- X-XSS-Protection: 1; mode=block
<- Server: GSE
<- Alternate-Protocol: 443:quic
<- Alt-Svc: quic=":443"; ma=2592000; v="36,35,34,33,32,31,30,29,28,27,26,25"
<- Transfer-Encoding: chunked
<- 
Request Status Code: 404
Error: lexical error: invalid char in json text.
                                       Not Found
                     (right here) ------^

shariharan99 commented 8 years ago

Here is the result for gcs_list_objects which does work:

Token exists.
Valid local token
Request: https://www.googleapis.com/storage/v1/b/modeling-task/o/
-> GET /storage/v1/b/modeling-task/o/ HTTP/1.1
-> Host: www.googleapis.com
-> User-Agent: libcurl/7.43.0 r-curl/0.9.3 httr/1.0.0 googleAuthR/0.1.2 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Accept-Encoding: gzip
-> Authorization: Bearer ya29.CjAmA4Yy9pQ7-uZzo7TfYGSfqepcBbkhOlkcOOq31U6a67mOaB5SW4_InxB8PepdlJ4
-> 
<- HTTP/1.1 200 OK
<- Expires: Wed, 20 Jul 2016 21:49:48 GMT
<- Date: Wed, 20 Jul 2016 21:49:48 GMT
<- Cache-Control: private, max-age=0, must-revalidate, no-transform
<- Vary: Origin
<- Vary: X-Origin
<- Content-Type: application/json; charset=UTF-8
<- Content-Encoding: gzip
<- X-Content-Type-Options: nosniff
<- X-Frame-Options: SAMEORIGIN
<- X-XSS-Protection: 1; mode=block
<- Server: GSE
<- Alternate-Protocol: 443:quic
<- Alt-Svc: quic=":443"; ma=2592000; v="36,35,34,33,32,31,30,29,28,27,26,25"
<- Transfer-Encoding: chunked

The first difference is the HTTP/1.1 404 Not Found vs. HTTP/1.1 200 OK.

shariharan99 commented 8 years ago

Also I did try with normal OAuth2 and that did not work.

MarkEdmondson1234 commented 8 years ago

It 404s if the file isn't found, so I would double-check that the name going in is correct. Auth all looks to be working correctly. At the very least, I need to give better user feedback if the file isn't there.
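
The logs in this thread suggest why a 404 surfaces as a JSON parsing error: the failing response is sent as text/html with the body "Not Found", while the successful listing is application/json. A rough sketch of the kind of check that could give clearer feedback, in base R only. Note that describe_response() is a hypothetical helper for illustration, not a function from googleCloudStorageR or googleAuthR, and the "{}" body in the second call is a placeholder since the log does not show the real body.

```r
# Hypothetical helper: decide how to report a response before any JSON parsing.
# status       - integer HTTP status code
# content_type - value of the Content-Type response header
# body         - response body as a single string
describe_response <- function(status, content_type, body) {
  is_json <- grepl("application/json", content_type, fixed = TRUE)
  if (status >= 400 && !is_json) {
    # Non-JSON error page: report it directly instead of passing it to a JSON parser
    return(sprintf("HTTP %d (non-JSON body): %s", status, body))
  }
  if (status >= 400) {
    return(sprintf("HTTP %d (JSON error body)", status))
  }
  sprintf("HTTP %d OK", status)
}

# Status codes and content types taken from the two verbose logs in this thread
print(describe_response(404L, "text/html; charset=UTF-8", "Not Found"))
print(describe_response(200L, "application/json; charset=UTF-8", "{}"))
```

Checking the content type first means an HTML or plain-text error page is reported as it is, rather than as a lexical error from the parser.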

shariharan99 commented 8 years ago

The file is definitely there, as I am using objects$name[[970]] (as an example) which is pulled from gcs_list_objects. When I try a few other objects I get an error code 400:

Request Status Code: 400
Error: lexical error: invalid char in json text.
                                       <!DOCTYPE html> <html lang=en> 
                     (right here) ------^

shariharan99 commented 8 years ago

I'm going to check what kind of permissions I have. Maybe I only have view-only access and not read access.

MarkEdmondson1234 commented 8 years ago

@shariharan99 did you find out if you had the right permissions? Can I close this or shall I keep it in the pipeline?

MarkEdmondson1234 commented 8 years ago

Marking this closed until I hear otherwise

MarkEdmondson1234 commented 8 years ago

There is a new function that lets you check user access, which would help diagnose this: gcs_get_object_access()

MarkEdmondson1234 commented 8 years ago

One thing I just came across that may have been the cause: the object name needs to be URL encoded. If the name holds any / characters, the request will return a 404. To fix this, wrap the name in URLencode(). This will be the default in later releases.
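
A minimal sketch of the encoding step described above, using only base R. One detail worth checking: utils::URLencode() leaves reserved characters such as / untouched unless reserved = TRUE is passed, so a plain URLencode() call would not change a name like the one in the original report. The object name below is the illustrative one from the first comment; whether a given package version already encodes names internally is not confirmed here.

```r
# Object name containing a "/" (the illustrative name from the original report)
object_name <- "model-group/my_data.csv"

# Default call: reserved characters such as "/" are left as they are
default_encoded <- utils::URLencode(object_name)

# reserved = TRUE percent-encodes "/" too, so the name stays a single path segment
fully_encoded <- utils::URLencode(object_name, reserved = TRUE)

print(default_encoded)
print(fully_encoded)
```

Comparing the two printed values is a quick way to see whether the slash was actually encoded before the name is passed on.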

shariharan99 commented 8 years ago

I'm sorry for not responding earlier. I do not believe it was a user_access issue as I was able to perform other actions within the bucket. I also had appropriate permissions. I ended up going a different route so I am good now though. Thanks so much for following up.

rmaranhao commented 6 years ago

Ok. AMAZING TOOL!

I arrived here while trying to authenticate in the "remote" machine.

It took me quite a while to put together some simple code, so I'll leave it here. My problem: transferring data from my computer to GCS.

#*****************************************************************
# Local Code
# I HAVE modified the environment variables to get this to work.
#*****************************************************************

setwd("RDataDir")
library(googleComputeEngineR)
vm <- gce_vm(template = "rstudio", name = "xxxx", 
             username = "xxxx", password = "xxxx", 
             predefined_type = "n1-highmem-2")

library(googleCloudStorageR)
setwd("xxxx")
gcs_save_image(file = ".RData", bucket = "xxxx",
               saveLocation = NULL, envir = parent.frame())

#*****************************************************************
# Remote Code
#*****************************************************************

#
# You must copy the JSON key file using the upload function
# The JSON key file must have access to your bucket.
#

install.packages("googleCloudStorageR", dependencies = TRUE)
library(googleAuthR)
library(googleCloudStorageR)

options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")

gar_auth_service("R-bucket.json", 
                 scope = "https://www.googleapis.com/auth/devstorage.full_control")

#Set Project and Bucket
proj <- "xxxx"
bucket <- "xxxx"

gcs_load(file = ".RData", bucket = bucket)

rmaranhao commented 6 years ago

My environment file gained the following lines.

On Windows the file is located at C:\Program Files\R\R-3.4.3\etc and is called Renviron.site. The editor must have admin privileges to edit the file.

GCE_AUTH_FILE="D:/Instances.json"
GCE_DEFAULT_PROJECT_ID="r-project-xxxxx"
GCE_DEFAULT_ZONE="southamerica-east1-c"

GCS_AUTH_FILE="D:/R-bucket.json"
GCS_DEFAULT_PROJECT_ID="r-project-xxxxxxx"
GCS_DEFAULT_ZONE="southamerica-east1-c"
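
A quick way to confirm that R picked up these entries after a restart, using only base R. The variable names are the ones listed above; check_auth_env() is a hypothetical helper written for illustration, and whether each package reads every one of these variables is not confirmed here.

```r
# Hypothetical helper: report whether each variable is set and, if so,
# whether the file it points to exists on disk.
check_auth_env <- function(vars = c("GCE_AUTH_FILE", "GCS_AUTH_FILE")) {
  # unset = NA makes missing variables distinguishable from empty ones
  values <- Sys.getenv(vars, unset = NA, names = FALSE)
  paths <- ifelse(is.na(values), "", values)
  data.frame(
    variable = vars,
    is_set = !is.na(values),
    file_exists = !is.na(values) & file.exists(paths),
    stringsAsFactors = FALSE
  )
}

print(check_auth_env())
```

A row with is_set TRUE but file_exists FALSE would point to a wrong path in the environment file rather than to an authentication problem.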

MarkEdmondson1234 commented 6 years ago

Hi @rmaranhao, is your issue solved now? If you want to authenticate on a Google Compute Engine instance (perhaps launched via googleComputeEngineR or via the web UI), I suggest using googleAuthR::gar_gce_auth(), as that will use the shared authentication for the entire Google project. For example, if your bucket and VM are in the same project, it will just work.

samudzi commented 4 years ago

I am seeing this same error when attempting to use the gcs_upload function.

Code here:

uploadCSV <- function(filename = "age_complete", df, path = "~/ona/data/") {
  setwd(path)
  csv <- paste0(filename, ".csv")
  write.csv(df, file = csv)
  payload <- paste0(path, csv)
  browser()
  upload_try <- gcs_upload(payload)
}

The JSON auth file and bucket name are set in the .Renviron file, and I am able to list buckets, so there are no connection issues. This is using the version of this package available on CRAN.

MarkEdmondson1234 commented 4 years ago

Could you please open a new issue, @samudzi, and include the feedback you get when attempting to upload with options(googleAuthR.verbose = 2) set before the call?

PaulMontesinosOA commented 2 years ago

Hello!

I got the same issue with gcs_get_object():

Error : lexical error: invalid char in json text.
                                       Not Found
                     (right here) ------^

Before, you mentioned that

One thing I just came across that may have been this is that the name needs to be URL encoded, if it holds any / in the name it will be a 404 - to fix add URLencode(). This will be default in later releases.

and I really think it comes from the presence of / since it only appears when I try to access subfolders inside my buckets.

URLencode() didn't seem to solve this. Is there something else I could try to solve this?

Thank you

MarkEdmondson1234 commented 2 years ago

Could you please open a new issue? Otherwise I won't be able to track and resolve it. If you can also include the code that generated the issue and your sessionInfo(), that will also help :)