cloudyr / googleCloudStorageR

Google Cloud Storage API to R
https://code.markedmondson.me/googleCloudStorageR

Lexical error: invalid char in json text when getting an object from a subfolder with gcs_get_object #151

Closed PaulMontesinosOA closed 3 years ago

PaulMontesinosOA commented 3 years ago

Hello,

When I try to retrieve an object from a bucket, I get the following error when the object is in a subfolder:

gcs_get_object(object_name = "test_textfile_subfolder.txt",
               bucket = "gs://test_export_r/subfolder")

ℹ 2021-11-08 13:28:36 > Request Status Code:  404
Error : lexical error: invalid char in json text.
                                       Not Found
                     (right here) ------^

Not FoundError in gcs_get_object(object_name = "test_textfile_subfolder.txt", bucket = "gs://test_export_r/subfolder") : 
  File not found. Check object_name and if you have read permissions.
           Looked for test_textfile_subfolder.txt

However, when I do the same with a file that is not in a subfolder, it works perfectly.

gcs_get_object(object_name = "test_textfile.txt",
               bucket = "gs://test_export_r")
Downloaded test_textfile.txt
Object parsed to class: character
[1] "Nothing important, just a test file"

Here is the sessionInfo() output if that can help:

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] googleCloudStorageR_0.6.0

loaded via a namespace (and not attached):
 [1] digest_0.6.27     assertthat_0.2.1  R6_2.5.1          jsonlite_1.7.2    zip_2.2.0        
 [6] httr_1.4.2        rlang_0.4.11      cachem_1.0.6      cli_3.0.1         curl_4.3.2       
[11] renv_0.14.0       rstudioapi_0.13   fs_1.5.0          googleAuthR_1.4.0 tools_4.1.1      
[16] glue_1.4.2        yaml_2.2.1        fastmap_1.1.0     compiler_4.1.1    askpass_1.1      
[21] gargle_1.2.0      memoise_2.0.0     openssl_1.4.5  

MarkEdmondson1234 commented 3 years ago

I think you need to specify the sub-folder in your object name, not the bucket name:

gcs_get_object(object_name = "subfolder/test_textfile_subfolder.txt",
               bucket = "gs://test_export_r")

PaulMontesinosOA commented 3 years ago

Oh, it works. I now feel embarrassed that the solution is that simple.

Actually, specifying the full path in the object_name argument also works:

gcs_get_object(object_name = "gs://test_export_r/subfolder/test_textfile_subfolder.txt")

Thank you for your help!

MarkEdmondson1234 commented 3 years ago

No worries. Yes, you can pass the full gs:// URI too, as a convenience.
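
For anyone landing here with the same 404: Cloud Storage has a flat namespace, so a "subfolder" is only a prefix of the object name, and the bucket is just the first path component. The snippet below is a minimal sketch of that split in base R. `split_gs_uri` is a hypothetical helper written for illustration, not a function from googleCloudStorageR, and it assumes a single well-formed gs:// URI.

```r
# Hypothetical helper (NOT part of googleCloudStorageR): split a gs:// URI
# into the bucket and the object name, keeping any "folder" prefix in the
# object name rather than in the bucket.
split_gs_uri <- function(uri) {
  stopifnot(is.character(uri), length(uri) == 1, startsWith(uri, "gs://"))
  path  <- sub("^gs://", "", uri)                  # drop the scheme
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]  # split on "/"
  list(
    bucket      = parts[1],                          # first component only
    object_name = paste(parts[-1], collapse = "/")   # everything after it
  )
}

loc <- split_gs_uri("gs://test_export_r/subfolder/test_textfile_subfolder.txt")
loc$bucket       # "test_export_r"
loc$object_name  # "subfolder/test_textfile_subfolder.txt"

# The corresponding download call (needs authentication, so not run here):
# gcs_get_object(object_name = loc$object_name, bucket = loc$bucket)
```

In other words, passing "gs://test_export_r/subfolder" as the bucket fails because no bucket has that name. The prefix has to travel with the object name.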