Open justinjm opened 5 years ago
It can be really fussy - the only difference I see is the request URL:
https://automl.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/datasets
vs in your logs:
2019-07-10 14:08:45> Request: https://automl.googleapis.com/v1beta1/projects/XXXXXXXXXXXXXXXXXX/locations/us-central1/datasets/
2019-07-10 14:08:45> Body JSON parsed to: {"displayName":"test_02","tablesDatasetMetadata":{}}
e.g. trailing slash? You can turn that off by passing checkTrailingSlash = FALSE in gar_api_generator()
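A minimal sketch of what that suggestion could look like, assuming the dataset endpoint quoted above. The project ID and the identity parse function are placeholders for illustration, not code from the package:

```r
library(googleAuthR)

# Hypothetical project ID; replace with your own.
project_id <- "my-project"

# Build the API call without letting googleAuthR append a trailing slash.
create_dataset_call <- gar_api_generator(
  baseURI = sprintf(
    "https://automl.googleapis.com/v1beta1/projects/%s/locations/us-central1/datasets",
    project_id
  ),
  http_header = "POST",
  data_parse_function = function(x) x,  # return the parsed response unchanged
  checkTrailingSlash = FALSE
)
```

The returned function is then called with the request body, e.g. create_dataset_call(the_body = body), once authentication is set up.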
Thank you for the guidance @MarkEdmondson1234, much appreciated and good catch :)
Initial attempt with your suggestion unfortunately yields the same error but will keep at it...
2019-07-10 23:53:52> Request Status Code: 400
2019-07-10 23:53:52> API returned error: List of found errors: 1.Field: dataset.dataset_metadata; Message: Required field not set.
2019-07-10 23:53:52> No retry attempted: List of found errors: 1.Field: dataset.dataset_metadata; Message: Required field not set.
Scopes: https://www.googleapis.com/auth/cloud-platform
App key: XXXXXXX.apps.googleusercontent.com
Method: filepath
Error: API returned: List of found errors: 1.Field: dataset.dataset_metadata; Message: Required field not set.
> readRDS("request_debug.rds")
$url
[1] "https://automl.googleapis.com/v1beta1/projects/XXXXXXX/locations/us-central1/datasets"
$request_type
[1] "POST"
$body_json
{"displayName":"test_02","tablesDatasetMetadata":{}}
Ah, the actual request is removing the tablesDatasetMetadata field:
-> POST /v1beta1/projects/XXXXXXXXXXXXXXXXXX/locations/us-central1/datasets/ HTTP/1.1
-> Host: automl.googleapis.com
-> User-Agent: googleAuthR/0.7.0 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/json
-> Accept-Encoding: gzip
-> Authorization: Bearer 1234567890123456
-> Content-Length: 25
->
>> {"displayName":"test_02"}
It's actually really weird that they ask you to send in an empty field for the request to be valid. Can you try sending in a space instead? e.g. " "
@MarkEdmondson1234 Thank you for the continued help, really appreciate it!
I tried adding a space as suggested (2 different ways to be sure) and am now getting a new error, perhaps an encouraging one since it recognizes the tablesDatasetMetadata field:
Error: API returned: Invalid JSON payload received. Unknown name "tablesDatasetMetadata" at 'dataset': Proto field is not repeating, cannot start list
-> POST /v1beta1/projects/xxxxx/locations/us-central1/datasets HTTP/1.1
-> Host: automl.googleapis.com
-> User-Agent: googleAuthR/0.7.0 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/json
-> Accept-Encoding: gzip
-> Authorization: Bearer xxxxx
-> Content-Length: 55
->
>> {"displayName":"test_02","tablesDatasetMetadata":[" "]}
<- HTTP/1.1 400 Bad Request
<- Vary: Origin
<- Vary: X-Origin
<- Vary: Referer
<- Content-Type: application/json; charset=UTF-8
<- Content-Encoding: gzip
<- Date: Thu, 11 Jul 2019 12:06:41 GMT
<- Server: ESF
<- Cache-Control: private
<- X-XSS-Protection: 0
<- X-Frame-Options: SAMEORIGIN
<- X-Content-Type-Options: nosniff
<- Alt-Svc: quic=":443"; ma=2592000; v="46,43,39"
<- Transfer-Encoding: chunked
<-
2019-07-11 08:06:41> Request Status Code: 400
2019-07-11 08:06:41> API returned error: Invalid JSON payload received. Unknown name "tablesDatasetMetadata" at 'dataset': Proto field is not repeating, cannot start list.
2019-07-11 08:06:41> No retry attempted: Invalid JSON payload received. Unknown name "tablesDatasetMetadata" at 'dataset': Proto field is not repeating, cannot start list.
Scopes: https://www.googleapis.com/auth/cloud-platform
App key: xxxxxxxx.apps.googleusercontent.com
Method: filepath
Error: API returned: Invalid JSON payload received. Unknown name "tablesDatasetMetadata" at 'dataset': Proto field is not repeating, cannot start list.
So it seems like the "tablesDatasetMetadata" field isn't a list object as the API expects since it's "boxed in"? Strange to me, since it's my understanding that googleAuthR handles the unboxing in jsonlite (here: googleAuthR/R/generator). Or am I missing something?
Or perhaps I am also misunderstanding how jsonlite handles empty objects?
Although after finding this httr issue, I was stumped on any alternative ways to create and pass the JSON object through googleAuthR!
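On the question of how jsonlite handles empty objects, here is a small sketch that can be run locally, independent of googleAuthR. It only shows what jsonlite::toJSON() emits; it does not show what googleAuthR does to the body before sending it:

```r
library(jsonlite)

# An unnamed empty list serializes to a JSON array.
empty_unnamed <- toJSON(list())

# An empty *named* list serializes to a JSON object.
empty_named <- toJSON(structure(list(), names = character(0)))

# The request body from this thread, built with an empty named list.
body <- list(
  displayName = "test_02",
  tablesDatasetMetadata = structure(list(), names = character(0))
)
body_json <- toJSON(body, auto_unbox = TRUE)

print(empty_unnamed)  # []
print(empty_named)    # {}
print(body_json)      # {"displayName":"test_02","tablesDatasetMetadata":{}}
```

So jsonlite itself can produce the `{}` the API asks for; the debug output earlier in the thread suggests the field is dropped at a later step.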
Yes, progress. You need jsonlite::unbox() to stop the entry being turned into a list.
There is an example in googleLanguageR doing this: you apply unbox to the elements of the list object that is being turned into the body, e.g.
jubox <- function(x) jsonlite::unbox(x)
body <- list(
  document = list(
    type = jubox(type),
    language = jubox(language)
  ),
  encodingType = encodingType
)
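To make the effect of unbox concrete, here is a short sketch using the single-space value tried earlier in the thread. It uses only jsonlite and is an illustration, not the googleLanguageR code:

```r
library(jsonlite)

# Without unbox, a length-one character vector becomes a JSON array.
boxed <- toJSON(list(tablesDatasetMetadata = " "))

# With unbox, the same value becomes a JSON scalar string.
unboxed <- toJSON(list(tablesDatasetMetadata = unbox(" ")))

print(boxed)    # {"tablesDatasetMetadata":[" "]}
print(unboxed)  # {"tablesDatasetMetadata":" "}
```

Note that unboxing produces a string, not an object, so it removes the "cannot start list" error but does not by itself give the API the empty object it expects.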
Thank you for sharing the function from googleLanguageR! I was looking for something like it in the googleAuthRVerse and I missed that.
But AutoML still doesn't want to play nice after 2 more attempts :) Latest error:
-> POST /v1beta1/projects/xxxx/locations/us-central1/datasets HTTP/1.1
-> Host: automl.googleapis.com
-> User-Agent: googleAuthR/0.7.0 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/json
-> Accept-Encoding: gzip
-> Authorization: Bearer xxxx
-> Content-Length: 55
->
>> {"displayName":"test_02","tablesDatasetMetadata":"{ }"}
<- HTTP/1.1 400 Bad Request
<- Vary: Origin
<- Vary: X-Origin
<- Vary: Referer
<- Content-Type: application/json; charset=UTF-8
<- Content-Encoding: gzip
<- Date: Thu, 11 Jul 2019 19:38:26 GMT
<- Server: ESF
<- Cache-Control: private
<- X-XSS-Protection: 0
<- X-Frame-Options: SAMEORIGIN
<- X-Content-Type-Options: nosniff
<- Alt-Svc: quic=":443"; ma=2592000; v="46,43,39"
<- Transfer-Encoding: chunked
<-
2019-07-11 15:38:26> Request Status Code: 400
2019-07-11 15:38:26> API returned error: Invalid value at 'dataset.tables_dataset_metadata' (type.googleapis.com/google.cloud.automl.v1beta1.TablesDatasetMetadata), "{ }"
2019-07-11 15:38:26> No retry attempted: Invalid value at 'dataset.tables_dataset_metadata' (type.googleapis.com/google.cloud.automl.v1beta1.TablesDatasetMetadata), "{ }"
Scopes: https://www.googleapis.com/auth/cloud-platform
App key: xxxx.apps.googleusercontent.com
Method: filepath
Error: API returned: Invalid value at 'dataset.tables_dataset_metadata' (type.googleapis.com/google.cloud.automl.v1beta1.TablesDatasetMetadata), "{ }"
Leaving this open, will focus on building out other functions. Non-programmatic and easy temporary solution: manually create a dataset in UI :)
Looks like api moving out of beta; v1 released recently. Will revisit this at some point to see if new api version is more welcoming :)
https://cloud.google.com/automl/docs/reference/rest/v1/projects.locations.datasets/create
Got a hack working! Next steps detailed in commit message: https://github.com/justinjm/googleCloudAutoMLTablesR/commit/7607075684ac3d0e88b340bf7de35358949c3a53
Summary
My goal is to create a function that creates a dataset in Google Cloud AutoML Tables. This function exists in the AutoML Tables Python client library and is used in a GCP tutorial, and I'd like to emulate that functionality in an R package using googleAuthR as the framework for authentication and functions. Any hints or help from anyone would be much appreciated :)
Thank you in advance! Justin
Hypothesis
It's likely this error is related to the API request body being improperly formatted before it is passed into gar_api_generator(). Throughout my trial and error (check out the git history for more), I've gotten different Invalid value errors for the same field, dataset.dataset_metadata. It seems like I need to send an empty or null value in the create.dataset POST request.
Documentation links
projects.locations.datasets.create
GCP Tutorial (using the create.dataset() function from the Python client library, snippet also below) - purchase_prediction.ipynb
What goes wrong
Can't create a dataset: gcat_create_dataset() fails with a 400 error.
Steps to reproduce the problem
googleCloudAutoMLTablesR from github
vignettes/quick_start.Rmd
Expected output
Referencing the CURL example from GCP documentation, here is the desired response
Actual output
gcat_create_dataset() fails with a 400 error
'API Data failed to parse' diagnostics
Debug info
Session Info
What I've Tried
I've tried the following to format the API request body properly here: googleCloudAutoMLTablesR/datasets.R
list()
structure(list(), names=character(0)) (inspired by SO answer)
I also tried to copy/paste the CURL API request body from the GCP documentation to see if I was missing something
CURL example: Creating and managing datasets | AutoML Tables Documentation | Google Cloud
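One way to separate a serialization problem from an API problem is to send the body as a pre-built JSON string, much as the CURL example does. The sketch below assumes httr and jsonlite are installed; build_dataset_body() and post_dataset() are hypothetical helper names, and the access token is assumed to come from elsewhere. It is not the hack from the linked commit.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: serialize the body so the metadata field is an empty object.
build_dataset_body <- function(display_name) {
  body <- list(
    displayName = display_name,
    tablesDatasetMetadata = structure(list(), names = character(0))
  )
  as.character(toJSON(body, auto_unbox = TRUE))
}

# Hypothetical helper: POST the raw JSON string directly, bypassing any
# list-to-JSON conversion. `access_token` is assumed to be a valid OAuth2 token.
post_dataset <- function(project_id, access_token, display_name,
                         location = "us-central1") {
  url <- sprintf(
    "https://automl.googleapis.com/v1beta1/projects/%s/locations/%s/datasets",
    project_id, location
  )
  POST(
    url,
    add_headers(Authorization = paste("Bearer", access_token)),
    content_type_json(),
    body = build_dataset_body(display_name),
    encode = "raw"
  )
}

print(build_dataset_body("test_02"))
```

If a request built this way succeeds while the googleAuthR call fails, that would point at how the body list is processed before sending rather than at the API.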