rstudio / cloudml

R interface to Google Cloud Machine Learning Engine
https://tensorflow.rstudio.com/tools/cloudml/

running script yields strsplit error #218

Open datengefluester opened 3 years ago

datengefluester commented 3 years ago

Hi, I am trying to reproduce a tutorial for the package that I found on YouTube. However, when I try to submit a training script, I get the following error:

Submitting training job to CloudML...
Error in strsplit(a, "[.-]") : non-character argument

So far, my toy code looks as follows:

library(cloudml)
gcloud_init()
cloudml::cloudml_train("test.R", config = "cloudml.yml")

where 'test.R' is basically the toy example from the 'xgboost' package:

library(xgboost)

data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')

train <- agaricus.train
test <- agaricus.test
bstSparse <- xgboost(data = train$data, label = train$label, max.depth = 2, eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")

saveRDS(bstSparse, "bstSparse.rds")

The cloudml.yml contains the following. Deleting everything except runtimeVersion, which is needed to prevent another error, does not solve the issue:

trainingInput:
  scaleTier: CUSTOM
  masterType: large_model
  runtimeVersion: 2.2

Here's some session info (I tried R 3.4.4 earlier but I got the same error):

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS
cloudml_0.6.1

Any idea of what's wrong?

alpopesc commented 3 years ago

I have pretty much the same problem. I would be very much interested to see a resolution.

data-vader commented 3 years ago

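The error comes from "runtimeVersion: 2.2". Left unquoted, YAML reads 2.2 as a number, and strsplit() only accepts character input, hence "non-character argument".

As specified on https://cloud.google.com/ai-platform/training/docs/reference/rest/v1/projects.jobs#traininginput

runtimeVersion has to be passed as a string, so use the following instead: runtimeVersion: "2.2"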
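
For anyone who wants to see the difference locally, here is a minimal base R sketch. It is not the package's actual code. It only assumes that the unquoted YAML value reaches strsplit() as a number, and it reuses the "[.-]" pattern from the error message above. The variable names are made up for illustration.

```r
# Pattern copied from the error message in this thread.
version_pattern <- "[.-]"

# Assumption: an unquoted `runtimeVersion: 2.2` arrives in R as a numeric value.
numeric_version <- 2.2
numeric_result <- tryCatch(
  strsplit(numeric_version, version_pattern),
  error = function(e) conditionMessage(e)  # capture the message instead of stopping
)
print(numeric_result)  # the error message raised for non-character input

# A quoted `runtimeVersion: "2.2"` arrives as a character string and splits fine.
string_version <- "2.2"
string_result <- strsplit(string_version, version_pattern)[[1]]
print(string_result)   # the two version components as strings
```

The same applies to any version-like value in cloudml.yml: quoting it keeps it a string all the way through.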

datengefluester commented 3 years ago

@data-vader This solved the issue! Thank you so much!