Open andrie opened 6 years ago
This still happens. Another terminal dump, in case it helps:
INFO 2018-04-24 14:54:35 +0100 master-replica-0 / [5/48 files][900.0 KiB/ 61.0 MiB] 1% Done
INFO 2018-04-24 14:54:35 +0100 master-replica-0 Copying gs://adv-cloudml-test-195616/r-cloudml/cache/ubuntu_16044_lts/r_3_4_4/r/packrat.tar...
INFO 2018-04-24 14:54:35 +0100 master-replica-0 / [6/48 files][ 3.0 MiB/ 61.0 MiB] 4% Done
IERROR: gcloud crashed (IOError): [Errno 0] Error
Most likely, this is external and we would need a consistent repro to open an issue with Google CloudML. I've seen this a couple times, but I can't hit this consistently.
I got the same problem when running mnist_mlp.R (https://github.com/rstudio/keras/blob/master/vignettes/examples/mnist_mlp.R) with cloudml_train on Google Cloud Platform.
I think the download step does not work properly. No local runs directory is created either, although the mnist_mlp.R script normally produces one. I suspect job_collect is the problem:
cloudml::job_collect('Project Name', destination = '../runs', view = 'save')
This call does not copy anything into the destination folder.
Any idea what we can do?
R commands:
library(cloudml)
cloudml_train("mnist_mlp.R", config = "config.yml")
config.yml:
trainingInput:
  scaleTier: BASIC
  runtimeVersion: "2.1"
  pythonVersion: "3.7"
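Since the crash seems to happen in the local gcloud process while the job itself keeps running on CloudML, one possible workaround is to retry the collection step afterwards. This is only a minimal sketch, not a confirmed fix. `with_retry` is a hypothetical helper, not part of cloudml, and it assumes that a failed `job_collect` call surfaces as an R error, which nobody in this thread has verified.

```r
# Hypothetical helper: call fn() up to `attempts` times, pausing between tries.
with_retry <- function(fn, attempts = 3, wait_seconds = 5) {
  stopifnot(attempts >= 1)
  for (i in seq_len(attempts)) {
    # Capture an error as a condition object instead of aborting.
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message(sprintf("Attempt %d of %d failed: %s",
                    i, attempts, conditionMessage(result)))
    if (i < attempts) Sys.sleep(wait_seconds)
  }
  stop(result)  # re-raise the last error once all attempts are used up
}

# Assumed usage (needs cloudml and an authenticated gcloud, so left commented out):
# with_retry(function() cloudml::job_collect("latest", destination = "runs"))
```

As far as I can tell, the first argument of `job_collect` is a job identifier rather than a project name, so it may also be worth checking what is passed there.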
Did we make any progress here? I just saw that this issue has been open for a long time.
This may not be an R issue, but something on the CloudML end.
I received a crash report in the terminal, despite the job still running on CloudML.
This happens after submitting:
Terminal output: