GoogleCloudPlatform / cloudml-samples

Cloud ML Engine repo. Please visit the new Vertex AI samples repo at https://github.com/GoogleCloudPlatform/vertex-ai-samples
https://cloud.google.com/ai-platform/docs/
Apache License 2.0

Updates to Flowers tutorial gcloud beta ml syntax. #12

Closed by bw4sz 7 years ago

bw4sz commented 7 years ago

It looks like there are a few changes to the gcloud beta ml syntax since this tutorial was made.

In sample.sh,

line 6 reads

PROJECT=$(gcloud config list project --format "value(core.project)")

should read

PROJECT=$(gcloud beta config list project --format "value(core.project)")

On lines 70 and 76, gcloud beta ml versions

is now gcloud beta ml models versions

Also note that USER is never defined in the tutorial, but it is required: without it, GCS_PATH ends up with an awkward __ gap in it, and Dataflow doesn't seem to like the empty folder name.

This fixed sample.sh for me, running from the docker instance specified here

docker pull gcr.io/cloud-datalab/datalab:local
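
For anyone hitting the undefined USER problem described above, here is a minimal sketch of a guard that could go near the top of a script like sample.sh. The bucket name and the exact layout of JOB_ID and GCS_PATH are assumptions for illustration, not copied from the tutorial; the only point is that USER gets a non-empty value before it is interpolated.

```shell
# Give USER a value when the environment (e.g. a container) leaves it unset or empty.
USER="${USER:-$(id -un 2>/dev/null || echo anonymous)}"

# Hypothetical names: the real bucket and path layout in sample.sh may differ.
BUCKET="gs://my-project-ml"
JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
GCS_PATH="${BUCKET}/${USER}/${JOB_ID}"

echo "Job ID:   ${JOB_ID}"
echo "GCS path: ${GCS_PATH}"
```

With USER set, the job ID no longer contains a doubled underscore and the path has no empty folder segment.
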

joshgc commented 7 years ago

Could you check which version of gcloud you are using with gcloud version? I suspect that Docker image is old. I just checked on Cloud Shell (gcloud version 138) and didn't need "models" before "versions".

Thanks!

bw4sz commented 7 years ago

Hey Josh, thanks for the feedback. I've been working through the tutorial for about a week. Great stuff.

root@db4cf62ec620:/# gcloud version
Google Cloud SDK 134.0.0

I pulled the Docker image last week. Should I pull again? It might help explain some of the small changes I've found. Or should I just re-install the Google Cloud SDK on the Docker image (it's the one under the getting started instructions on the Cloud ML website)?

I am also having trouble with

gcloud beta ml jobs stream-logs "$JOB_ID"

not returning anything, and having to add a manual sleep 25m to get it to wait until training is done.
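
As an alternative to a fixed sleep, a script could poll the job state until it reaches a terminal value. This is only a sketch: it assumes that gcloud beta ml jobs describe is available in the installed SDK and that the job resource exposes a state field with values such as SUCCEEDED, FAILED and CANCELLED. The function names get_job_state and wait_for_job are made up for this example.

```shell
# Assumption: `gcloud beta ml jobs describe` exists in this SDK version and
# the job resource has a `state` field.
get_job_state() {
  gcloud beta ml jobs describe "$1" --format "value(state)"
}

# wait_for_job JOB_ID [POLL_SECONDS]
# Returns 0 on SUCCEEDED, 1 on FAILED or CANCELLED, 2 if the state lookup fails.
wait_for_job() {
  wfj_job_id="$1"
  wfj_interval="${2:-60}"
  while true; do
    wfj_state="$(get_job_state "$wfj_job_id")" || return 2
    case "$wfj_state" in
      SUCCEEDED)
        return 0 ;;
      FAILED|CANCELLED)
        echo "Job $wfj_job_id ended in state $wfj_state" >&2
        return 1 ;;
    esac
    # Any other state (queued, running, ...) means the job is still in progress.
    sleep "$wfj_interval"
  done
}
```

In the script, wait_for_job "$JOB_ID" would then replace the sleep 25m line, so the next step starts as soon as training finishes rather than after a guessed delay.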

bw4sz commented 7 years ago

Can you confirm whether users should be working from the :local or :latest tag?

I am up to date with local:

C:\Users\Ben\AppData\Local\Google\Cloud SDK>docker pull gcr.io/cloud-datalab/datalab:local
local: Pulling from cloud-datalab/datalab
386a066cd84a: Already exists
a3ed95caeb02: Already exists
e0d51b098569: Already exists
ef371b137a80: Already exists
64ce4acbd5e6: Already exists
985d50bd0761: Already exists
f0eda7b2aaa4: Already exists
812b210897b1: Already exists
3d0bf656e4a0: Already exists
c4ea72a3d790: Already exists
788988a83015: Already exists
617d5fd837d5: Already exists
5daded39bf68: Already exists
0902f780f103: Already exists
Digest: sha256:2fa1a7fb70397987311a49d3352509a24a201c1fecb726209fec221ed10589d1
Status: Image is up to date for gcr.io/cloud-datalab/datalab:local

I can pull :latest, but it gives quite a few errors when I try to run it.

C:\Users\Ben\AppData\Local\Google\Cloud SDK>docker run -it gcr.io/cloud-datalab/datalab
Your active configuration is: [NONE]

Your active configuration is: [NONE]

creating master branch
Initialized empty Git repository in /master_branch/.git/

*** Please tell me who you are.

Run

  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"

to set your account's default identity.
Omit --global to set the identity only in this repository.

fatal: unable to auto-detect email address (got 'root@9d3b13fca782.(none)')
error: src refspec master does not match any.
error: failed to push some refs to 'https://source.developers.google.com/p//'
failed creating master branch

The Cloud ML getting started guide

https://cloud.google.com/ml/docs/how-tos/getting-set-up

specifies :local, which I'm up to date with.

bw4sz commented 7 years ago

For those following along at home: @joshgc is correct. The install instructions specify the :local tag, but going to the Google container repo and grabbing a more recent build is successful.

C:\Users\Ben\AppData\Local\Google\Cloud SDK>docker run -it -p "127.0.0.1:8080:8080" --entrypoint=/bin/bash  gcr.io/cloud-datalab/datalab:local-20170108
root@8b9f503f4d45:/# gcloud beta ml versions
ERROR: (gcloud.beta.ml.versions) too few arguments
Usage: gcloud beta ml versions [optional flags] <command>
  command may be         create | delete | describe | list | set-default

For detailed information on this command and its flags, run:
  gcloud beta ml versions --help

bw4sz commented 7 years ago

Nope, I spoke too soon. That build has some issues.

root@8b9f503f4d45:/# gcloud init

Welcome! This command will take you through the configuration of gcloud.

Your current configuration has been set to: [default]

You can skip diagnostics next time by using the following flag:
  gcloud init --skip-diagnostics

Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic (1/1 checks) passed.

ERROR: gcloud crashed (CannotConnectToMetadataServerException): <urlopen error [Errno -2] Name or service not known>

Will report back when I find a good build that has an updated SDK.

bw4sz commented 7 years ago

Forgive the peppering. I can confirm that backing off to a build from a few days earlier is successful. The Dec 27th build works fine.

docker run -it -p "127.0.0.1:8080:8080" --entrypoint=/bin/bash  gcr.io/cloud-datalab/datalab:local-20161227
root@e0befa21049f:~/MeerkatReader# gcloud version
Google Cloud SDK 138.0.0
alpha 2016.01.12
beta 2016.01.12
bq 2.0.24
bq-nix 2.0.24
core 2016.12.09
core-nix 2016.11.07
gcloud
gsutil 4.22
gsutil-nix 4.18
root@e0befa21049f:~/MeerkatReader#
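
Since the syntax differences in this thread came down to the SDK version inside the image, a small check at the start of a script can catch an old image early. The sketch below parses the first line of gcloud version output as shown above. The helper names are invented for this example, and the threshold of 138 is just the version reported as working in this thread, not an officially documented cutoff.

```shell
# Print the major version parsed from `gcloud version` output read on stdin.
sdk_major() {
  sed -n 's/^Google Cloud SDK \([0-9][0-9]*\)\..*/\1/p'
}

# Succeed when the SDK major version read on stdin is at least $1.
sdk_at_least() {
  sdk_found="$(sdk_major)"
  [ -n "$sdk_found" ] && [ "$sdk_found" -ge "$1" ]
}

# Demonstration with the two versions reported in this thread.
echo "Google Cloud SDK 134.0.0" | sdk_at_least 138 || echo "134: too old, try a newer image"
echo "Google Cloud SDK 138.0.0" | sdk_at_least 138 && echo "138: ok"
```

Against a real install the check would read gcloud version | sdk_at_least 138.
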

joshgc commented 7 years ago

Hi Ben,

Thanks for all the feedback! A few thoughts: