Thanks for reporting this. We'll investigate.
Original comment by z...@google.com
on 4 Jan 2016 at 1:54
Let me know if you'd like the full log files; I don't think they contain much more than what I've already posted.
Original comment by michael....@gmail.com
on 4 Jan 2016 at 4:02
Michael, are you the same person who reported the issue on Stack Overflow:
http://stackoverflow.com/questions/34423857/gae-managed-vm-deploy-gets-stuck-on-docker-build ?
Either way, are you able to reproduce this consistently? If so, could you share
your Dockerfile? Feel free to email it to me directly at dlorenc@google.com
If you're able to SSH into the VM, the output of "cat /var/log/docker.log"
might be helpful.
Original comment by dlor...@google.com
on 5 Jan 2016 at 10:09
I was the one who commented on the post, but I'm not the original poster.
I'll look into pulling those files.
Michael Sander
michael.sander@gmail.com
607-227-9859
Original comment by michael....@gmail.com
on 6 Jan 2016 at 2:33
I've been having this issue too with my Node app.
In my office, some people can deploy the app just fine and others are hitting this bug, despite us all using the same version of the gcloud SDK (91.0.1), the same OS (OS X 10.11), the same version of Python (2.7.10, the Apple build), etc.
I can provide some of the info you asked for too; see below.
Dockerfile:
FROM node:4.2.3
RUN apt-get update -y && apt-get install -y -q --fix-missing \
    graphicsmagick \
    imagemagick
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY package.json /usr/src/app/
RUN npm --unsafe-perm install --production
COPY . /usr/src/app
CMD [ "npm", "start" ]
I SSHed into the VM, and the output of /var/log/docker.log just repeats these few messages (the first one a lot more frequently than the others):
time="2016-01-06T14:53:01.689552671Z" level=info msg="GET /version"
time="2016-01-06T14:53:04.881663725Z" level=info msg="GET /images/json"
time="2016-01-06T14:53:04.928656071Z" level=info msg="GET
/containers/json?all=1"
Original comment by rich...@10kcoffees.com
on 6 Jan 2016 at 4:18
Hmm, one idea: Docker stores credentials in a file called ".dockercfg". If you run "cat ~/.dockercfg", do you see any entries for "index.docker.io"?
Don't paste the contents of .dockercfg here; for registries other than GCR it can contain private credentials.
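If you'd rather not eyeball the raw JSON, a quick Python sketch like the following lists just the registry keys without printing any secrets. This is only an illustration, not part of any tool, and it assumes the legacy .dockercfg format, which is a flat JSON map of registry to credentials:
import json
import os

# Print only the registry names from ~/.dockercfg; never print the "auth" values.
path = os.path.expanduser('~/.dockercfg')
with open(path) as f:
    cfg = json.load(f)
for registry in sorted(cfg):
    print(registry)  # names only; never print cfg[registry]['auth']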
Original comment by dlor...@google.com
on 6 Jan 2016 at 5:17
I don't have an entry there for "index.docker.io", though all the other entries
have the email set to "not@val.id" (I'm guessing they're examples).
I'm not sure if this is the cause though, as I can deploy other apps without
hitting this issue.
Original comment by rich...@10kcoffees.com
on 6 Jan 2016 at 5:46
Same here. Below is the file with the auth sections redacted. I'm running on Windows, by the way, so I don't have a ~/.dockercfg. I snooped around and found it in C:\home\Michael\.dockercfg, which is a bit of a weird place to put it.
{
  "https://bucket.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 1>"
  },
  "https://appengine.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 1>"
  },
  "https://eu.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 1>"
  },
  "https://gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 2>"
  },
  "https://us.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 2>"
  },
  "https://asia.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 1>"
  },
  "https://b.gcr.io": {
    "email": "not@val.id",
    "auth": "<REDACTED 1>"
  }
}
Original comment by michael....@gmail.com
on 6 Jan 2016 at 6:40
I am facing a similar issue as well.
gcloud preview app deploy gets stuck at "Docker Build Output..." and then exits after about 10 minutes. The VM is created, but docker ps returns nothing on the VM. The output of gcloud preview app deploy and of /var/log/docker.log is pasted below.
/var/log/docker.log
time="2016-01-25T17:35:26Z" level=info msg="+job
serveapi(unix:///var/run/docker.sock, tcp://0.0.0.0:2376)"
time="2016-01-25T17:35:26Z" level=info msg="Listening for HTTP on unix
(/var/run/docker.sock)"
time="2016-01-25T17:35:26Z" level=info msg="Listening for HTTP on tcp
(0.0.0.0:2376)"
time="2016-01-25T17:35:26Z" level=info msg="+job init_networkdriver()"
time="2016-01-25T17:35:27Z" level=info msg="-job init_networkdriver() = OK (0)"
time="2016-01-25T17:35:27Z" level=warning msg="Your kernel does not support
cgroup swap limit."
time="2016-01-25T17:35:27Z" level=info msg="Loading containers: start."
time="2016-01-25T17:35:27Z" level=info msg="Loading containers: done."
time="2016-01-25T17:35:27Z" level=info msg="docker daemon: 1.6.2 7c8fca2;
execdriver: native-0.2; graphdriver: aufs
"
time="2016-01-25T17:35:27Z" level=info msg="+job acceptconnections()"
time="2016-01-25T17:35:27Z" level=info msg="-job acceptconnections() = OK (0)"
time="2016-01-25T17:35:27Z" level=info msg="Daemon has completed
initialization"
time="2016-01-25T17:35:47Z" level=info msg="GET /version"
time="2016-01-25T17:35:47Z" level=info msg="+job version()"
time="2016-01-25T17:35:47Z" level=info msg="-job version() = OK (0)"
time="2016-01-25T17:35:48Z" level=info msg="GET /v1.18/_ping"
time="2016-01-25T17:35:48Z" level=info msg="POST
/v1.18/build?pull=0&nocache=False&q=False&t=bigtable-1152.default.
20160125t230437&forcerm=False&rm=True"
time="2016-01-25T17:35:48Z" level=info msg="+job build()"
Output of gcloud preview app deploy --verbosity=debug
DEBUG: Couldn't find 'auths' section
DEBUG: Found entry (registry=u'https://bucket.gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://appengine.gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://eu.gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://us.gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://asia.gcr.io', username=u'_token')
DEBUG: Found entry (registry=u'https://b.gcr.io', username=u'_token')
INFO: Starting new HTTPS connection (1): 104.197.210.22
DEBUG: "GET /version HTTP/1.1" 200 151
DEBUG: "GET /v1.18/_ping HTTP/1.1" 200 2
Building and pushing image for module [default]
INFO: Generating Dockerfile.
INFO: Saving [Dockerfile] to [/Users/vinay/src/yawasa-server/appengine-backend/build/staged-app].
INFO: Building docker image bigtable-1152.default.20160125t231615 from /Users/vinay/src/yawasa-server/appengine-backend/build/staged-app/Dockerfile:
--------------------------------------------- DOCKER BUILD OUTPUT ---------------------------------------------
DEBUG: Looking for auth config
DEBUG: Sending auth config (u'https://bucket.gcr.io', u'https://appengine.gcr.io', u'https://eu.gcr.io', u'https://gcr.io', u'https://us.gcr.io', u'https://asia.gcr.io', u'https://b.gcr.io')
$
Original comment by chitlan...@gmail.com
on 25 Jan 2016 at 5:51
One update: when I reduce the number of jars in the WEB-INF/lib directory, the Docker build succeeds. Perhaps a timeout is being hit somewhere that does not write any logs, so there is nothing in the logs to help.
Original comment by chitlan...@gmail.com
on 26 Jan 2016 at 6:31
Hello, I'm not quite sure if this will help in your case, but I had the same issue with the deployment hanging at the "Sending auth config" line. The problem was that we had previous Docker configuration files inside the repo, including a .dockerignore file that had Dockerfile listed in it. Once I removed Dockerfile from .dockerignore, the deployment ran through successfully.
Original comment by ja...@prioridata.com
on 26 Jan 2016 at 9:42
Hey All,
I think I figured this out (at least on my system).
TLDR: delete large files in your project directory or add them to .dockerignore (even if they're already covered by .gitignore or skip_files).
On my system, it isn't actually hanging when it says "Sending auth config"; rather, it's sending all of the files to the Docker build process. If you take a look at your network connection, you'll notice that the process is using the network pretty heavily. The line it spends all of its time on is the self._post call in the file:
google-cloud-sdk\lib\third_party\docker\docker\api\build.py
To debug this issue, I instrumented the call to self._post so it would print out its sending progress. It's slow: it takes about 2 seconds to transfer one megabyte. Even though my application code is only about 150 MB, I had a bunch of data files (~1 GB) and some test files (~500 MB) lying around. These files were in my .gitignore and are listed in the skip_files section of app.yaml. However, when you create a managed VM instance, those settings are ignored; Docker sends everything. At that upload rate, it would take well over an hour to deploy, and at some point well before that (around 12 minutes in), the connection gets dropped.
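For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of instrumentation I mean. The ProgressFile name and the wiring are my own invention; the real change goes around the self._post call in build.py:
import time

class ProgressFile(object):
    """Wraps a file-like object and logs every megabyte as it is read."""
    def __init__(self, fileobj):
        self._fileobj = fileobj
        self._sent = 0
        self._start = time.time()

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        megabytes_before = self._sent // (1024 * 1024)
        self._sent += len(chunk)
        if self._sent // (1024 * 1024) > megabytes_before:
            elapsed = time.time() - self._start
            print('DEBUG: Sent %d megabytes. Time elapsed: %.3fs'
                  % (self._sent // (1024 * 1024), elapsed))
        return chunk

# Hypothetical wiring inside build.py: hand the wrapped tar stream to the
# existing call, e.g. self._post(url, data=ProgressFile(tar_stream), ...).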
The work-around is to edit your .dockerignore file to include all of the files you do not want uploaded (an example follows below). The better solution would be for gcloud to read the skip_files configuration in your app.yaml and automatically create a .dockerignore that adheres to what is listed there.
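For illustration, a .dockerignore along these lines keeps the build context small; the paths here are hypothetical, so substitute your own large directories:
.git
node_modules
data
test/fixtures
*.tar.gz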
With all that said, I do think there are some bugs here:
1 - The tar upload is excruciatingly slow, well under 1.5 MB/s. There's something wrong here; my network is faster than that.
2 - The tar files should be gzipped before being sent. It would save significant time.
3 - There seems to be a timeout on the GCE side that shuts down the connection early.
Below is the output of my instrumentation code (abridged), just so you can see the approximate speed of the upload.
---------- DOCKER BUILD OUTPUT --------------------------------------------
DEBUG: Looking for auth config
DEBUG: Sending auth config ('https://bucket.gcr.io', 'https://appengine.gcr.io', 'https://eu.gcr.io', 'https://gcr.io', 'https://us.gcr.io', 'https://asia.gcr.io', 'https://b.gcr.io')
DEBUG: Sent 1 megabytes. Time elapsed: 0:00:03.119000
DEBUG: Sent 2 megabytes. Time elapsed: 0:00:05.110000
DEBUG: Sent 3 megabytes. Time elapsed: 0:00:07.041000
DEBUG: Sent 4 megabytes. Time elapsed: 0:00:11.313000
DEBUG: Sent 5 megabytes. Time elapsed: 0:00:16.782000
DEBUG: Sent 6 megabytes. Time elapsed: 0:00:23.067000
DEBUG: Sent 7 megabytes. Time elapsed: 0:00:30.161000
DEBUG: Sent 8 megabytes. Time elapsed: 0:00:36.966000
DEBUG: Sent 9 megabytes. Time elapsed: 0:00:43.900000
DEBUG: Sent 10 megabytes. Time elapsed: 0:00:49.451000
DEBUG: Sent 11 megabytes. Time elapsed: 0:00:55.113000
DEBUG: Sent 12 megabytes. Time elapsed: 0:01:00.673000
DEBUG: Sent 13 megabytes. Time elapsed: 0:01:06.226000
DEBUG: Sent 14 megabytes. Time elapsed: 0:01:12.447000
DEBUG: Sent 15 megabytes. Time elapsed: 0:01:17.701000
DEBUG: Sent 16 megabytes. Time elapsed: 0:01:23.270000
DEBUG: Sent 17 megabytes. Time elapsed: 0:01:29.067000
DEBUG: Sent 18 megabytes. Time elapsed: 0:01:34.120000
DEBUG: Sent 19 megabytes. Time elapsed: 0:01:39.493000
DEBUG: Sent 20 megabytes. Time elapsed: 0:01:43.724000
DEBUG: Sent 21 megabytes. Time elapsed: 0:01:48.390000
DEBUG: Sent 22 megabytes. Time elapsed: 0:01:52.796000
DEBUG: Sent 23 megabytes. Time elapsed: 0:01:57.806000
DEBUG: Sent 24 megabytes. Time elapsed: 0:02:02.722000
DEBUG: Sent 25 megabytes. Time elapsed: 0:02:06.590000
DEBUG: Sent 26 megabytes. Time elapsed: 0:02:10.925000
DEBUG: Sent 27 megabytes. Time elapsed: 0:02:15.632000
DEBUG: Sent 28 megabytes. Time elapsed: 0:02:20.260000
DEBUG: Sent 29 megabytes. Time elapsed: 0:02:25.226000
DEBUG: Sent 30 megabytes. Time elapsed: 0:02:30.349000
DEBUG: Sent 31 megabytes. Time elapsed: 0:02:34.791000
DEBUG: Sent 32 megabytes. Time elapsed: 0:02:53.237000
DEBUG: Sent 33 megabytes. Time elapsed: 0:03:17.726000
DEBUG: Sent 34 megabytes. Time elapsed: 0:03:42.369000
DEBUG: Sent 35 megabytes. Time elapsed: 0:04:06.470000
DEBUG: Sent 36 megabytes. Time elapsed: 0:04:30.843000
DEBUG: Sent 37 megabytes. Time elapsed: 0:04:55.262000
DEBUG: Sent 38 megabytes. Time elapsed: 0:05:19.619000
DEBUG: Sent 39 megabytes. Time elapsed: 0:05:44.124000
DEBUG: Sent 40 megabytes. Time elapsed: 0:06:08.610000
DEBUG: Sent 41 megabytes. Time elapsed: 0:06:33.298000
DEBUG: Sent 42 megabytes. Time elapsed: 0:06:58.064000
Original comment by michael....@gmail.com
on 9 Feb 2016 at 9:47
FYI, I submitted the following pull request to Docker, which will speed up transferring the tarball:
https://github.com/docker/docker-py/pull/930
Original comment by michael....@gmail.com
on 9 Feb 2016 at 10:26
I am having the same issue with a Node.js app. I was able to deploy without any problem when the size of all files was about 1 MB. Now I have added assets and reached 166 MB. The deployment fails 100% of the time. I have the log files and can send them for investigation.
Original comment by pie...@kywix.com
on 16 Feb 2016 at 4:09
Thanks Michael.
I can confirm that the deploy worked fine once I updated .dockerignore to exclude directories with large files, e.g. env, venv, etc.
Original comment by anuj.lut...@redbubble.com
on 12 Mar 2016 at 3:50
So, is this issue resolved? Can anyone comment on whether they are still seeing
this issue?
Original comment by pay...@google.com
on 4 Apr 2016 at 8:16
Not really resolved; we just found a work-around (i.e., make sure your .dockerignore file ignores all unnecessary items; see above for more details).
I do think there are some bugs here:
1 - The tar upload is excruciatingly slow, well under 1.5 MB/s. There's something wrong here; my network is faster than that.
2 - The tar files should be gzipped before being sent. It would save significant time.
3 - There seems to be a timeout on the GCE side that shuts down the connection early.
4 - There should be some logging output on the status of the upload; as it stands, it appears to hang.
5 - The Managed VM documentation should indicate that .dockerignore must be used to exclude files from the upload. The skip_files list in the App Engine app.yaml is not used.
Original comment by michael....@gmail.com
on 5 Apr 2016 at 1:32
Thanks for clarifying all of these related issues, @michael.sander. Do you think you could produce a reproducible example app, stripped of any sensitive code from your own organization, that can be used to observe this behaviour? Being able to put that under a microscope and look into all the related issues you've brought up would really help ensure that each of them is fairly and adequately investigated, and resolved if things are as you've surmised.
As a general note, thanks for your reporting in this thread; it's an excellent example of thorough issue reporting.
Original comment by pay...@google.com
on 5 Apr 2016 at 8:49
This is what you would need to do to reproduce it:
- Create a sample managed VM app (any app will do).
- Create many thousands of files filled with random data in the same directory. These simulate temp files, or files that you usually would not want on your managed VM app. (A sketch of this step follows below.)
- Delete any .dockerignore file, if one exists.
- Deploy.
That should reproduce the issues discussed above.
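Here is a quick Python sketch of the second step; the directory and file names are arbitrary, and roughly 5,000 files of ~200 KB each gives about 1 GB of build context:
import os

# Create ~1 GB of random junk files to bloat the Docker build context.
if not os.path.isdir('junk'):
    os.mkdir('junk')
for i in range(5000):
    with open(os.path.join('junk', 'data_%04d.bin' % i), 'wb') as f:
        f.write(os.urandom(200 * 1024))  # ~200 KB of random data per file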
Original comment by michael....@gmail.com
on 6 Apr 2016 at 12:55
Thanks @michael.sander for the concise and clear points/instructions.
Unfortunately, we cannot guarantee network bandwidth for remote builds. Though not favorable, it's not entirely clear that 1.5 MB/s is unusually or consistently slow. In testing, I could not reliably get the speed to remain that low. I have had some luck, when encountering a slow upload, with canceling and redeploying.
The remaining points, however, are interesting directions for improving the platform. For optimal treatment and traction, I would suggest a few of these be filed as their own issues[1]:
- Tar files could indeed be gzipped before being sent, to save bandwidth and time when deploying an application in the Flex environment.
- Specify in the Flexible Environment documentation[2] that skip_files is ignored and .dockerignore should be used, or add a feature that creates a .dockerignore file from the skip_files element of app.yaml (a rough sketch of that conversion follows below).
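On the second point, here is a rough sketch of what such a feature might do. This is a hypothetical helper, not part of the SDK; note that skip_files entries are Python regexes while .dockerignore expects glob-style patterns, so this naive copy is only correct for entries that happen to be valid in both forms:
import yaml  # assumes PyYAML is available

# Copy skip_files patterns from app.yaml into a .dockerignore file.
with open('app.yaml') as f:
    config = yaml.safe_load(f)

with open('.dockerignore', 'w') as out:
    for pattern in config.get('skip_files', []):
        out.write(str(pattern) + '\n')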
Given the original subject of this thread, I would recommend that the logging
of object uploads in debug mode for remote builds be the feature request of
this thread.
[1]: https://code.google.com/p/google-cloud-sdk/issues/entry?template=Defect%20report%20from%20developer
[2]: https://cloud.google.com/appengine/docs/flexible/
Original comment by ngeli...@google.com
on 8 Apr 2016 at 7:49
The change to gzip the tarball is straightforward. I already submitted a pull request to Docker to allow the tarball to be gzipped:
https://github.com/docker/docker-py/pull/930
The request was merged, but the maintainers did not turn gzipping on by default; it's a parameter that defaults to off. Accordingly, you'll need to make a change yourself, but it's only a single line of code.
In the following file:
Cloud SDK\google-cloud-sdk\lib\googlecloudsdk\api_lib\app\docker_image.py
Around line 108, you'll need to change this:
build_res = docker_client.build(
    path=self._dockerfile_dir, tag=self.tag, quiet=False, fileobj=None,
    nocache=self._nocache, rm=self._rm, pull=False)
To this:
build_res = docker_client.build(
    path=self._dockerfile_dir, tag=self.tag, quiet=False, fileobj=None,
    nocache=self._nocache, rm=self._rm, pull=False, gzip=True)
Original comment by michael....@docketalarm.com
on 9 Apr 2016 at 6:31
Thanks Michael. We'll look into making this change.
Quick question: have you tried with `gcloud config set use_cloud_build true`?
Does this setting work for you?
Original comment by z...@google.com
on 11 Apr 2016 at 6:16
Yes, that is what I've been using the entire time.
gcloud config list
Your active configuration is: [default]
[app]
promote_by_default = False
suppress_change_warning = true
use_cloud_build = True
[compute]
region = us-central1
zone = us-central1-f
[core]
[REDACTED]
Original comment by michael....@docketalarm.com
on 12 Apr 2016 at 7:28
Original issue reported on code.google.com by
michael....@gmail.com
on 4 Jan 2016 at 8:57