cloudfoundry / diego-release

BOSH Release for Diego
Apache License 2.0

StagingError occurs during pushing APP to cf with diego enabled #150

Closed StanleyShen closed 8 years ago

StanleyShen commented 8 years ago

When pushing an app to the latest cf instance with Diego enabled, I see a StagingError:

:hello stanleyshen$ cf enable-diego stanley
Setting stanley Deigo support to true
Ok

Verifying stanley Deigo support is set to true
Ok

:hello stanleyshen$ cf start stanley
Starting app stanley in org test / space test as admin...
Downloading staticfile_buildpack...
Downloading python_buildpack...
Downloading binary_buildpack...
Downloading php_buildpack...
Downloading go_buildpack...
Downloading java_buildpack...
Downloading ruby_buildpack...
Downloading nodejs_buildpack...
Downloading admin_console_buildpack...
Downloaded java_buildpack (160.7K)
Downloaded binary_buildpack (8.3K)
Downloaded staticfile_buildpack (2.5M)
Downloaded nodejs_buildpack (50.1M)
Downloaded python_buildpack (254M)

FAILED
StagingError

TIP: use 'cf logs stanley --recent' for more information

The environment is:

:hello stanleyshen$ bosh stemcells
Acting as user 'admin' on 'Bosh Lite Director'

+---------------------------------------------+---------------+---------+--------------------------------------+
| Name                                        | OS            | Version | CID                                  |
+---------------------------------------------+---------------+---------+--------------------------------------+
| bosh-warden-boshlite-ubuntu-trusty-go_agent | ubuntu-trusty | 3147*   | d0b3e660-03e4-440d-48ee-afd1378a29ee |
+---------------------------------------------+---------------+---------+--------------------------------------+

(*) Currently in-use

Stemcells total: 1
:hello stanleyshen$ bosh releases
Acting as user 'admin' on 'Bosh Lite Director'

+-------------------+-----------------+-------------+
| Name              | Versions        | Commit Hash |
+-------------------+-----------------+-------------+
| cf                | 233+dev.1*      | 4ef5b279+   |
| diego             | 0.1460.0+dev.1* | 972164ef    |
| etcd              | 38*             | af52789b+   |
| garden-linux      | 0.334.0*        | 88f7ec39    |
+-------------------+-----------------+-------------+

The app I pushed is a very simple Node.js application. It looks like the buildpack downloads are unreliable. If I start the app again, it sometimes works.

cf-gitbot commented 8 years ago

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: https://www.pivotaltracker.com/story/show/116347719.

StanleyShen commented 8 years ago

:hello stanleyshen$ cf start stanley
Starting app stanley in org test / space test as admin...
Downloading ruby_buildpack...
Downloading go_buildpack...
Downloading staticfile_buildpack...
Downloading java_buildpack...
Downloading nodejs_buildpack...
Downloading php_buildpack...
Downloading python_buildpack...
Downloading binary_buildpack...
Downloaded staticfile_buildpack (2.5M)

FAILED
StagingError

TIP: use 'cf logs stanley --recent' for more information
:hello stanleyshen$ cf start stanley
Starting app stanley in org test / space test as admin...
Downloading staticfile_buildpack...
Downloading go_buildpack...
Downloading java_buildpack...
Downloading nodejs_buildpack...
Downloading php_buildpack...
Downloading binary_buildpack...
Downloading ruby_buildpack...
Downloading python_buildpack...
Downloaded binary_buildpack (8.3K)
Downloaded nodejs_buildpack (50.1M)

FAILED
StagingError

TIP: use 'cf logs stanley --recent' for more information
:hello stanleyshen$ cf start stanley
Starting app stanley in org test / space test as admin...
Downloading nodejs_buildpack...
Downloading python_buildpack...
Downloading go_buildpack...
Downloading php_buildpack...
Downloading java_buildpack...
Downloading binary_buildpack...
Downloading ruby_buildpack...
Downloading staticfile_buildpack...
Downloading python_buildpack failed
Downloading staticfile_buildpack failed
Downloading java_buildpack failed
Downloading php_buildpack failed
Downloaded ruby_buildpack (244M)
Downloaded binary_buildpack (8.3K)
Downloaded java_buildpack (160.7K)
Downloaded nodejs_buildpack (50.1M)
Downloaded go_buildpack (366.2M)

FAILED
StagingError

TIP: use 'cf logs stanley --recent' for more information
:hello stanleyshen$ cf start stanley
Starting app stanley in org test / space test as admin...
Downloading nodejs_buildpack...
Downloading java_buildpack...
Downloading staticfile_buildpack...
Downloading ruby_buildpack...
Downloading php_buildpack...
Downloading python_buildpack...
Downloading binary_buildpack...
Downloading go_buildpack...
Downloaded binary_buildpack
Downloaded go_buildpack
Downloaded python_buildpack
Downloaded php_buildpack
Downloaded staticfile_buildpack
Downloaded ruby_buildpack
Downloaded nodejs_buildpack
Downloaded java_buildpack (160.7K)
Creating container
Successfully created container
Downloading app package...
Downloaded app package (53.6K)
Staging...

I tried a non-BOSH-Lite environment as well and ran into the same issue. I do not see it when Diego is not deployed. I think it happens on a clean Cell: once all the buildpacks have been downloaded, deleting the app and pushing it again no longer reproduces the issue.
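
Since each failed attempt above appears to leave a few more buildpacks in the Cell's cache, one possible stopgap is simply to retry `cf start` until it succeeds. The following is a minimal sketch of that workaround, not a fix. It assumes the cf CLI exits non-zero when it prints FAILED; the helper names `run_cf_start` and `start_with_retries` and the default of 5 attempts are hypothetical.

```python
import subprocess
from typing import Callable


def run_cf_start(app_name: str) -> bool:
    """Run `cf start <app>` once; True if the CLI exited with status 0.

    Assumption: the cf CLI returns a non-zero status on StagingError.
    """
    result = subprocess.run(["cf", "start", app_name])
    return result.returncode == 0


def start_with_retries(
    app_name: str,
    attempts: int = 5,  # hypothetical default, tune for your environment
    start: Callable[[str], bool] = run_cf_start,
) -> int:
    """Retry starting the app; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        if start(app_name):
            return attempt
    raise RuntimeError(f"{app_name} did not start after {attempts} attempts")
```

Calling `start_with_retries("stanley")` would shell out to the real CLI; the `start` parameter exists only so the loop can be exercised without a Cloud Foundry deployment.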

emalm commented 8 years ago

Thanks for the report, @StanleyShen. We do see this issue intermittently on our own BOSH-Lite deployments. It arises primarily because the cell rep downloads buildpacks into its download cache only on demand, when requested by a buildpack-app staging task. When an auto-detect staging task arrives at a newly created or updated cell, it therefore causes the cell to download all the buildpacks at once. This can be slow, but we're looking into why it sometimes fails even before the staging timeout is reached. We'll be investigating these failures end-to-end in https://www.pivotaltracker.com/story/show/116232733, but we suspect it may be a combination of a relatively short 5-second idle timeout on the downloader and lack of responsiveness from Cloud Controller or the backing blobstore on constrained infrastructures such as BOSH-Lite. Since we already have this investigation scheduled, I'll close this issue for now.

Best, Eric
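
To illustrate how an idle timeout can fail a download long before an overall deadline is reached, here is a small simulation. It is a sketch of the general idea only, not Diego's downloader: the 5-second idle timeout comes from the comment above, while the `total_timeout` value, the `IdleTimeout` exception, and the function name are assumptions made for the example. Time is simulated through per-chunk gaps, so nothing actually sleeps or touches the network.

```python
from typing import Iterable, Tuple


class IdleTimeout(Exception):
    """Raised when no data arrives within the idle window (hypothetical)."""


def download_with_idle_timeout(
    chunks: Iterable[Tuple[float, bytes]],
    idle_timeout: float = 5.0,     # value mentioned in the thread
    total_timeout: float = 900.0,  # placeholder for an overall deadline
) -> Tuple[bytes, float]:
    """Consume (gap_seconds, data) pairs and return (payload, elapsed).

    gap_seconds is the simulated wait before each chunk arrives.
    """
    elapsed = 0.0
    received = bytearray()
    for gap, data in chunks:
        if gap > idle_timeout:
            # The connection was silent for too long: give up immediately,
            # no matter how much of the overall budget is left.
            raise IdleTimeout(
                f"stalled after {elapsed + idle_timeout:.1f}s of "
                f"{total_timeout:.1f}s allowed"
            )
        elapsed += gap
        if elapsed > total_timeout:
            raise TimeoutError(f"overall deadline of {total_timeout}s exceeded")
        received.extend(data)
    return bytes(received), elapsed
```

A stream that is slow but steady, for example ten chunks each arriving after 4 seconds, completes under these settings. A stream with a single 6-second stall raises `IdleTimeout` within seconds, which is consistent with staging failing well before the staging timeout when the blobstore or Cloud Controller responds sluggishly.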