deis / controller

Deis Workflow Controller (API)
https://deis.com
MIT License
41 stars 53 forks source link

Adding exception field on release api #1325

Closed Cryptophobia closed 6 years ago

Cryptophobia commented 6 years ago

We noticed that the api was giving great exceptions when you make a request that fails, but we would like to store those detailed exceptions up to the release api object for version of the release on which they occurred. Similar to how failed=True is available on the release api object, there should be a field called exception containing the exception msg from the deis-controller when a release or build fails.

Let's say we make a /scale post request to POST http://deis.192.168.99.100.nip.io/v2/apps/testing-application/scale with the following payload:

{"cmd": 10}

With this PR, when a release fails this is the example of how the release api response will look:

GET http://deis.192.168.99.100.nip.io/v2/apps/testing-application/releases

{
    "count": 37,
    "next": null,
    "previous": null,
    "results": [
        {
            "uuid": "2f104d63-a023-466d-9023-fb77aad75556",
            "app": "testing-application",
            "owner": "admin",
            "created": "2017-08-28T15:28:01Z",
            "updated": "2017-08-28T15:32:07Z",
            "version": 37,
            "summary": "admin deployed a config that failed",
            "failed": true,
            "exception": "error: (app::deploy): (app::deploy): There was a problem while deploying v36 of testing-application-cmd. Additional information:\nNo nodes are available that match all of the following predicates:: Insufficient cpu (1).",
            "config": "c0d45750-1c5d-4c39-a72e-ae3cf9990d6c",
            "build": "3dca0365-e51b-4d8c-988e-2487cc9a179a"
        },
        {
            "uuid": "a265ac6d-980b-457f-af66-e4d081cd9637",
            "app": "testing-application",
            "owner": "admin",
            "created": "2017-08-28T15:23:57Z",
            "updated": "2017-08-28T15:23:57Z",
            "version": 36,
            "summary": "admin changed lifecycle_post_start cmd",
            "failed": false,
            "exception": null,
            "config": "47561aca-8017-4952-aa6b-5fa0a415a398",
            "build": "3dca0365-e51b-4d8c-988e-2487cc9a179a"
        }
...

With this small change we get a detailed message in the api telling us why a certain release failed that is more detailed than the summary message. As a benefit it is stored on the release API object rather than just given as a response to our API calls:

"exception": "error: (app::deploy): (app::deploy): There was a problem while deploying v36 of testing-application-cmd. Additional information:\nNo nodes are available that match all of the following predicates:: Insufficient cpu (1)."
deis-admin commented 6 years ago

Thanks for the contribution! Please ensure your commits follow our style guide. This code will be tested once a Deis maintainer reviews it.

mboersma commented 6 years ago

Jenkins, add to whitelist.

Cryptophobia commented 6 years ago

@mboersma : Fixed the tests. Now I am getting this from Jenkins:

ApplyLayer exit status 1 stdout: stderr: write /usr/lib/python3.5/config-3.5m-x86_64-linux-gnu/libpython3.5m-pic.a: no space left on device

mboersma commented 6 years ago

This did finally pass--I worked around the error by adding temporary commands to clean the docker graph up.

This is a pretty ambitious change and honestly we haven't had enough time to review it properly. I'd like to leave it out of the v2.18.0 release but keep the PR open if that's amenable to you @Cryptophobia, so maybe this can be merged into a new community fork of controller when that gets rolling. Let me know what you think.

Cryptophobia commented 6 years ago

@mboersma : sounds good. we can add it to the fork but we would still appreciate a review.

Cryptophobia commented 6 years ago

This PR has been recreated for the deisthree fork here: https://github.com/deisthree/controller/pull/1

Should we keep this open or close it to track the progress downstream?

bacongobbler commented 6 years ago

We should close this since we're not intending on merging this.