cloudfoundry / cli

The official command line client for Cloud Foundry
https://docs.cloudfoundry.org/cf-cli
Apache License 2.0

Tolerate CF API errors while polling for apps to start #2160

Open tcdowney opened 3 years ago

tcdowney commented 3 years ago

What's the user value of this feature request? Currently, when restarting an app with cf restart, the CLI polls the CF API repeatedly for process stats. If any one of those requests fails, the command exits.

I believe that instead of exiting immediately the CLI should (potentially) log a warning and continue to poll up until the regular polling timeout is reached. It's likely that the failure was due to a transient issue (problem with a particular API, problem with metrics, etc.) that is unrelated to the app starting and that a subsequent request will be healthy.

Who is the functionality for? This will improve the developer experience for app developers and app operators.

How often will this functionality be used by the user? This will result in less noise from falsely reported failed pushes/restarts, especially for users who script around the CLI and can't easily rerun the command.

Who else is affected by the change? This will be a breaking change for CLI users who expect the polling for app start to exit immediately on the first API error. I don't expect this to affect many people, since the command is already in a polling loop and they're probably more interested in the state of their apps.

Describe the solution you'd like I would like the CLI to tolerate API errors from Cloud Controller while polling for app start. Instead of passing the error up and failing immediately, it would be nice if it either continued to poll until the polling timeout or allowed some number of errors to happen before failing.

The places I'm aware of this happening are:

Describe alternatives you've considered The requests could be retried immediately, but given we're already polling I think that's unnecessary and could cause more harm than good.

Additional context This affects us frequently in CI. In our case (cf-for-k8s), app metrics are not available immediately while Pod containers are terminating or being created, so the call to /v3/processes/:guid/stats may fail initially. This manifests as errors from the CLI where it just prints FAILED, and we see test failures like this:

• Failure [140.125 seconds]
revisions stopping and starting when environment variables have changed on the app [It] creates a new revision 
/tmp/build/6583dfc4/capi-bara-tests/baras/revisions.go:88

  Unexpected error:
      <*json.SyntaxError | 0xc000b25cc0>: {
          msg: "invalid character 'F' looking for beginning of value",
          Offset: 1,
      }
      invalid character 'F' looking for beginning of value
  occurred

  /tmp/build/6583dfc4/capi-bara-tests/helpers/v3_helpers/deployment.go:103

cf-gitbot commented 3 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/177874838

The labels on this github issue will be updated when the story is started.

gururajsh commented 1 week ago

Hi @tcdowney. Is this still an issue with CLI v8?