Azure / azure-cli

Azure Command-Line Interface
MIT License
3.99k stars 2.97k forks source link

Writing to `stderr` causes failure on pipeline tasks with `failOnStderr` set to `true` #18372

Closed rajarshi-singh closed 1 year ago

rajarshi-singh commented 3 years ago

Describe the bug We have been noticing the following warning after upgrading from Az Cli 2.0.73 to 2.24 in our pipeline operation for purging AFD cache. It is causing the pipeline to fail. The operation depends on an extension of Az Cli called "front-door", although I have tried reverting the extension to original version (1.0.2) and the warning still persists.

Can you please help us identify if this is something related to the pipeline agent itself or Az Cli? It looks like doing some telemetry check. Also, I don’t see this kind of check when I test the operation locally on my devbox with --debug flag.

image

To Reproduce Run

az network front-door purge-endpoint --resource-group <RG-NAME> --name <NAME> --content-paths /* --debug

Expected behavior No warnings.

Environment summary Az Cli version: 2.24 front-door extension version: 1.0.14, 1.0.2 Pipeline agent pools are running Windows_NT

Additional context Link to error message within a pipeline failure: Link

ghost commented 3 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @cdnfdsuppgithub.

Issue Details
**Describe the bug** We have been noticing the following warning after upgrading from Az Cli 2.0.73 to 2.24 in our pipeline operation for purging AFD cache. It is causing the pipeline to fail. The operation depends on an extension of Az Cli called "front-door", although I have tried reverting the extension to original version (1.0.2) and the warning still persists. Can you please help us identify if this is something related to the pipeline agent itself or Az Cli? It looks like doing some telemetry check. Also, I don’t see this kind of check when I test the operation locally on my devbox with `--debug` flag. ![image](https://user-images.githubusercontent.com/40151439/120940509-7d25e780-c6d2-11eb-8359-bcf3b65b16ab.png) **To Reproduce** Run ``` az network front-door purge-endpoint --resource-group --name --content-paths /* --debug ``` **Expected behavior** No warnings. **Environment summary** Az Cli version: 2.24 front-door extension version: 1.0.14, 1.0.2 Pipeline agent pools are running Windows_NT **Additional context** Link to error message within a pipeline failure: [Link](https://microsoft.visualstudio.com/Universal%20Store/_build/results?buildId=35256069&view=logs&j=f10c73a8-624c-5e0a-9c96-493b90bf6928&t=778955af-38b5-5e60-e993-042d67602447&l=194)
Author: rajarshi-singh
Assignees: -
Labels: `Network - Front Door`, `Service Attention`
Milestone: -
yonzhan commented 3 years ago

route to service team

t-bzhan commented 3 years ago

@rajarshi-singh , the pipeline link you shared is not available. Also the warning seems related to azure CLI telemetry upload and not related to Azure Front Door extension: https://github.com/Azure/azure-cli/blob/dev/src/azure-cli-telemetry/azure/cli/telemetry/util.py#L35-L37 image

I am not sure whether this should be treated as a warning or not, @yonzhan could provide more details for that.

Anyway, if the warning is really an issue, you might consider to disable the telemetry collecting with:az configure -d collect_telemetry=false according to azure-cli-configuration

jsntcy commented 3 years ago

@rajarshi-singh We will discuss if it should be treated as a warning or not internally. As a workaround, you can disable the telemetry using az configure as @t-bzhan suggested.

cc @zhoxing-ms

wasker commented 3 years ago

@jsntcy The problem with az configure is that it's awkward in shared environments like Azure DevOps. Since telemetry is important only to Microsoft, why make it a warning for regular users?

zhoxing-ms commented 3 years ago

@wasker You can use the command az config set core.collect_telemetry=False to temporarily disable the Telemetry. It's warning is to tell users that the upload of Telemetry data is not real-time, and it is temporarily delayed due to insufficient interval. We will discuss if it should be treated as a warning or not internally.

zhoxing-ms commented 3 years ago

@rajarshi-singh I'd like to ask why the warning log causes the pipeline to fail? Since we cannot access the pipeline link, could you provide the specific error information of failed pipeline ? You can send it to me by email, my email address is Zhou.Xing@microsoft.com, thanks

rajarshi-singh commented 3 years ago

hello, the pipeline failed because of the default setting which causes tasks to fail on warnings or errors written to console. This was not a special configuration on our end. This should not be logged as a warning message. I wish I could provide you the pipeline link but it likely got expunged due to pipeline retention policies.

zhoxing-ms commented 3 years ago

@rajarshi-singh I'd like to ask if you set failOnStderr: true for the task of pipeline? If so and it's not necessary, you can set failOnStderr to false first

jiasli commented 3 years ago

I am the owner of Azure CLI's logging mechanism.

This WARNING

WARNING: telemetry.check: Negative: ...

is printed to stderr only if --debug is specified because this message belongs to telemetry.check logger.

https://github.com/Azure/azure-cli/blob/be601b7cfe542358f954e180b61b02c98fcfb27b/src/azure-cli-telemetry/azure/cli/telemetry/util.py#L22

telemetry.check logger is not printed in default mode (no --debug/--verbose). Here is the code responsible for the behavior:

https://github.com/microsoft/knack/blob/7dd953ae3f408e5923db212e0476893247f815ce/knack/log.py#L204-L207

            # (default)
            {
                'cli': logging.WARNING,
                'root': logging.CRITICAL,
            },
            # --verbose
            {
                'cli': logging.INFO,
                'root': logging.CRITICAL,
            },
            # --debug
            {
                'cli': logging.DEBUG,
                'root': logging.DEBUG,
            }]

In default mode, only cli.* loggers' WARNING messages are printed to stderr.

Because the command specifies the --debug flag:

az network front-door purge-endpoint ... --debug

with failOnStderr: true, the pipeline will fail regardless of this WARNING message. In other words, with failOnStderr: true, the pipeline fails on all DEBUG, INFO, WARNING, ERROR messages, not only WARNING or ERROR.

So the correct solution is to either

However, there are other commands that prints WARNINGs to stderr. This is the expected behavior. For more info, see https://github.com/Azure/azure-cli/issues/19314#issuecomment-917784104.

rajarshi-singh commented 3 years ago

@jiasli Thanks for clarifying and digging into the issue. I am not sure if it is necessarily true that the pipeline will fail for all types of messages (debug, info, warning) when failOnStderr: true . However there is a chance that I might have the --debug flag on during that pipeline run, it has been a while since I created this issue so I don't remember for sure.

Aside, we don't want to set failOnStderr: false because we'd ideally want to catch any other important issues. This flag is set for a lot of our other pipeline tasks and working ok.

jiasli commented 3 years ago

failOnStderr: true means as long as there is something printed to stderr, the pipeline fails:

https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/bash?view=azure-devops#arguments

failOnStderr Fail on standard error (Optional) If this is true, this task will fail if any errors are written to stderr. Default value: false

It doesn't distinguish whether the log message is DEBUG, INFO, WARNING, ERROR. These are Azure CLI's own classification.

If you want to use --debug which prints all log messages to stderr, failOnStderr must be set to false. In other words, --debug and failOnStderr are mutually exclusive.

The correct way to detect if a CLI command fails is to check the exit code. In Bash, you can utilize set -e which automatically fails the script when a CLI command fails:

https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html

-e Exit immediately if a pipeline (see Pipelines), which may consist of a single simple command (see Simple Commands), a list (see Lists), or a compound command (see Compound Commands) returns a non-zero status.

rajarshi-singh commented 3 years ago

Thanks for additional info, good to know about stderr behavior with regards to all types of CLI messages. We do check for the cli command exit code as well.

jiasli commented 1 year ago

(Copied from https://github.com/Azure/azure-cli/issues/19314#issuecomment-917784104)

According to https://www.gnu.org/software/libc/manual/html_node/Standard-Streams.html

Variable: FILE * stderr The standard error stream, which is used for error messages and diagnostics issued by the program.

Therefore, stderr is the right place for Azure CLI to write WARNING messages.

If the pipeline has failOnStderr set to true and the WARNING message is causing failure, you may either

  1. In the pipeline, set failOnStderr to false.
  2. Disable WARNING messages in Azure CLI commands using the --only-show-errors flag or only_show_errors config option: https://learn.microsoft.com/en-us/cli/azure/azure-cli-configuration#cli-configuration-values-and-environment-variables
DennyWebb commented 1 year ago

Warning type output for warnings, rather than error type, seems so much more logical here... while it could be concieve as "diagnostics"... the use of diagnostics in that context of the documentation is meant to be diagnostics issued by the program in regards to an error... not general diagnostics.

jiasli commented 1 year ago

There is no equivalent of set -e from Bash in PowerShell <7.3 for native commands (az.cmd), so $ErrorActionPreference = "Stop" doesn't work even if az fails with non-zero exit code. I think that's why users frequently turn on failOnStderr as a workaround, even though that workaround is not valid at all.

The correct solutions are

See