argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
14.65k stars 3.14k forks source link

trivial - reduce log spamming about artifact-repositories #9085

Open tooptoop4 opened 2 years ago

tooptoop4 commented 2 years ago

I see this message printed in my argo-workflows-workflow-controller pod logs about 15 times every hour: time="2022-06-30T11:03:40.245Z" level=warning msg="Non-transient error: configmaps \"artifact-repositories\" not found"

perhaps controller can change to only look for the configmap on startup of the pod?

argo version is 3.3.8

https://github.com/argoproj/argo-workflows/blob/v3.4.7/workflow/artifactrepositories/artifactrepositories.go#L89

sarabala1979 commented 2 years ago

@dpadhiar can you take a look?

alexec commented 2 years ago

@sarabala1979 I think this is low impact and therefore low priority. I don't think it is a bug either, tech debt.

dpadhiar commented 2 years ago

@sarabala1979 @alexec Going to mark as tech debt, Bala wanted to just quickly make this log a debug statement instead of a warning.

terrytangyuan commented 2 years ago

For other transient errors (not related to artifact-repositories configmap), this info is useful for users to know what is being retried so I am not sure if we want to make it debug level.

dpadhiar commented 2 years ago

For other transient errors (not related to artifact-repositories configmap), this info is useful for users to know what is being retried so I am not sure if we want to make it debug level.

In that case, we can leave as is and come to another solution if necessary.

tooptoop4 commented 1 year ago

not sure why but in v3.4.8 this line is being printed every few seconds!

thor commented 1 month ago

Could it make sense that the IsTransient function does not log non-transient errors? Given that transient errors yield retries, it seems reasonable to try again. However, a non-transient error doesn't warrant retrying. If so, it could let the caller propagate the error properly further up in the stack by silently proceeding.

Either way, I believe it makes sense to keep the logging for the transient errors, @terrytangyuan. It seems that this issue is specifically for the non-transient issue of "damn, there's no config map! oh well, default it is!".

Would a PR making transient errors silent be welcome, @alexec?

Alternatively, there could be a separate IsTransient that's silent.

terrytangyuan commented 1 month ago

This is the only case so I am fine with even hard-coding it to silence this specific error