tektoncd / triggers


Interceptor Timeout #1452

**Open** — dibyom opened 1 year ago

dibyom commented 1 year ago

Discussed in https://github.com/tektoncd/triggers/discussions/1451

Originally posted by **joshua-blickensdoerfer**, September 29, 2022:

Hello, I have written a custom ClusterInterceptor. In the EventListener log I can see that there is a timeout after a few seconds:

```
"logger":"eventlistener","caller":"sink/sink.go:381","msg":"Post \"http://custom-svc.tekton-framework.svc:80\": net/http: timeout awaiting response headers"
```

Is there any way to increase the timeout duration for custom interceptors? I've tried increasing the values of `-el-readtimeout` and `-el-httpclient-readtimeout`, but this did not seem to have any effect.

Kind regards,
Joshua
jmcshane commented 1 year ago

Right now, the way this could be done on the EventListener would be to add custom args for overriding the defaults of the HTTP client. We could instead build an HTTP client for each cluster interceptor. That would also simplify the construction of the default EL HTTP client: today we have to assemble the full TLS config for all the interceptors at startup in https://github.com/tektoncd/triggers/blob/v0.21.0/pkg/adapter/adapter.go#L124 and keep a watch on it to continually update it.

I could see a ClusterInterceptor spec like:

```yaml
kind: ClusterInterceptor
...
spec:
  timeouts:
    tlshandshake:
    responseheader:
    expectcontinuetimeout:
    readtimeout:
    keepalive:
```

Obviously, these could all be optional values so we can distinguish between unset and explicitly set to 0, but I'm thinking about the "default" behavior here. Would nil mean "default to the current EventListener value", with 0 meaning "no timeout"? Are we concerned about the penalty of rebuilding the interceptor HTTP client on every interceptor call?

dibyom commented 1 year ago

> Would nil mean "default to the current eventlistener value" vs 0 meaning "no timeout"?

Yeah, I think that makes sense.

> Are we concerned about the penalty for rebuilding the interceptor httpclient on every interceptor call?

I think so 😬 Do we need to build the interceptor client on each call? Can we rebuild it periodically, or on demand when an interceptor changes?

(This doesn't help with timeouts, but for certs at least we could provide tls.Config's GetCertificate callback, similar to what knative/pkg's webhook implementation does.)

jmcshane commented 1 year ago

> I think so 😬 do we need to build the interceptor on each call? can we do it periodically or when needed if an interceptor changes?

Yeah, that was my presumption as well. Let me take a look at how the interceptor watch works and see if we can keep these clients somewhere reasonable and just update them on watch events.

tekton-robot commented 1 year ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale` with a justification. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with `/close` with a justification. If this issue should be exempted, mark the issue as frozen with `/lifecycle frozen` with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with `/remove-lifecycle rotten` with a justification. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with `/close` with a justification. If this issue should be exempted, mark the issue as frozen with `/lifecycle frozen` with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

Rotten issues close after 30d of inactivity. Reopen the issue with `/reopen` with a justification. Mark the issue as fresh with `/remove-lifecycle rotten` with a justification. If this issue should be exempted, mark the issue as frozen with `/lifecycle frozen` with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot commented 1 year ago

@tekton-robot: Closing this issue.

In response to [this](https://github.com/tektoncd/triggers/issues/1452#issuecomment-1452599203):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen` with a justification.
> Mark the issue as fresh with `/remove-lifecycle rotten` with a justification.
> If this issue should be exempted, mark the issue as frozen with `/lifecycle frozen` with a justification.
>
> /close
>
> Send feedback to [tektoncd/plumbing](https://github.com/tektoncd/plumbing).

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
khrm commented 1 year ago

/remove-lifecycle rotten

/lifecycle-frozen

We will handle this in future releases.

khrm commented 1 year ago

/reopen

/lifecycle frozen

tekton-robot commented 1 year ago

@khrm: Reopened this issue.

In response to [this](https://github.com/tektoncd/triggers/issues/1452#issuecomment-1452790982):

> /reopen
> /lifecycle frozen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.