google / knative-gcp

GCP event implementations to use with Knative Eventing.
https://github.com/knative/eventing
Apache License 2.0
160 stars 74 forks source link

[BUG] KNative serving Activator throttler failures related to GCP Pubsub backed Broker #415

Closed rtmvc closed 4 years ago

rtmvc commented 4 years ago

Problem: I have a simple setup with a container source connected to a GCP pullsub backed broker itself forwarding message to a simple sequence composed of a single service.

Everything works at first: an event sent from the source is handled correctly by the sequence/service. But after some times (typical scenario is a few days such as a weekend without any traffic), events fail to be sent by the containersource. Everytime an event is failed to be sent, we can see a failure in the knative-serving activator logs about throttling errors. Recreating activator pods make it work again usually. Only GCP brokers (or channels) seem to be affected:

{"level":"info","ts":"2019-11-19T05:55:36.789Z","logger":"activator","caller":"net/throttler.go:244","msg":"Updating Revision Throttler with: clusterIP = \"\", trackers = 0, backends = 0 activator pos 0/1","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv"}
{"level":"info","ts":"2019-11-19T05:55:36.790Z","logger":"activator","caller":"net/throttler.go:230","msg":"Set capacity to 0 (backends: 0, index: 0/1)","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv"}
{"level":"info","ts":"2019-11-19T05:55:36.790Z","logger":"activator","caller":"net/throttler.go:244","msg":"Updating Revision Throttler with: clusterIP = \"\", trackers = 0, backends = 0 activator pos 0/1","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-ingress-chan-publish-k56xt"}
{"level":"info","ts":"2019-11-19T05:55:36.790Z","logger":"activator","caller":"net/throttler.go:230","msg":"Set capacity to 0 (backends: 0, index: 0/1)","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-ingress-chan-publish-k56xt"}
{"level":"info","ts":"2019-11-19T10:13:33.091Z","logger":"activator","caller":"net/throttler.go:633","msg":"Context is closed: context deadline exceeded","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv"}
{"level":"error","ts":"2019-11-19T10:13:33.091Z","logger":"activator","caller":"handler/handler.go:123","msg":"Throttler try error","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv","error":"context deadline exceeded","stacktrace":"knative.dev/serving/pkg/activator/handler.(*activationHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/handler.go:123\nknative.dev/serving/pkg/activator/handler.(*RequestEventHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/requestevent_handler.go:67\nknative.dev/serving/vendor/go.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/vendor/go.opencensus.io/plugin/ochttp/server.go:86\nknative.dev/serving/pkg/activator/config.(*storeMiddleware).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/config/store.go:83\nknative.dev/serving/pkg/http.(*RequestLogHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/http/request_log.go:135\nknative.dev/serving/pkg/activator/handler.(*ProbeHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/probe_handler.go:41\nknative.dev/serving/pkg/activator/handler.(*HealthHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/healthz_handler.go:38\nknative.dev/serving/pkg/network.(*handler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/network/probe_handler.go:40\nknative.dev/serving/pkg/activator/handler.(*MetricHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/metric_handler.go:91\nknative.dev/serving/vendor/golang.org/x/net/http2/h2c.h2cHandler.ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/vendor/golang.org/x/net/http2/h2c/h2c.go:98\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890"}
{"level":"info","ts":"2019-11-19T10:14:15.145Z","logger":"activator","caller":"net/throttler.go:633","msg":"Context is closed: context deadline exceeded","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv"}
{"level":"error","ts":"2019-11-19T10:14:15.145Z","logger":"activator","caller":"handler/handler.go:123","msg":"Throttler try error","commit":"a1a35b1","knative.dev/controller":"activator","knative.dev/pod":"activator-684b89fbc8-6bqsd","knative.dev/key":"my-namespace/cre-my-broker-gcp-kne-trigger-chan-publish-dbclv","error":"context deadline exceeded","stacktrace":"knative.dev/serving/pkg/activator/handler.(*activationHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/handler.go:123\nknative.dev/serving/pkg/activator/handler.(*RequestEventHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/requestevent_handler.go:67\nknative.dev/serving/vendor/go.opencensus.io/plugin/ochttp.(*Handler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/vendor/go.opencensus.io/plugin/ochttp/server.go:86\nknative.dev/serving/pkg/activator/config.(*storeMiddleware).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/config/store.go:83\nknative.dev/serving/pkg/http.(*RequestLogHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/http/request_log.go:135\nknative.dev/serving/pkg/activator/handler.(*ProbeHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/probe_handler.go:41\nknative.dev/serving/pkg/activator/handler.(*HealthHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/healthz_handler.go:38\nknative.dev/serving/pkg/network.(*handler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/network/probe_handler.go:40\nknative.dev/serving/pkg/activator/handler.(*MetricHandler).ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/pkg/activator/handler/metric_handler.go:91\nknative.dev/serving/vendor/golang.org/x/net/http2/h2c.h2cHandler.ServeHTTP\n\t/home/prow/go/src/knative.dev/serving/vendor/golang.org/x/net/http2/h2c/h2c.go:98\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2802\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1890"}

Here is a gist to help setup the failing case: https://gist.github.com/ratamovic/44f3aee5b6e91882071b6d2822686e72 (please change what's appropriate in it). It creates a container source (a python script), two brokers (in-mem and GCP pullsub), a sequence composed of a single pod (a simple flask service). You can run it by:

KNative serving version is 0.10.0, eventing is 0.10.1 and Knative-GCP is 0.10.1 (previous versions such as 0.9 are affected too). This issue has been a bit discussed on KNative Slack (https://knative.slack.com/archives/C9JP909F0/p1572864600066700)

rtmvc commented 4 years ago

In fact not only GCP seems affected on my setup, though it happens more often and quickly with it usually... I'm trying to open an issue on knative serving.

rtmvc commented 4 years ago

I'll close it since it seems more about Serving activator