Problem: I have a simple setup with a container source connected to a GCP Pub/Sub-backed broker, which forwards messages to a simple sequence composed of a single service.
Everything works at first: an event sent from the source is handled correctly by the sequence/service. But after some time (typically a few days without traffic, such as a weekend), events from the container source fail to be delivered. Every time an event fails to be delivered, we can see throttling errors in the knative-serving activator logs. Recreating the activator pods usually makes it work again. Only GCP brokers (or channels) seem to be affected:
Here is a gist to help set up the failing case: https://gist.github.com/ratamovic/44f3aee5b6e91882071b6d2822686e72 (please change what's appropriate in it). It creates a container source (a Python script), two brokers (in-memory and GCP Pub/Sub), and a sequence composed of a single pod (a simple Flask service).
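For context, the sequence service is just an HTTP endpoint that acknowledges incoming events with a 2xx. A minimal stdlib-only sketch of that behavior (a stand-in for the gist's actual Flask code, which may differ):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EventHandler(BaseHTTPRequestHandler):
    """Minimal stand-in for the Flask sequence service: ack any CloudEvent."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # In binary content mode, CloudEvents attributes arrive as ce-* headers.
        ce_type = self.headers.get("ce-type", "<none>")
        print(f"received event type={ce_type} body={body!r}")
        # Replying 2xx tells the channel/broker the event was handled.
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep logs quiet


def serve(port=8080):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EventHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In the failing scenario it is this kind of endpoint that never receives the event, because delivery is rejected upstream at the activator.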
You can run it by:
- adapting what's necessary (project name, GCP key, service account if necessary)
- running `./failure-setup.sh create_cluster` to create a new cluster "prone to failure" (though maybe any cluster would exhibit the same issue)
- running `./failure-setup.sh create_docker` to build the Docker images for the container source and the sequence service
- running `./failure-setup.sh setup_my_namespace` to set up the namespace with the source, sequence, services and brokers
- running `./failure-setup.sh send_event_mem` to send an event through the in-memory broker (which works), as many times as you want; this serves as a reference showing that the whole system works well with the in-memory broker
- running `./failure-setup.sh send_event_gcp` to send an event through the GCP Pub/Sub-backed broker (which works at first but then fails after a few days), as many times as you want
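Under the hood, the `send_event_*` commands amount to POSTing a CloudEvent in binary content mode to the broker ingress. A hedged stdlib sketch of the kind of request the container source builds (the broker URL and event attributes below are placeholders, not the gist's actual values):

```python
import json
import urllib.request


def build_event_request(broker_url, event_type, source, event_id, data):
    """Build a binary-mode CloudEvents HTTP request: attributes go in ce-* headers."""
    body = json.dumps(data).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Spec version depends on the eventing release; 0.3 was current around 0.10.
        "ce-specversion": "0.3",
        "ce-type": event_type,
        "ce-source": source,
        "ce-id": event_id,
    }
    return urllib.request.Request(broker_url, data=body, headers=headers, method="POST")


# Example with a placeholder in-cluster broker address:
req = build_event_request(
    "http://default-broker.my-namespace.svc.cluster.local",
    "dev.example.test",
    "/my/container/source",
    "event-0001",
    {"msg": "hello"},
)
```

When the failure occurs, requests like this one get a non-2xx response from the broker path, matching the throttling errors in the activator logs.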
In fact, GCP is not the only broker affected on my setup, though failures happen more often and more quickly with it. I'm going to open an issue on Knative Serving as well.
Knative Serving version is 0.10.0, Eventing is 0.10.1, and Knative-GCP is 0.10.1 (previous versions such as 0.9 are affected too). This issue has been discussed a bit on the Knative Slack (https://knative.slack.com/archives/C9JP909F0/p1572864600066700).