mpeyrard closed this issue 3 years ago
After playing around with this some more, it looks like I can't delete and then re-publish another trigger with the same name. The broker filter never forgets the original, and never updates it with the new one. It's probably relevant to mention that I was managing my Knative service with Helm, and since I was doing a lot of testing and debugging, I was running a lot of helm delete --purge commands followed by helm install commands. Because the trigger is part of my chart, the trigger was being uninstalled by Helm every time I deleted the chart.
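For reference, a rough sketch of the Helm 2 cycle described above; the release and chart names are placeholders, not the ones from this setup:

# Helm 2 syntax, as implied by "helm delete --purge"
helm delete --purge my-release                # uninstalls the chart, deleting the Trigger it contains
helm install --name my-release ./my-chart     # re-installs the chart, re-creating a Trigger with the same name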
@MPeyrard86 thanks for the report and sorry you're having trouble. Have you tried reproducing this with a more recent version than 0.14.1?
Not yet. We plan on upgrading in the near future when we upgrade our version of Kubernetes.
@MPeyrard86 would you be willing to provide a bit more yaml so that we could more easily reproduce your setup? Have you tried eventing release v0.14.2?
An alternative might be to try running the conformance tests against your currently-set-up broker, which can be done in a similar manner to the e2e tests:
cd $GOPATH/knative.dev/eventing/test/conformance && go test -race -count=1 -tags=e2e -timeout=20m -brokerName=YOUR_BROKER_NAME_HERE -brokerNamespace=YOUR_NAMESPACE_HERE -run TestBrokerV1Beta1DataPlaneMetrics
@lberk When I run this command, it tells me that there are no tests to run. I assume I did something wrong.
testing: warning: no tests to run
PASS
ok knative.dev/eventing/test/conformance 1.641s
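One possible reason for the "no tests to run" result is that the -run pattern doesn't match any test name in that package. A way to check which tests the package actually exposes (a guess; the conformance TestMain may still require its usual e2e flags):

cd $GOPATH/src/knative.dev/eventing/test/conformance   # assuming a classic GOPATH layout with the src/ segment
go test -tags=e2e -list '.*'                           # lists matching test names without running them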
We will likely upgrade to the latest knative version once we upgrade our version of Kubernetes.
I've also done some more digging, and it seems like deleting triggers is very problematic, even when I delete them via kubectl rather than helm. The filter pod keeps logging errors/warnings that reference old triggers that have long since been deleted:
{"level":"warn","ts":1595973899.4984841,"logger":"fallback","caller":"http/transport.go:532","msg":"got an error from receiver fn","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4975579,"logger":"fallback","caller":"http/transport.go:532","msg":"got an error from receiver fn","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4987288,"logger":"fallback","caller":"http/transport.go:624","msg":"error returned from invokeReceiver","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4988537,"logger":"fallback","caller":"http/transport.go:624","msg":"error returned from invokeReceiver","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4979901,"logger":"fallback","caller":"http/transport.go:532","msg":"got an error from receiver fn","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4992394,"logger":"fallback","caller":"http/transport.go:624","msg":"error returned from invokeReceiver","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.498132,"logger":"fallback","caller":"http/transport.go:532","msg":"got an error from receiver fn","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.4996395,"logger":"fallback","caller":"http/transport.go:624","msg":"error returned from invokeReceiver","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
{"level":"warn","ts":1595973899.498285,"logger":"fallback","caller":"http/transport.go:532","msg":"got an error from receiver fn","error":"trigger.eventing.knative.dev \"re-router-trigger-10\" not found"}
Rebooting the broker pods does not resolve this issue. Furthermore, physically rebooting the hardware that Kubernetes is running on also did not resolve this issue. It still thinks these triggers exist. Very confusing.
I'm going to continue investigating, but as of right now, it would appear that once I've deleted a trigger, events no longer propagate within the system, and we start seeing errors in the broker's filter logs. I currently cannot say if it's a single delete or a series of deletes (maybe something with a random chance of happening?), but we end up in this state where nothing in knative eventing works anymore, and we need to restore snapshots of our k8s images and re-install everything.
Furthermore, physically rebooting the hardware that Kubernetes is running on also did not resolve this issue.
If it persists across reboots, it must be in etcd. This suggests to me that the issue is related to some resource that's not being cleaned up properly.
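A quick way to see which Trigger and Subscription objects actually remain in the API server (and hence in etcd); this is only a sketch of the check, not a diagnosis:

kubectl get triggers.eventing.knative.dev --all-namespaces
kubectl get subscriptions.messaging.knative.dev --all-namespaces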
Just to reduce the size of the uncertainty cone, can you tell us which broker class you're using? An easy way to tell is by checking for an annotation on the Broker like eventing.knative.dev/broker.class: MTChannelBasedBroker. What's the value of that annotation?
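One way to read that annotation directly (BROKER_NAME and NAMESPACE are placeholders):

kubectl get broker BROKER_NAME -n NAMESPACE \
  -o jsonpath='{.metadata.annotations.eventing\.knative\.dev/broker\.class}'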
Something to try in the meantime:
1. Create a Trigger and note what changes in the Broker's backing Channel spec (a new entry should appear under the channel's subscribers).
2. Delete that Trigger and check whether the entry is removed again.
If deleting the trigger in 2 didn't correctly revert the changes you observed in 1, that's likely related to the issue.
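A possible way to watch for that change, using the KafkaChannel that appears later in this thread as the example (the channel name and namespace are taken from that output):

# list the subscriber UIDs currently recorded on the channel spec
kubectl get kafkachannel.messaging.knative.dev rules-engine-kne-trigger -n default \
  -o jsonpath='{range .spec.subscribable.subscribers[*]}{.uid}{"\n"}{end}'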
We are currently using the ChannelBasedBroker. And I can confirm that after deleting a trigger, the subscription for that trigger still exists in the channel spec. Its Generation was incremented from 1 to 2, so it did recognize that something happened to that resource.
Name: rules-engine-kne-trigger
Namespace: default
Labels: eventing.knative.dev/broker=rules-engine
eventing.knative.dev/brokerEverything=true
Annotations: <none>
API Version: messaging.knative.dev/v1alpha1
Kind: KafkaChannel
Metadata:
Creation Timestamp: 2020-07-29T02:05:29Z
Finalizers:
kafkachannels.messaging.knative.dev
Generation: 7
Owner References:
API Version: eventing.knative.dev/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Broker
Name: rules-engine
UID: 193741c0-0051-4210-aca6-508d0d9efaa8
Resource Version: 19072
Self Link: /apis/messaging.knative.dev/v1alpha1/namespaces/default/kafkachannels/rules-engine-kne-trigger
UID: 7d0bcbe7-b2df-4f71-a2a4-f5859d86cec0
Spec:
Num Partitions: 20
Replication Factor: 1
Subscribable:
Subscribers:
Generation: 2
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-router-trigger-final-2/cb9245e1-e741-4d15-a9b4-f06a7505cfe9
UID: bfa65e22-a22a-44df-8821-dcc07140e54f
Generation: 1
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-router-trigger-final/92848617-d438-4698-8d8b-a9e00e16327f
UID: aae44d10-cd54-4a9e-9c6c-966c5e529405
Generation: 1
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-email-trigger-final-2/b36857d6-a69d-4cb5-a213-505836f1ca31
UID: f33d6781-bc93-4009-9ca6-d34f44afe011
Generation: 1
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-email-trigger-final/0d879cb1-60af-4d31-a623-1172b65e5822
UID: 832d2ad5-977f-44bb-94f2-cb2d08aa2da0
I assume the deleted Trigger is the one corresponding to the Subscription with UID bfa65e22-a22a-44df-8821-dcc07140e54f. Does the Subscription object with that UID still exist, or was it deleted along with the Trigger?
Here's a 3rd debugging step: Create the trigger again with the same name. What changes in the Channel spec?
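A sketch of how that before/after comparison could be captured; the channel name is the one from the output above, and trigger.yaml is a placeholder for however the Trigger gets re-created:

kubectl get kafkachannel.messaging.knative.dev rules-engine-kne-trigger -n default -o yaml > /tmp/channel-before.yaml
kubectl apply -f trigger.yaml   # re-create the Trigger with the same name
kubectl get kafkachannel.messaging.knative.dev rules-engine-kne-trigger -n default -o yaml > /tmp/channel-after.yaml
diff /tmp/channel-before.yaml /tmp/channel-after.yaml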
I confirmed that the Subscription resources are deleted when the triggers are deleted. However, the KafkaChannel does not look like it's in a good state after deleting, and especially after re-adding, the triggers:
$ kubectl describe kafkachannel.messaging.knative.dev/rules-engine-kne-trigger
Name: rules-engine-kne-trigger
Namespace: default
Labels: eventing.knative.dev/broker=rules-engine
eventing.knative.dev/brokerEverything=true
Annotations: <none>
API Version: messaging.knative.dev/v1alpha1
Kind: KafkaChannel
Metadata:
Creation Timestamp: 2020-07-29T04:16:17Z
Finalizers:
kafkachannels.messaging.knative.dev
Generation: 8
Owner References:
API Version: eventing.knative.dev/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Broker
Name: rules-engine
UID: 28065a28-61a4-44b6-8a51-c3c4b2a81cc9
Resource Version: 329567
Self Link: /apis/messaging.knative.dev/v1alpha1/namespaces/default/kafkachannels/rules-engine-kne-trigger
UID: 79d4914c-0b75-46a0-80a9-701ef754ee25
Spec:
Num Partitions: 20
Replication Factor: 1
Subscribable:
Subscribers:
Generation: 2
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-email-trigger-final/6061efb7-8a04-450e-b325-0f41030dcece
UID: 3b694fd7-1d39-4bf6-88b2-8914ec98e86b
Generation: 2
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-router-trigger-final/7f6d8423-70bd-4ce0-a6b0-5afd25fb9eed
UID: 9843c958-ab5b-4794-b3ee-0c19cb7f4315
Generation: 1
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-email-trigger-final/ac9b6f31-c215-44cf-91ea-9f735866b2dc
UID: e29816b3-1567-4758-8ab3-6b207e1063c0
Generation: 1
Reply URI: http://rules-engine-broker.default.svc.cluster.local
Subscriber URI: http://rules-engine-broker-filter.default.svc.cluster.local/triggers/default/re-router-trigger-final/0466ea4b-02ef-4a7f-b4d7-a9d57c77df29
UID: 43526bea-5205-4d22-b0dc-08d110746240
Status:
Address:
Hostname: rules-engine-kne-trigger-kn-channel.default.svc.cluster.local
URL: http://rules-engine-kne-trigger-kn-channel.default.svc.cluster.local
Conditions:
Last Transition Time: 2020-07-29T04:21:58Z
Status: True
Type: Addressable
Last Transition Time: 2020-07-29T04:21:58Z
Status: True
Type: ChannelServiceReady
Last Transition Time: 2020-07-29T04:21:53Z
Status: True
Type: ConfigurationReady
Last Transition Time: 2020-07-29T12:52:33Z
Status: True
Type: DispatcherReady
Last Transition Time: 2020-07-29T04:22:18Z
Status: True
Type: EndpointsReady
Last Transition Time: 2020-07-29T12:52:33Z
Status: True
Type: Ready
Last Transition Time: 2020-07-29T04:21:53Z
Status: True
Type: ServiceReady
Last Transition Time: 2020-07-29T04:21:53Z
Status: True
Type: TopicReady
Subscribable Status:
Subscribers:
Observed Generation: 2
Ready: True
UID: 3b694fd7-1d39-4bf6-88b2-8914ec98e86b
Observed Generation: 2
Ready: True
UID: 9843c958-ab5b-4794-b3ee-0c19cb7f4315
Observed Generation: 1
Ready: True
UID: e29816b3-1567-4758-8ab3-6b207e1063c0
Observed Generation: 1
Ready: True
UID: 43526bea-5205-4d22-b0dc-08d110746240
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal KafkaChannelReconciled 3s (x34 over 32h) kafkachannel-controller KafkaChannel reconciled: "default/rules-engine-kne-trigger"
Normal ChannelReconciled 3s (x12 over 24h) kafka-ch-dispatcher KafkaChannel reconciled
To summarize: it shows four subscribers in the channel spec, when in fact there are only two. Those that are on Generation: 2 were deleted. Those that are at Generation: 1 are the same triggers that were re-added after they were deleted.
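Since each subscriber entry's UID should correspond to a Subscription object's UID, one way to confirm which entries are stale is to compare the two lists (a sketch, using the names from the output above):

kubectl get kafkachannel.messaging.knative.dev rules-engine-kne-trigger -n default \
  -o jsonpath='{range .spec.subscribable.subscribers[*]}{.uid}{"\n"}{end}'
kubectl get subscriptions.messaging.knative.dev -n default \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.uid}{"\n"}{end}'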
@grantr So it sounds like a KafkaChannel-specific error, then? Should I log another bug under the contrib project?
@MPeyrard86 yes please!
Seems like a KafkaChannel-specific error, or possibly a Subscription controller error. The Broker seems to be operating correctly but something in the KafkaChannel controller or the Subscription controller is not working properly.
@slinkydeveloper Can you move this issue to eventing-contrib (or whatever repo Kafka Channel lives in these days)?
@vaikas you might try reproing this with IMC to check if it's a subscription controller issue.
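For the record, one way to check (and, if needed, switch) which channel implementation the Broker uses by default is the config-br-default-channel ConfigMap in knative-eventing; the exact key layout varies by eventing release, so treat this as a sketch:

kubectl -n knative-eventing get configmap config-br-default-channel -o yaml
# the channel template inside it would look roughly like:
#   apiVersion: messaging.knative.dev/v1
#   kind: InMemoryChannel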
Ok, so I've been playing with this from the head with the following set up:
Broker:
kubectl create -f - <<EOF
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: broker3
  namespace: vaikas-test
EOF
PingSource:
kubectl create -f - <<EOF
apiVersion: sources.knative.dev/v1beta1
kind: PingSource
metadata:
  name: test-ping-source
  namespace: vaikas-test
spec:
  schedule: "*/1 * * * *"
  jsonData: '{"message": "Hello world!"}'
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: broker3
EOF
And two functions:
kubectl create -f - <<EOF
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: event-display
  namespace: vaikas-test
spec:
  template:
    spec:
      containers:
      - image: gcr.io/knative-releases/knative.dev/eventing-contrib/cmd/event_display@sha256:a214514d6ba674d7393ec8448dd272472b2956207acb3f83152d3071f0ab1911
EOF
and:
kubectl create -f - <<EOF
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: event-display-2
  namespace: vaikas-test
spec:
  template:
    spec:
      containers:
      - image: gcr.io/knative-releases/knative.dev/eventing-contrib/cmd/event_display@sha256:a214514d6ba674d7393ec8448dd272472b2956207acb3f83152d3071f0ab1911
EOF
And by creating an initial trigger like this:
kubectl create -f - <<EOF
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: trigger-2
  namespace: vaikas-test
spec:
  broker: broker3
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
EOF
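A quick sanity check after creating the trigger, just to confirm it went Ready and a Subscription was created for it:

kubectl get trigger trigger-2 -n vaikas-test
kubectl get subscriptions.messaging.knative.dev -n vaikas-test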
Then I've tried things like deleting the trigger and re-creating it with the same name.
So, the TL;DR is that I can't seem to repro this (or I'm trying to repro it incorrectly). Of course, this is from head, so maybe something was fixed along the way that was broken in the version you are running. If it's not a huge pain, @MPeyrard86, could you try running with IMC instead of the Kafka channel, or alternatively, @slinkydeveloper, could you try to repro with Kafka channels from head to see if it's a bug?
Thanks for looking into it! Yeah, I'll give it a try with IMC. And yes, I think it's probably a KafkaChannel issue; we've been seeing some stability issues with the Kafka Channel. Since logging the bug, we've also seen this issue coincide with the kafka-ch-dispatcher going into a crash loop, but not always. I'll get back to you when I've done some tests with the IMC.
superduper, thanks much and sorry for the troubles :(
Just checking if you have had any luck trying to repro this. @slinkydeveloper any luck on kafka? Is there an issue we could link from here?
Hi @vaikas. OK, I was able to reproduce this using the IMC as well.
After adding and deleting a trigger using the in-memory channel, I see this:
$ kubectl describe imc default
Name: default-kne-trigger
Namespace: default
Labels: eventing.knative.dev/broker=default
eventing.knative.dev/brokerEverything=true
Annotations: messaging.knative.dev/creator: system:serviceaccount:knative-eventing:eventing-controller
messaging.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-controller
messaging.knative.dev/subscribable: v1beta1
API Version: messaging.knative.dev/v1beta1
Kind: InMemoryChannel
Metadata:
Creation Timestamp: 2020-08-28T18:25:40Z
Generation: 3
Owner References:
API Version: eventing.knative.dev/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Broker
Name: default
UID: 01b7234d-b3f5-4fa2-9387-be563728d133
Resource Version: 140892
Self Link: /apis/messaging.knative.dev/v1beta1/namespaces/default/inmemorychannels/default-kne-trigger
UID: e9b4158a-8940-4682-8360-42c7b0d93380
Spec:
Subscribers:
Delivery:
Dead Letter Sink:
Generation: 2
Reply Uri: http://default-broker.default.svc.cluster.local
Subscriber Uri: http://default-broker-filter.default.svc.cluster.local/triggers/default/default-test-trigger/2e991ecc-f95c-471a-b03f-719cb103363b
UID: 901f986d-dc9d-4445-913d-f27716b70ea4
Status:
Address:
URL: http://default-kne-trigger-kn-channel.default.svc.cluster.local
Conditions:
Last Transition Time: 2020-08-28T18:25:40Z
Status: True
Type: Ready
Observed Generation: 3
Subscribers:
Observed Generation: 2
Ready: True
UID: 901f986d-dc9d-4445-913d-f27716b70ea4
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal InMemoryChannelReconciled 2s (x6 over 3m12s) inmemorychannel-controller InMemoryChannel reconciled: "default/default-kne-trigger"
As you can see, the subscription is still there, with Generation: 2.
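At this point it might also be worth checking the eventing-controller logs for reconcile errors around that channel or trigger, to see whether the controller ever tried to prune the subscriber (just a suggestion for narrowing things down):

kubectl -n knative-eventing logs deployment/eventing-controller | grep -i -e trigger -e subscription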
I created the broker like this:
apiVersion: eventing.knative.dev/v1beta1
kind: Broker
metadata:
  name: default
  namespace: default
And the trigger like this:
apiVersion: eventing.knative.dev/v1beta1
kind: Trigger
metadata:
  name: default-test-trigger
  namespace: default
spec:
  broker: default
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: re-detector-sessions
Could this somehow be a problem that's specific to my cluster? I'm not sure how... I'll take another look through the release notes to see if this was somehow fixed since 0.14.
Hey people, I'm going to take a look at this soon
/assign
@MPeyrard86 which version of eventing are you using? eventing 0.14 is quite old and some things changed since then :smile:
@slinkydeveloper We are still using 0.14. But we're probably due for another upgrade.
Possibly related to https://github.com/knative/eventing-contrib/issues/1560
Ok, so tried to repro this from the head with those steps and I couldn't:
vaikas-a01:eventing-camel vaikas$ cat ~/repro-1520/broker.yaml
apiVersion: eventing.knative.dev/v1beta1
kind: Broker
metadata:
  name: repro
  namespace: default
vaikas-a01:eventing-camel vaikas$ cat ~/repro-1520/trigger.yaml
apiVersion: eventing.knative.dev/v1beta1
kind: Trigger
metadata:
  name: default-test-trigger
  namespace: default
spec:
  broker: repro
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display
vaikas-a01:eventing-camel vaikas$ kubectl create -f ~/repro-1520/broker.yaml
vaikas-a01:eventing-camel vaikas$ kubectl create -f ~/repro-1520/trigger.yaml
trigger.eventing.knative.dev/default-test-trigger created
vaikas-a01:eventing-camel vaikas$ kubectl get brokers
NAME URL AGE READY REASON
default http://default-broker.default.svc.cluster.local 6d18h True
repro http://broker-ingress.knative-eventing.svc.cluster.local/default/repro 8s True
vaikas-a01:eventing-camel vaikas$ kubectl get triggers
NAME BROKER SUBSCRIBER_URI AGE READY REASON
default-test-trigger repro http://event-display.default.svc.cluster.local 10s True
ping-trigger default http://subscriber.default.svc.cluster.local 6d18h True
vaikas-a01:eventing-camel vaikas$ kubectl get imc
NAME URL AGE READY REASON
my-channel http://my-channel-kn-channel.default.svc.cluster.local 76d True
my-channel-2 http://my-channel-2-kn-channel.default.svc.cluster.local 76d True
repro-kne-trigger http://repro-kne-trigger-kn-channel.default.svc.cluster.local 33s True
vaikas-a01:eventing-camel vaikas$ kubectl get imc repro-kne-trigger -oyaml
apiVersion: messaging.knative.dev/v1
kind: InMemoryChannel
metadata:
  annotations:
    eventing.knative.dev/scope: cluster
    messaging.knative.dev/creator: system:serviceaccount:knative-eventing:eventing-controller
    messaging.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-controller
    messaging.knative.dev/subscribable: v1
  creationTimestamp: "2020-09-16T17:46:04Z"
  generation: 2
  labels:
    eventing.knative.dev/broker: repro
    eventing.knative.dev/brokerEverything: "true"
  name: repro-kne-trigger
  namespace: default
  ownerReferences:
  - apiVersion: eventing.knative.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: Broker
    name: repro
    uid: 80d72fbf-fe7d-43fb-b9c7-01a58ad1c3cf
  resourceVersion: "97973844"
  selfLink: /apis/messaging.knative.dev/v1/namespaces/default/inmemorychannels/repro-kne-trigger
  uid: 5599bb58-8518-4d75-8411-fe3c55be1cb2
spec:
  subscribers:
  - generation: 1
    replyUri: http://broker-ingress.knative-eventing.svc.cluster.local/default/repro
    subscriberUri: http://broker-filter.knative-eventing.svc.cluster.local/triggers/default/default-test-trigger/d29cb073-f5fc-4590-9114-9c05530f548b
    uid: 3b7b418a-c583-4ad1-99c1-a1a72ffd0f8b
status:
  address:
    url: http://repro-kne-trigger-kn-channel.default.svc.cluster.local
  conditions:
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: Addressable
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: ChannelServiceReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: DispatcherReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: EndpointsReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: ServiceReady
  observedGeneration: 2
  subscribers:
  - observedGeneration: 1
    ready: "True"
    uid: 3b7b418a-c583-4ad1-99c1-a1a72ffd0f8b
vaikas-a01:eventing-camel vaikas$ kubectl delete triggers default-test-trigger
trigger.eventing.knative.dev "default-test-trigger" deleted
vaikas-a01:eventing-camel vaikas$ kubectl get imc repro-kne-trigger -oyaml
apiVersion: messaging.knative.dev/v1
kind: InMemoryChannel
metadata:
  annotations:
    eventing.knative.dev/scope: cluster
    messaging.knative.dev/creator: system:serviceaccount:knative-eventing:eventing-controller
    messaging.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-controller
    messaging.knative.dev/subscribable: v1
  creationTimestamp: "2020-09-16T17:46:04Z"
  generation: 3
  labels:
    eventing.knative.dev/broker: repro
    eventing.knative.dev/brokerEverything: "true"
  name: repro-kne-trigger
  namespace: default
  ownerReferences:
  - apiVersion: eventing.knative.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: Broker
    name: repro
    uid: 80d72fbf-fe7d-43fb-b9c7-01a58ad1c3cf
  resourceVersion: "97976297"
  selfLink: /apis/messaging.knative.dev/v1/namespaces/default/inmemorychannels/repro-kne-trigger
  uid: 5599bb58-8518-4d75-8411-fe3c55be1cb2
spec: {}
status:
  address:
    url: http://repro-kne-trigger-kn-channel.default.svc.cluster.local
  conditions:
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: Addressable
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: ChannelServiceReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: DispatcherReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: EndpointsReady
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2020-09-16T17:46:04Z"
    status: "True"
    type: ServiceReady
  observedGeneration: 3
So, I wonder if something was fixed along the way. When are you planning on upgrading? Also, FYI, there was an issue with some of the releases when going from .16 to .17, but it should have been fixed in the later dot releases. @Harwayne which is the one that has the fix?
I have put in a request to get our Kubernetes version upgraded, as that is blocking us from upgrading Knative. As soon as that happens, I will be upgrading Knative and trying this again.
ok, thanks and sorry about the trouble :(
No worries. This isn't affecting production or anything. We're using Knative to develop a new feature that hasn't shipped yet.
Quick update: We're finally getting our Kubernetes upgraded to version 1.18. Should happen in the next week or two. I'll be upgrading Knative to 0.18 as soon as that's done and update this bug with the results.
I also hit exactly the same issue as @mpeyrard. @mpeyrard how about now, did you fix it by upgrading? I hit the same problem with the IMC channel on both knative-eventing 0.18 and 0.19, so it seems upgrading does not help with this issue. @slinkydeveloper @vaikas, I feel it is easy to reproduce, because I am just a Knative beginner and am using the quick-start sample. Do we have any clue on this issue, and how can we work around it?
@pandagodyyy when you say it's easy to reproduce: if you can reliably reproduce it, that would be great. I think both @slinkydeveloper and I tried to repro this and couldn't. I'll try to see if I can do it from .19 today and report back. But in the meantime, if you have an easy repro (esp. with IMC), that would be great to share :)
@vaikas, I left my message and went to bed yesterday, and when I woke up today I found the triggers working as expected. It seems it took several hours to recover. So this morning I tried to reproduce it in a brand new cluster again, and it happened once more. I recorded it below. Environment: K8s v1.18; pre-installed Istio 1.7.3; Knative Eventing 0.19.
kubectl.exe create namespace knative-eventing
kubectl.exe label namespace knative-eventing istio-injection=enabled
cat <<EOF | kubectl apply -f -
apiVersion: "security.istio.io/v1beta1"
kind: "PeerAuthentication"
metadata:
  name: "default"
  namespace: "knative-eventing"
spec:
  mtls:
    mode: PERMISSIVE
EOF
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.19.0/eventing-crds.yaml
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.19.0/eventing-core.yaml
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.19.0/in-memory-channel.yaml
kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.19.0/mt-channel-broker.yaml
kubectl.exe create namespace premaster
kubectl.exe label namespace premaster istio-injection=enabled
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-display
  namespace: premaster
spec:
  replicas: 1
  selector:
    matchLabels: &labels
      app: hello-display
  template:
    metadata:
      labels: *labels
    spec:
      containers:
kind: Service
apiVersion: v1
metadata:
  name: hello-display
  namespace: premaster
spec:
  selector:
    app: hello-display
  ports:
kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/sleep/sleep.yaml -n premaster
After all of that, check the pods in knative-eventing, then check premaster.
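For example:

kubectl get pods -n knative-eventing
kubectl get pods,broker,trigger -n premaster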
Then start sending events to the broker while monitoring the logs of hello-display:
.\kubectl.exe exec -it sleep-64d7d56698-gg8dd -n premaster -- /bin/sh
curl -v "http://broker-ingress.knative-eventing.svc.cluster.local/premaster/default" \ -X POST \ -H "Ce-Id: say-hello" \ -H "Ce-Specversion: 1.0" \ -H "Ce-Type: v1" \ -H "Ce-Source: not-sendoff" \ -H "Content-Type: application/json" \ -d '{"msg":"Hello World!"}'
Result: message was shown in hello-display
Then I modified the trigger with a new type, v2:

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: test-display
  namespace: premaster
spec:
  broker: default
  filter:
    attributes:
      type: v2
  subscriber:
    ref:
      apiVersion: v1
      kind: Service
      name: hello-display

and sent the previous (v1) message again:

curl -v "http://broker-ingress.knative-eventing.svc.cluster.local/premaster/default" \
  -X POST \
  -H "Ce-Id: say-hello" \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: v1" \
  -H "Ce-Source: not-sendoff" \
  -H "Content-Type: application/json" \
  -d '{"msg":"Hello World!"}'
Result: the message was still shown in hello-display.
Then I sent a new message with type v2:

curl -v "http://broker-ingress.knative-eventing.svc.cluster.local/premaster/default" \
  -X POST \
  -H "Ce-Id: say-hello" \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: v2" \
  -H "Ce-Source: not-sendoff" \
  -H "Content-Type: application/json" \
  -d '{"msg":"Hello World!"}'

Result: nothing was shown. (If I wait several hours, the modified trigger starts to work.)
If I delete the current trigger and then create a new one, nothing is shown in hello-display after I send a message. However, checking the log with

kubectl.exe logs -f mt-broker-filter-79c59cc4dc-spxws -c filter -n knative-eventing

it complained with this error:

{"level":"info","ts":"2020-12-03T04:28:33.828Z","logger":"mt_broker_filter","caller":"filter/filter_handler.go:170","msg":"Unable to get the Trigger","commit":"0f9a8c5","error":"trigger had a different UID. From ref '13a5a1d0-3201-4283-b251-b4532625298d'. From Kubernetes '6491443d-276e-4448-9227-6fb9c5374984'","triggerRef":"premaster/test-display"}
error: unexpected EOF

After around one hour, it came back to normal.
I did not enable the local cluster gateway (I do not know if it is necessary; events seem to work without it).
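Given that the filter's error mentions a stale Trigger UID and the problem clears up on its own after a while, one mitigation that might be worth trying (a guess, assuming the mt-broker-filter's cached view of the Trigger is what is stale) is restarting the filter deployment:

kubectl -n knative-eventing rollout restart deployment/mt-broker-filter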
I have not yet tried with a newer version, because I'm still waiting for our DevOps team to upgrade our version of Kubernetes. I've since noticed that it does not happen 100% of the time. The work-around seems to be to re-install the broker.
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Describe the bug
I have configured two Knative Services to communicate with each other via Knative Eventing. Our system creates the broker manually via yaml files (as opposed to installing the broker via annotation), as well as the trigger. The broker is backed by the Kafka Channel that is provided under Knative contrib. The service that produces the event does an HTTP POST to the broker address, which accepts the event with a 202 code. Using Kafka Tools, we have been able to confirm that the Cloud Event is deposited into the appropriate Kafka topic. However, the endpoint on the target service is never invoked. We have checked and re-checked that the CE-type attribute on the event matches the filter in the trigger. We have also double-checked that the endpoint on the target service is reachable by manually hitting it with Postman. Upon investigating the broker filter logs, we find the error message:
I am not currently certain if this is a bug or some kind of misconfiguration on my end. However, if we assume that it is the latter, then this error message has not been very helpful in figuring out what is wrong.
Since it seems to be complaining about my Trigger specifically, I'll copy it here:
Expected behavior
We expected the Cloud Events to arrive at the configured destination.
To Reproduce
With the current state of my project, this is what I'm doing to reproduce it:
1. Deploy two Knative services: one is called detector and the other is called router. The detector looks at sensor data and sends Cloud Events to the router via the broker when a pattern is found, but I would expect simple placeholder services to suffice.
2. Create a trigger that filters on the CE-type of the detection event, with the router service as the target.
3. Send input to the detector that causes it to trigger a detection event.
As stated above, the cloud event finds its way into the Kafka Channel's kafka topic for the broker, but no further. Furthermore, errors appear in the brokers' filter pod logs complaining about "trigger had a different UID".
Knative release version 0.14.1