knative / eventing

Event-driven application platform for Kubernetes
https://knative.dev/docs/eventing
Apache License 2.0

Mismatch between subscription and underlying channel #2580

Closed vaikas closed 3 years ago

vaikas commented 4 years ago

There's been a lot of flakiness lately, and as part of tracking it down I'm jotting down some of the things I've seen happen. This typically happens with the many-trigger tests. In those cases the Subscription thinks it's in a good state, but the IMC does not have the subscription, and the Trigger never becomes ready because the Subscription is not actually OK.

Here are example object dumps:


- apiVersion: eventing.knative.dev/v1beta1
  kind: Trigger
  metadata:
    annotations:
      eventing.knative.dev/creator: vaikas@vmware.com
      eventing.knative.dev/lastModifier: vaikas@vmware.com
    creationTimestamp: "2020-02-14T13:45:28Z"
    generation: 1
    labels:
      eventing.knative.dev/broker: default
    name: trigger-testany-testany--extname1-extval1-extname2-extvalue2
    namespace: test-default-broker-with-many-attribute-triggers-using-v1bl5hbn
    resourceVersion: "3326057"
    selfLink: /apis/eventing.knative.dev/v1beta1/namespaces/test-default-broker-with-many-attribute-triggers-using-v1bl5hbn/triggers/trigger-testany-testany--extname1-extval1-extname2-extvalue2
    uid: efe9a871-f2b5-4024-a775-7d61d164255c
  spec:
    broker: default
    filter:
      attributes:
        extname1: extval1
        extname2: extvalue2
        source: ""
        type: ""
    subscriber:
      ref:
        apiVersion: v1
        kind: Service
        name: dumper-testany-testany--extname1-extval1-extname2-extvalue2
        namespace: test-default-broker-with-many-attribute-triggers-using-v1bl5hbn
  status:
    conditions:
    - lastTransitionTime: "2020-02-14T13:45:35Z"
      status: "True"
      type: BrokerReady
    - lastTransitionTime: "2020-02-14T13:45:35Z"
      status: "True"
      type: DependencyReady
    - lastTransitionTime: "2020-02-14T13:45:38Z"
      message: 'Failed to get subscription status: [subscription "default-trigger-testany-te-efe9a871-f2b5-4024-a775-7d61d164255c"
        not present in channel "default-kne-trigger" subscriber''s list]'
      reason: SubscriptionNotMarkedReadyByChannel
      status: Unknown
      type: Ready
    - lastTransitionTime: "2020-02-14T13:45:38Z"
      message: 'Failed to get subscription status: [subscription "default-trigger-testany-te-efe9a871-f2b5-4024-a775-7d61d164255c"
        not present in channel "default-kne-trigger" subscriber''s list]'
      reason: SubscriptionNotMarkedReadyByChannel
      status: Unknown
      type: Subscribed
    - lastTransitionTime: "2020-02-14T13:45:35Z"
      status: "True"
      type: SubscriberResolved
    observedGeneration: 1
    subscriberUri: http://dumper-testany-testany--extname1-extval1-extname2-extvalue2.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local/
- apiVersion: messaging.knative.dev/v1alpha1
  kind: Subscription
  metadata:
    annotations:
      messaging.knative.dev/creator: system:serviceaccount:knative-eventing:eventing-controller
      messaging.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-controller
    creationTimestamp: "2020-02-14T13:45:35Z"
    finalizers:
    - subscriptions.messaging.knative.dev
    generation: 1
    labels:
      eventing.knative.dev/broker: default
      eventing.knative.dev/trigger: trigger-testany-testany--extname1-extval1-extname2-extvalue2
    name: default-trigger-testany-te-efe9a871-f2b5-4024-a775-7d61d164255c
    namespace: test-default-broker-with-many-attribute-triggers-using-v1bl5hbn
    ownerReferences:
    - apiVersion: eventing.knative.dev/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Trigger
      name: trigger-testany-testany--extname1-extval1-extname2-extvalue2
      uid: efe9a871-f2b5-4024-a775-7d61d164255c
    resourceVersion: "3326194"
    selfLink: /apis/messaging.knative.dev/v1alpha1/namespaces/test-default-broker-with-many-attribute-triggers-using-v1bl5hbn/subscriptions/default-trigger-testany-te-efe9a871-f2b5-4024-a775-7d61d164255c
    uid: b28c05c4-aceb-4277-8052-a5510674c2da
  spec:
    channel:
      apiVersion: messaging.knative.dev/v1alpha1
      kind: InMemoryChannel
      name: default-kne-trigger
    reply:
      ref:
        apiVersion: eventing.knative.dev/v1alpha1
        kind: Broker
        name: default
        namespace: test-default-broker-with-many-attribute-triggers-using-v1bl5hbn
    subscriber:
      uri: http://default-broker-filter.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local/triggers/test-default-broker-with-many-attribute-triggers-using-v1bl5hbn/trigger-testany-testany--extname1-extval1-extname2-extvalue2/efe9a871-f2b5-4024-a775-7d61d164255c
  status:
    conditions:
    - lastTransitionTime: "2020-02-14T13:45:37Z"
      status: "True"
      type: AddedToChannel
    - lastTransitionTime: "2020-02-14T13:45:43Z"
      status: "True"
      type: ChannelReady
    - lastTransitionTime: "2020-02-14T13:45:43Z"
      status: "True"
      type: Ready
    - lastTransitionTime: "2020-02-14T13:45:36Z"
      status: "True"
      type: Resolved
    observedGeneration: 1
    physicalSubscription:
      replyURI: http://default-broker.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local
      subscriberURI: http://default-broker-filter.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local/triggers/test-default-broker-with-many-attribute-triggers-using-v1bl5hbn/trigger-testany-testany--extname1-extval1-extname2-extvalue2/efe9a871-f2b5-4024-a775-7d61d164255c

apiVersion: v1
items:
- apiVersion: messaging.knative.dev/v1alpha1
  kind: InMemoryChannel
  metadata:
    annotations:
      messaging.knative.dev/creator: system:serviceaccount:knative-eventing:eventing-controller
      messaging.knative.dev/lastModifier: system:serviceaccount:knative-eventing:eventing-controller
    creationTimestamp: "2020-02-14T13:45:23Z"
    generation: 16
    labels:
      eventing.knative.dev/broker: default
      eventing.knative.dev/brokerEverything: "true"
    name: default-kne-trigger
    namespace: test-default-broker-with-many-attribute-triggers-using-v1bl5hbn
    ownerReferences:
    - apiVersion: eventing.knative.dev/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Broker
      name: default
      uid: a5baec7f-fdac-474a-9026-0cd0bf69a775
    resourceVersion: "3328016"
    selfLink: /apis/messaging.knative.dev/v1alpha1/namespaces/test-default-broker-with-many-attribute-triggers-using-v1bl5hbn/inmemorychannels/default-kne-trigger
    uid: 7988ad8f-6526-422c-8f9a-57ae2f544d0d
  spec:
    subscribable: {}
  status:
    address:
      hostname: default-kne-trigger-kn-channel.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local
      url: http://default-kne-trigger-kn-channel.test-default-broker-with-many-attribute-triggers-using-v1bl5hbn.svc.cluster.local
    conditions:
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: Addressable
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: ChannelServiceReady
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: DispatcherReady
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: EndpointsReady
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: Ready
    - lastTransitionTime: "2020-02-14T13:45:23Z"
      status: "True"
      type: ServiceReady
    observedGeneration: 16
    subscribableStatus: {}

What's peculiar is that the channel's generation matches its observedGeneration at 16. But notice that the subscriber arrays in both spec and status are empty.

vaikas commented 4 years ago

Forgot to mention: looking at the IMC logs, there are tons of messages about UpdateStatus failures due to concurrent writes.

lionelvillard commented 4 years ago

Right, and then it fails because of the default controller rate limiter (after 15 failures it waits 163s for the next retry).

Both the subscription controller and the in-memory controller are modifying the channel status. Maybe only one controller should do it.

grantr commented 4 years ago

IIUC the subscription controller shouldn't be modifying the channel status. It should only be modifying the spec.

lionelvillard commented 4 years ago

Ah yes, you are right :-) Same problem though, right?

grantr commented 4 years ago

Yes :)

vaikas commented 4 years ago

What I find extremely odd is that the spec is empty for subscribers. So even if UpdateStatus fails, it seems odd that no subscriptions are recorded despite the generation indicating that 16 updates to the spec have been made.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.