knative / eventing

Event-driven application platform for Kubernetes
https://knative.dev/docs/eventing
Apache License 2.0

Introduce option for Istio support #6596

Closed apo-ger closed 4 months ago

apo-ger commented 1 year ago

Problem

Currently, @kimwnasptd and I are trying to set up Knative Eventing with strict mTLS as part of Kubeflow. The main issue we bumped into is that someone needs to manually create DestinationRules/VirtualServices; see:

  • https://github.com/knative/eventing/pull/6283
  • https://github.com/istio/istio/issues/13193#issuecomment-482995169
  • https://github.com/istio/istio/issues/24886#issuecomment-648218564

It would help adopters of Knative Eventing that require strict mTLS if the Eventing Controller had an option to create the required Istio resources.

Persona: Event Producers

Without strict mTLS we can't have any AuthorizationPolicies to control who can talk to the broker-ingress and broker-filter (https://github.com/knative/eventing/issues/6175).

Thus, in a multi-user environment like Kubeflow, everyone would be able to create events for all user namespaces.
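
For illustration, here is a minimal sketch of the kind of AuthorizationPolicy that strict mTLS would make possible. The namespace and the selector label for the broker-ingress deployment are assumptions on our part, not part of any existing API:

```yaml
# Illustrative sketch only: allow POSTs to broker-ingress from a single user
# namespace. The selector label is assumed to match the mt-broker-ingress
# deployment; adjust it to whatever labels your installation actually uses.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-events-from-user-namespace
  namespace: knative-eventing
spec:
  selector:
    matchLabels:
      eventing.knative.dev/brokerRole: ingress
  action: ALLOW
  rules:
  - from:
    - source:
        namespaces: ["kubeflow-user-example-com"]
    to:
    - operation:
        methods: ["POST"]
```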

Additional context (optional)

We understand that Knative Eventing no longer has a dependency on Istio (https://github.com/knative/eventing/issues/294).

But this means that the logic of creating the resources necessary for Knative Eventing to work with mTLS falls to end users. We believe the Eventing Controller should:

  1. Have an option for toggling Istio support, which will be off by default
  2. If the option is on, then:
    • It's the Eventing Controller's job to ensure the resources created for a Broker CR can work with mTLS
    • The reconciliation loop will create the required DestinationRule or VirtualService (see the sketch below)
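
For illustration, this is roughly the kind of DestinationRule the reconciler would need to create so that traffic to a channel's ExternalName target is sent with Istio mTLS. The host and the TLS mode are assumptions based on our setup, not a finalized design:

```yaml
# Sketch of a DestinationRule the Eventing Controller could reconcile for a
# Broker/Channel. The host below is the ExternalName target used by the
# KafkaChannel in our setup; treat both it and the TLS mode as examples.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: default-kne-trigger-kn-channel
  namespace: kubeflow-user-example-com
spec:
  host: kafka-channel-ingress.knative-eventing.svc.cluster.local
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
```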

This way we'll avoid duplication of effort across adopters of Knative Eventing, each of whom would otherwise need to rewrite this logic.

We would like to help in this effort, if you agree with our proposal.

pierDipi commented 1 year ago

Hi @apo-ger, thanks for opening this issue. I'm happy to collaborate with you on this. It requires a feature track doc; see the mechanics in https://github.com/knative/community/blob/main/mechanics/FEATURE-TRACKS.md.

apo-ger commented 1 year ago

Hello @pierDipi, thank you for offering to help! Here is our first iteration on the feature track doc: https://docs.google.com/document/d/1Y9BukjzUxfl920KltVtH4L5gta0pHOt5NEsLqdgZNY4/edit?pli=1#heading=h.pl50sgewlr05

Feel free to take a look and let us know your thoughts regarding the proposed approach.

pierDipi commented 1 year ago

Thanks @apo-ger for creating the document, I left some comments there.

apo-ger commented 1 year ago

Hello @pierDipi! Thank you for the comments. I've included our latest findings here and in the feature doc.

We have validated that all three currently available KafkaChannel implementations suffer from the same issue: they do not work with Istio in strict mTLS mode, as they use an ExternalName Service underneath to route events.

Knative Eventing KafkaChannel: https://github.com/knative-sandbox/eventing-kafka-broker//control-plane/pkg/reconciler/channel/channel.go#L627-L657

Consolidated KafkaChannel: https://github.com/knative-sandbox/eventing-kafka/pkg/channel/consolidated/reconciler/controller/kafkachannel.go#L495-L538

Distributed KafkaChannel: https://github.com/knative-sandbox/eventing-kafka/pkg/channel/distributed/controller/kafkachannel/channel.go#L56-L159

First issue

We have seen the following path for delivering an event when using an MT channel-based broker with KafkaChannels underneath:

client/producer -> broker-ingress -> kafka-ch-receiver -> Kafka -> kafka-ch-dispatcher -> broker-filter -> subscriber/consumer

The communication between broker-ingress and kafka-ch-receiver breaks when strict mTLS is enabled, as the broker-ingress is trying to reach the receiver via an ExternalName Service.

Deployed resources:

`Kafka ChannelTemplate ConfigMap`

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-channel
  namespace: knative-eventing
data:
  channel-template-spec: |
    apiVersion: messaging.knative.dev/v1beta1
    kind: KafkaChannel
    spec:
      numPartitions: 3
      replicationFactor: 1
```

`MT channel-based Broker`

```yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  annotations:
    eventing.knative.dev/broker.class: MTChannelBasedBroker
    eventing.knative.dev/creator: kubernetes-admin
    eventing.knative.dev/lastModifier: kubernetes-admin
  creationTimestamp: "2022-12-22T15:54:33Z"
  generation: 1
  name: default
  namespace: kubeflow-user-example-com
  resourceVersion: "7417993"
  uid: 3b554be4-7bbd-4345-9ebb-0e7073b361e8
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: kafka-channel
    namespace: knative-eventing
status:
  address:
    url: http://broker-ingress.knative-eventing.svc.cluster.local/kubeflow-user-example-com/default
  annotations:
    knative.dev/channelAPIVersion: messaging.knative.dev/v1beta1
    knative.dev/channelAddress: http://default-kne-trigger-kn-channel.kubeflow-user-example-com.svc.cluster.local
    knative.dev/channelKind: KafkaChannel
    knative.dev/channelName: default-kne-trigger
  conditions:
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: Addressable
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    message: No dead letter sink is configured.
    reason: DeadLetterSinkNotConfigured
    severity: Info
    status: "True"
    type: DeadLetterSinkResolved
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: FilterReady
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: IngressReady
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: TriggerChannelReady
  observedGeneration: 1
```

`KafkaChannel`

```yaml
apiVersion: messaging.knative.dev/v1beta1
kind: KafkaChannel
metadata:
  annotations:
    eventing.knative.dev/scope: cluster
    messaging.knative.dev/subscribable: v1
  creationTimestamp: "2022-12-22T15:54:33Z"
  finalizers:
  - kafkachannels.messaging.knative.dev
  generation: 2
  labels:
    eventing.knative.dev/broker: default
    eventing.knative.dev/brokerEverything: "true"
  name: default-kne-trigger
  namespace: kubeflow-user-example-com
  ownerReferences:
  - apiVersion: eventing.knative.dev/v1
    blockOwnerDeletion: true
    controller: true
    kind: Broker
    name: default
    uid: 3b554be4-7bbd-4345-9ebb-0e7073b361e8
  resourceVersion: "7418018"
  uid: 4b8ab66a-2815-44de-96b8-4ad4c65ca101
spec:
  numPartitions: 3
  replicationFactor: 1
  retentionDuration: PT168H
  subscribers:
  - generation: 1
    replyUri: http://broker-ingress.knative-eventing.svc.cluster.local/kubeflow-user-example-com/default
    subscriberUri: http://broker-filter.knative-eventing.svc.cluster.local/triggers/kubeflow-user-example-com/message-dumper-trigger/3ef7d7cc-681c-4676-a5f9-292e8b25599b
    uid: 07931388-bca1-4a94-8d29-0b8103a65380
status:
  address:
    url: http://default-kne-trigger-kn-channel.kubeflow-user-example-com.svc.cluster.local
  conditions:
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: Addressable
  - lastTransitionTime: "2022-12-22T15:54:33Z"
    reason: Config map knative-eventing/kafka-channel-channels-subscriptions updated
    status: "True"
    type: ConfigMapUpdated
  - lastTransitionTime: "2022-12-22T15:54:33Z"
    status: "True"
    type: ConfigParsed
  - lastTransitionTime: "2022-12-22T15:54:33Z"
    status: "True"
    type: DataPlaneAvailable
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: ProbeSucceeded
  - lastTransitionTime: "2022-12-22T15:54:34Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2022-12-22T15:54:33Z"
    reason: Topic knative-messaging-kafka.kubeflow-user-example-com.default-kne-trigger created
    status: "True"
    type: TopicReady
  observedGeneration: 2
  subscribers:
  - observedGeneration: 1
    ready: "True"
    uid: 07931388-bca1-4a94-8d29-0b8103a65380
```

`ExternalName Service`

```yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-12-22T15:54:33Z"
  labels:
    messaging.knative.dev/role: kafka-channel
  name: default-kne-trigger-kn-channel
  namespace: kubeflow-user-example-com
  ownerReferences:
  - apiVersion: messaging.knative.dev/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: KafkaChannel
    name: default-kne-trigger
    uid: 4b8ab66a-2815-44de-96b8-4ad4c65ca101
  resourceVersion: "7417973"
  uid: c7f2bea6-e688-4cd1-bf87-c03a23404c56
spec:
  externalName: kafka-channel-ingress.knative-eventing.svc.cluster.local
  sessionAffinity: None
  type: ExternalName
status:
  loadBalancer: {}
```

Error produced when trying to hit the broker from a curl Pod with an Istio sidecar:

root@curl:/ ]$ curl -X POST -v -H "content-type: application/json" -H "ce-specversion: 1.0" \
 -H "ce-source: my/curl/command" -H "ce-type: my.demo.event" -H "ce-id: 0815" \
 -d '{"value":"Hello Knative"}' http://broker-ingress.knative-eventing.svc.cluster.local/kubeflow-user-example-com/default

> POST /kubeflow-user-example-com/default HTTP/1.1
> User-Agent: curl/7.35.0
> Host: broker-ingress.knative-eventing.svc.cluster.local
> Accept: */*
> content-type: application/json
> ce-specversion: 1.0
> ce-source: my/curl/command
> ce-type: my.demo.event
> ce-id: 0815
> Content-Length: 25
> 
< HTTP/1.1 503 Service Unavailable
< allow: POST, OPTIONS
< date: Mon, 12 Dec 2022 17:29:17 GMT
< content-length: 0
< x-envoy-upstream-service-time: 31
< server: envoy

Second issue

In addition, we found that ExternalName Services need to have port mappings configured in order to work properly with Istio and strict mTLS. See related issues/PRs:

This PR resolved the issue for the InMemoryChannel implementation, but we need to do the same for the underlying ExternalName Services of the KafkaChannel implementations.
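
For reference, here is a sketch of the KafkaChannel's ExternalName Service from above with an explicit, named port added. The port name and number are illustrative and would need to match what kafka-channel-ingress actually listens on:

```yaml
# Same ExternalName Service as above, with a named port so Istio can select
# the protocol and route the traffic under strict mTLS. The port name and
# number are illustrative, not the exact values used by any implementation.
apiVersion: v1
kind: Service
metadata:
  name: default-kne-trigger-kn-channel
  namespace: kubeflow-user-example-com
  labels:
    messaging.knative.dev/role: kafka-channel
spec:
  type: ExternalName
  externalName: kafka-channel-ingress.knative-eventing.svc.cluster.local
  sessionAffinity: None
  ports:
  - name: http
    port: 80
    protocol: TCP
```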

Solution Proposal

  1. All underlying ExternalName Services must have port mappings configured. We could extend the existing core channel reconcilers to add the appropriate port mappings.
  2. Create an external component/reconciler/repository that will add the Istio support
    • Once a new Channel object is created, this new reconciler will reconcile the required Istio resources (i.e., a DestinationRule)
  3. Introduce a global MESH_MODE setting (off by default) to control whether we operate in a mesh or not. This setting can be enabled/disabled via a ConfigMap (a hypothetical sketch follows below).
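
A minimal sketch of what such a ConfigMap-driven toggle could look like; the ConfigMap name `config-istio` and the key `mesh-mode` are hypothetical placeholders to illustrate the proposal, not an existing Knative Eventing API:

```yaml
# Hypothetical toggle: both the ConfigMap name and the key are placeholders
# used to illustrate the proposal, not an existing Knative Eventing API.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-istio
  namespace: knative-eventing
data:
  mesh-mode: "enabled"  # would default to "disabled"
```
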
github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

pierDipi commented 1 year ago

/remove-lifecycle stale

We're running e2e tests for non-channel components with Istio in https://github.com/knative-sandbox/eventing-istio. We're missing some updates on the feature track based on the comments and discussions, and then we can start implementing it.

pierDipi commented 1 year ago

/triage accepted

pierDipi commented 1 year ago

I did an initial iteration for the implementation of this feature in https://github.com/knative-sandbox/eventing-istio/pull/16

pierDipi commented 1 year ago

@apo-ger I don't have access to the document anymore, can you please make it available for public comment again?

These feature track documents are useful so we don't lose our decision context.

kimwnasptd commented 1 year ago

Hey @pierDipi, unfortunately @apo-ger and I no longer work at Arrikto. Most probably they have completely disabled/removed the initial Google Drive we had used for this document.

Thankfully, I have a recent copy of that document in my personal drive: https://docs.google.com/document/d/1SFcFWblIfQE-KV05EKvI16E8goBOVs4SMqZ4YqIZs3A/edit?usp=sharing

The copy above is missing the comment history, but at least we still have the context.

pierDipi commented 1 year ago

Thanks @kimwnasptd, I made a copy for the Knative drive workspace https://docs.google.com/document/d/1p4OdOUFaWw7g4xQKtJIR6ml8bBOqKJxw700xc2pX4Co/edit

pierDipi commented 4 months ago

Documented here: https://knative.dev/development/eventing/features/istio-integration/