knative-extensions / eventing-kafka-broker

Alternate Kafka Broker implementation.
Apache License 2.0

Externally managed topic on Azure Event Hub cannot be consumed by the Kafka Broker #2692

Closed marcjimz closed 1 year ago

marcjimz commented 2 years ago

I'm running an Azure Event Hub with scoped credentials (read access), but my Broker fails while setting up the Kafka admin client. I am able to consume events using custom Python code and other frameworks (Argo EventSources), but the Kafka Broker class cannot get access with the same credentials.

I am hitting this error: https://github.com/knative-sandbox/eventing-kafka-broker/blob/6c7d57ef751c6e6502a38759ea8cb8f274623506/control-plane/pkg/reconciler/broker/broker.go#L277
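
That line is where the control plane creates a Sarama cluster admin. Roughly, the step that fails amounts to this (a sketch with illustrative bootstrap/SASL values matching my config below, not the actual reconciler code):

package main

import (
    "log"

    "github.com/Shopify/sarama"
)

func main() {
    conf := sarama.NewConfig()
    conf.Version = sarama.V2_0_0_0 // Event Hubs' Kafka endpoint needs a reasonably recent protocol version
    conf.Net.TLS.Enable = true     // SASL_SSL
    conf.Net.SASL.Enable = true
    conf.Net.SASL.Mechanism = sarama.SASLTypePlaintext // PLAIN
    conf.Net.SASL.User = "$ConnectionString"
    conf.Net.SASL.Password = "Endpoint=sb://<INSTANCE>.servicebus.windows.net/;SharedAccessKeyName=ak-01;SharedAccessKey=<KEY>;EntityPath=<EP>"

    // This is the call that surfaces as "cannot obtain Kafka cluster admin" when it fails.
    admin, err := sarama.NewClusterAdmin([]string{"<INSTANCE>.servicebus.windows.net:9093"}, conf)
    if err != nil {
        log.Fatalf("cannot obtain Kafka cluster admin: %v", err)
    }
    defer admin.Close()
}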

I have set up my ConfigMap:

apiVersion: v1
data:
  auth.secret.ref.name: knative-eventing-poc-main-secret
  bootstrap.servers: <INSTANCE>.servicebus.windows.net:9093
  default.topic.partitions: "4"
  default.topic.replication.factor: "3"
kind: ConfigMap
metadata:
  name: knative-eventing-poc-main-config

with secret (decoded):

password: Endpoint=sb://<INSTANCE>.servicebus.windows.net/;SharedAccessKeyName=ak-01;SharedAccessKey=<KEY>;EntityPath=<EP>
protocol: SASL_SSL 
sasl.mechanism: PLAIN 
user: $ConnectionString
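
For context, the Broker itself is created roughly like this (a sketch: names and namespace are placeholders, and the externally managed topic is wired in through the external.topic annotation, if I'm reading the docs right):

apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: poc-broker
  namespace: knative-eventing-poc
  annotations:
    eventing.knative.dev/broker.class: Kafka
    # externally managed topic (the Event Hub named by EntityPath above)
    kafka.eventing.knative.dev/external.topic: <EP>
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: knative-eventing-poc-main-config
    namespace: knative-eventing-poc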

I'm consistently reproducing this error:

'Warning' reason: 'InternalError' failed to get contract configuration: cannot obtain Kafka cluster admin, kafka: client has run out of available brokers to talk to: dial tcp 52.168.117.34:9093: i/o timeout

I've been able to confirm I can reach the topics fine from the same cluster, same nodes, etc., so networking is ruled out.

marcjimz commented 2 years ago

I presume the solution here is to simply validate that the topic exists, so any client should suffice (not necessarily the admin client).
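
Something along these lines with a plain (non-admin) client would be enough for that check (just a Sarama sketch; the bootstrap address and names are placeholders):

package check

import "github.com/Shopify/sarama"

// topicExists looks the topic up with a regular client instead of a ClusterAdmin.
func topicExists(conf *sarama.Config, topic string) (bool, error) {
    client, err := sarama.NewClient([]string{"<INSTANCE>.servicebus.windows.net:9093"}, conf)
    if err != nil {
        return false, err
    }
    defer client.Close()

    topics, err := client.Topics()
    if err != nil {
        return false, err
    }
    for _, t := range topics {
        if t == topic {
            return true, nil
        }
    }
    return false, nil
}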

pierDipi commented 2 years ago

Hi @marcjimz, thanks for reporting!

I'm running an Azure Event Hub with scoped credentials (can read)

We need to describe the topics as well.

I presume the solution here is to simply validate that the topic exists, so any client can suffice (and not necessarily the admin client).

That's precisely what we do, but the error is a network connection error: dial tcp 52.168.117.34:9093: i/o timeout, and

I've been able to confirm I can reach the topics fine on the same cluster, same nodes, etc. so networking is ruled out.

makes it even more strange.

I can definitely use external topics with an external Kafka cluster without problems but I don't use Azure Event Hub.
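
For reference, the check on our side amounts to describing the topic with the cluster admin, roughly like this (a sketch, not the exact reconciler code):

package check

import "github.com/Shopify/sarama"

// describeTopic verifies the topic is visible and describable with the given credentials.
func describeTopic(admin sarama.ClusterAdmin, topic string) error {
    metas, err := admin.DescribeTopics([]string{topic})
    if err != nil {
        return err
    }
    for _, m := range metas {
        if m.Err != sarama.ErrNoError {
            return m.Err
        }
    }
    return nil
}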

marcjimz commented 2 years ago

Yeah, I don't think it's that. I think it's specific to the Kafka client and what it is trying to do with Azure Event Hub (AEH).

nc -vz 52.168.117.34 9093
Connection to 52.168.117.34 9093 port [tcp/*] succeeded!
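
A metadata request with the same SASL credentials (for example via kcat) separates plain TCP reachability from an auth/metadata problem; something like this (a sketch, not what the broker itself runs):

kcat -L -t <EP> \
  -b <INSTANCE>.servicebus.windows.net:9093 \
  -X security.protocol=SASL_SSL \
  -X sasl.mechanisms=PLAIN \
  -X sasl.username='$ConnectionString' \
  -X sasl.password='Endpoint=sb://<INSTANCE>.servicebus.windows.net/;SharedAccessKeyName=ak-01;SharedAccessKey=<KEY>;EntityPath=<EP>'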

marcjimz commented 2 years ago

I was able to create a new Azure Event Hub to test and it works fine, but only with the root key, so this is an issue with access rights and how the broker requests access. When I use the root managed key, the connection works fine, but as soon as I use the access key scoped to a specific Kafka topic, the client begins to fail. The scoped key only has access to a specific topic and specific rights, chosen from Manage, Send, and Listen; in this instance we only have Listen rights.

The scope of the access key here is strictly listening to events. Is there a way to configure that with a Broker, or is this beyond the scope of the Broker object?
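
For reference, the scoped rule was created along these lines (an Azure CLI sketch; the resource group and names are placeholders):

az eventhubs eventhub authorization-rule create \
  --resource-group <RESOURCE_GROUP> \
  --namespace-name <INSTANCE> \
  --eventhub-name <EP> \
  --name ak-01 \
  --rights Listen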

pierDipi commented 2 years ago

The scope of the access key here is strictly listening to events. Is there a way to configure that with a Broker, or is this beyond the scope of the Broker object?

What exactly do you want to configure?

As I wrote earlier:

We need to describe the topics as well.

I'm not familiar with Event Hub, so I don't know how auth is configured there, but it looks like a missing-permission problem to me. We don't need admin credentials; I've tried with real Kafka clusters and scoped ACLs, and it worked fine.
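
On a plain Kafka cluster, ACLs along these lines were enough (Describe/Read on the topic for the control plane, Read on the consumer group for the data plane); principal, topic, and group names here are placeholders:

kafka-acls.sh --bootstrap-server my-cluster:9092 --add \
  --allow-principal User:knative-broker \
  --operation Describe --operation Read \
  --topic my-external-topic

kafka-acls.sh --bootstrap-server my-cluster:9092 --add \
  --allow-principal User:knative-broker \
  --operation Read \
  --group my-consumer-group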

marcjimz commented 2 years ago

The scoped config has full access to a single Kafka topic (a single Event Hub), and it fails authentication.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.