Open ericdahl opened 3 months ago
@ericdahl Interesting, so far I do not have access to a MSK cluster, I will try and then I will develop this -I'm looking sponsorship for this kind of things (deploy cluster on MSK, EKS (strimzi), Confluent...)-. Also if you have access to one, would you like to make a PR?
Let me know how I can help please.
@sauljabin just saw this issue, because we needed it and started integrating it together with @jjoeris for ourselves. So if you can wait a few days, we might be able to provide a PR for you
@sauljabin just saw this issue, because we needed it and started integrating it together with @jjoeris for ourselves. So if you can wait a few days, we might be able to provide a PR for you
This is awesome, thanks!
Hi I have enabled PR on the pipeline https://github.com/sauljabin/kaskade/commit/8a0c9f4fb77c6e150c20657231f7297b0c0aa431
FYI @sauljabin it seems that this issue in the confluent-kafka-python library also affects us here: https://github.com/confluentinc/confluent-kafka-python/issues/1713
Their workaround is to include a poll and wait, so that the oauth callback completes. We will first proceed with the workaround, but mentioning the issue here, since we want to track it, and once solved upstream, we can remove the workaround.
Since we want to use kaskade in a docker image to provide users a simple method to spin up a useful client, we would continue implementing it with the proposed workaround and provide you with a PR for that, once it is complete
Thanks for your help!, let's keep track of the issue together 👍🏽
Hi @sauljabin, we tested several aproaches using poll and sleep in the code, but this approach seems to be not reliable with the confluent kafka libary. Sometimes you get data, sometimes not. Its not predictable. Also hardcoding the tokens in the code do not work (just for testing, if its a timing issue with the token generation). We might have missed something, as we are not so deep in your code, so what about remote debuging session with you for an hour or so?
But in general, I would recommend either
Hi @jjoeris thanks for looking at this,
I do not like too much https://github.com/dpkp/kafka-python or https://github.com/wbarnha/kafka-python-ng. I would prefer to use the official one, if confluent fixes the problem that would be great for the whole community.
I would prefer to wait for the fix or to help them on fixing it.
@jjoeris and @aleicher what do you both think?
@sauljabin agree. Changing the whole library in kaskade is not a small thing, and everything else works nicely, so changing the library for just one feature is not necessary. We might be able to help them fix it upstream, and upvote the issue there, to get it solved.
We got kaskade to connect to MSK using IAM authentication, also received messages using the consumer. Admin we did not manage to get up and running at all, and even the Consumer was not very reliable. If you think that you might have an idea of something we did wrong, this is our WIP, which at least consumes "sometimes": https://github.com/kotaicode/kaskade/commit/4e9c9dbf4e0f8de108ff81e5570626660a73e757
So if you think, having some shared session (screenshare etc.) feel free to ping and we timebox it. General feeling is: we wait for oauth callbacks to be properly implemented upstream in the library, then revisit it.
Until then, the best thing to be done when connecting to AWS MSK would be to also add username/password based auth to the AWS MSK instance
hi @aleicher have you tried with seek and assign? https://www.conduktor.io/kafka/java-consumer-seek-and-assign/
I'm using subscribe, but maybe we do not need it (indeed could be better, we do not need to wait until the rebalance is done), the only extra value that we need for this (I guess) is the low water mark, we can obtain it with get_watermark_offsets.
I'm not sure if this is going to work but it is an option, let me know.
hi @sauljabin,
i tested the seek and assign approach, but it did not help. Same error as before. So lets wait for a fix
Agreed, thanks
Is your feature request related to a problem? Please describe.
With AWS MSK, we use the IAM Authentication mechanism (documented here). It would be great it we could leverage this directly - rather than needing to add additional authentication mechanisms
Describe the solution you'd like simple config options to enable this, enabling
sasl_mechanism='OAUTHBEARER'
and the minimal required parameters (AWS region). It should search for AWS credentials automatically via the AWS SDK (so there may be no need to have any parameters around this, if assuming standard conventions)