confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Issue with clustering using Interactive KsqlDB deployment #9672

Open gmcruz opened 1 year ago

gmcruz commented 1 year ago

I have a problem with clustering ksqlDB. More precisely, the problem is not with clustering itself but with 2 of the 3 access modes in an interactive ksqlDB deployment when the k8s deployment is scaled above 1 replica. I understand that the REST API is still experimental (I think), but the Java client should be fully supported, right?

I also verified this by standing up a full K8s cluster on a completely separate cloud provider and going over the same steps. The same issue can be recreated there.

The setup is an aggregate from a series of streams into a table (MESSAGE_COUNT), queried with: SELECT COUNT FROM MESSAGE_COUNT WHERE USERID = [UUID]

kubectl scale deployment.apps/ksqldb-deployment --replicas=1

1) KsqlDB CLI - No problem here. SELECT COUNT FROM MESSAGE_COUNT...
2) KsqlDB Java Client - No problem here. Running a select connecting to the K8s ksqldb service.
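For anyone who wants to reproduce this over the REST API as well, here is a minimal sketch of the same pull query sent to the server's `/query` endpoint. The service URL `http://ksqldb-service:8088` is a hypothetical name, and quoting the UUID as a string assumes USERID is a string key; neither detail comes from the report above.

```python
import json
import urllib.request


def build_pull_query(server_url: str, user_id: str) -> urllib.request.Request:
    """Build (but do not send) a pull query request for the MESSAGE_COUNT table."""
    # Assumption: USERID is a string column, so the UUID is single-quoted.
    # A real client should validate user_id rather than interpolate it blindly.
    statement = f"SELECT COUNT FROM MESSAGE_COUNT WHERE USERID = '{user_id}';"
    body = json.dumps({"ksql": statement, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


def run_pull_query(server_url: str, user_id: str, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_pull_query(server_url, user_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical usage against the K8s service (needs a reachable server):
# print(run_pull_query("http://ksqldb-service:8088", "some-uuid"))
```

Running this repeatedly with 1 replica and then with 2 replicas would show whether the failure also appears over plain HTTP, independent of the Java client.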

kubectl scale deployment.apps/ksqldb-deployment --replicas=2

I check here that the correct topics have the correct consumers added and this new POD has joined the cluster correctly.

1) KsqlDB CLI

kubectl scale deployment.apps/ksqldb-deployment --replicas=1

Once I scale back to 1 within a few seconds all goes back to working as above with replica 1. There is no doubt that something is going on here. I have done this experiment multiple times, recreating the issue every time.

baliberdin commented 1 year ago

@gmcruz I saw a similar problem and figured it was a misconfiguration of ksql.advertised.listener. When ksqlDB runs as a cluster, each node needs to expose an address that the other nodes can reach. I don't know if this is it, but I think it's worth a try.
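To make that suggestion concrete, here is a rough sketch of a sanity check on the value each pod would advertise. It assumes the default ksqlDB port 8088 and that each pod advertises its own pod IP; the helper names are made up for illustration and are not part of ksqlDB.

```python
import ipaddress
from urllib.parse import urlparse


def advertised_listener(pod_ip: str, port: int = 8088, scheme: str = "http") -> str:
    """Compose a per-pod value for ksql.advertised.listener (assumed layout)."""
    return f"{scheme}://{pod_ip}:{port}"


def looks_reachable(listener: str) -> bool:
    """Reject values that other pods can never reach, such as wildcard or loopback."""
    host = urlparse(listener).hostname
    if not host or host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: this check cannot judge it, so let it through.
        return True
    return not (address.is_unspecified or address.is_loopback)


# Hypothetical pod IPs for a two-replica deployment.
for pod_ip in ("10.1.2.3", "10.1.2.4"):
    value = advertised_listener(pod_ip)
    print(value, looks_reachable(value))
```

This only catches obviously wrong values such as `http://0.0.0.0:8088`. Whether the address is actually reachable from the other pods still has to be checked inside the cluster.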

CrazedCoderNate commented 1 year ago

Having the same issue. Any chance of an update on this? After manually scaling to 5 pods running KSQL, it seems to throw errors on simultaneous queries to the same KSQL table.

gmcruz commented 1 year ago

> Having the same issue. Any chance of an update on this? After manually scaling to 5 pods running KSQL, it seems to throw errors on simultaneous queries to the same KSQL table.

Truth is, I gave up trying to figure it out. I spent SO much time on it. Instead, I dropped ksqlDB in favor of a Kafka Streams app, based on this tutorial: medium.com/bakdata/queryable-kafka-topics-with-kafka-streams-8d2cca9de33f

Sorry, I know it's not what you want to hear, but for me it was worth the change. Hope it helps.