confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

SHOW TOPICS; not returning an accurate list of topics #2297

Open cedw93 opened 5 years ago

cedw93 commented 5 years ago

When running a SHOW TOPICS; command in ksql, I expect to see the actual list of currently defined Kafka topics.

If I start my ksql client and run SHOW TOPICS; I get a list of topics like this:

ksql> SHOW TOPICS;

 Kafka Topic            | Registered | Partitions | Partition Replicas | Consumers | ConsumerGroups
----------------------------------------------------------------------------------------------------
 CEDW_TOPIC             | false      | 8          | 4                  | 0         | 0
 EVENTS                 | false      | 20         | 4                  | 0         | 0
 eventsInvalid          | true       | 20         | 4                  | 0         | 0
 eventsValid            | true       | 20         | 4                  | 0         | 0
 resultsCompleteMessage | true       | 20         | 4                  | 0         | 0
 TEST_TOPIC             | false      | 20         | 4                  | 0         | 0
----------------------------------------------------------------------------------------------------

This matches the expected output, as I have the following topics in Kafka:

kafka-topics --zookeeper zookeeper:2181 --list
CEDW_TOPIC
EVENTS
TEST_TOPIC
__confluent.support.metrics
__consumer_offsets
_confluent-ksql-ksql_command_topic
eventsInvalid
eventsValid
resultsCompleteMessage

The problem comes when I create new topics. I created a new topic in Kafka called ksqlTest, which I expected to show up in the ksql output. But this is not the case:

Kafka topics:

kafka-topics --zookeeper zookeeper:2181 --list
CEDW_TOPIC
EVENTS
TEST_TOPIC
__confluent.support.metrics
__consumer_offsets
_confluent-ksql-ksql_command_topic
eventsInvalid
eventsValid
ksqlTest
resultsCompleteMessage

ksql output:

ksql> SHOW TOPICS;

 Kafka Topic            | Registered | Partitions | Partition Replicas | Consumers | ConsumerGroups
----------------------------------------------------------------------------------------------------
 CEDW_TOPIC             | false      | 8          | 4                  | 0         | 0
 EVENTS                 | false      | 20         | 4                  | 0         | 0
 eventsInvalid          | true       | 20         | 4                  | 0         | 0
 eventsValid            | true       | 20         | 4                  | 0         | 0
 resultsCompleteMessage | true       | 20         | 4                  | 0         | 0
 TEST_TOPIC             | false      | 20         | 4                  | 0         | 0
----------------------------------------------------------------------------------------------------

You can see the topic ksqlTest is missing from this output (the __ ones are missing too, but I expect that's intended).
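The comparison above can also be done mechanically. The following is a minimal Python sketch, not part of ksql: the function names `parse_show_topics` and `missing_from_ksql` are hypothetical, and the rule that topics starting with an underscore are hidden on purpose is an assumption drawn only from the output in this report.

```python
# Output of `kafka-topics --list` after creating ksqlTest (copied from the report).
KAFKA_LIST = """
CEDW_TOPIC
EVENTS
TEST_TOPIC
__confluent.support.metrics
__consumer_offsets
_confluent-ksql-ksql_command_topic
eventsInvalid
eventsValid
ksqlTest
resultsCompleteMessage
"""

# Output of SHOW TOPICS; at the same moment (copied from the report).
SHOW_TOPICS = """
 Kafka Topic            | Registered | Partitions | Partition Replicas | Consumers | ConsumerGroups
----------------------------------------------------------------------------------------------------
 CEDW_TOPIC             | false      | 8          | 4                  | 0         | 0
 EVENTS                 | false      | 20         | 4                  | 0         | 0
 eventsInvalid          | true       | 20         | 4                  | 0         | 0
 eventsValid            | true       | 20         | 4                  | 0         | 0
 resultsCompleteMessage | true       | 20         | 4                  | 0         | 0
 TEST_TOPIC             | false      | 20         | 4                  | 0         | 0
----------------------------------------------------------------------------------------------------
"""


def parse_show_topics(table_text):
    """Return the first column of every data row in a SHOW TOPICS; table."""
    names = []
    for line in table_text.splitlines():
        if "|" not in line:
            continue  # blank lines and the dashed separators have no columns
        first = line.split("|")[0].strip()
        if first and first != "Kafka Topic":  # skip the header row
            names.append(first)
    return names


def missing_from_ksql(kafka_list_text, show_topics_text):
    """Topics Kafka knows about that SHOW TOPICS; did not list."""
    kafka_topics = {t.strip() for t in kafka_list_text.splitlines() if t.strip()}
    # Assumption: topics starting with "_" are internal and hidden on purpose.
    visible = {t for t in kafka_topics if not t.startswith("_")}
    return sorted(visible - set(parse_show_topics(show_topics_text)))


print(missing_from_ksql(KAFKA_LIST, SHOW_TOPICS))  # prints ['ksqlTest']
```

Run against the two listings above, the only topic reported as missing is ksqlTest, which matches the manual comparison.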

This also applies if I create more topics: the SHOW TOPICS; output never seems to update to include the new topics. It doesn't seem to ever update from its initial list.

Is this a problem or is this intended behavior? Is there a way to ensure that the correct list of topics is always returned?

ksql properties:

INFO KsqlConfig values:
    ksql.extension.dir = ext
    ksql.output.topic.name.prefix =
    ksql.persistent.prefix = query_
    ksql.schema.registry.url = http://localhost:8081
    ksql.service.id = ksql
    ksql.sink.partitions = 4
    ksql.sink.replicas = 1
    ksql.sink.window.change.log.additional.retention = 1000000
    ksql.statestore.suffix = _ksql_statestore
    ksql.transient.prefix = transient_
    ksql.udf.collect.metrics = false
    ksql.udf.enable.security.manager = true
    ksql.udfs.enabled = true
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = https
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS

I also have SASL SCRAM configured with the JAAS config, which seems to be working, as it does manage to get the topic list correctly at least once.

apurvam commented 5 years ago

Thanks for reporting this issue, @cedw93. I looked at the code, and we certainly don't cache the topic list anywhere. This isn't expected behavior. Are you seeing this consistently, even after a very long period of time passes? What may be happening is that, since Kafka topic creation is not synchronous, your KSQL client may be hitting a broker which doesn't have the updated metadata yet and hence returns stale data. But the window where this happens should be very small.
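One way to tell a short stale-metadata window from a list that never updates is to poll until the topic shows up or a deadline passes. The sketch below is in Python and is only an illustration: `wait_for_topic` is a hypothetical helper, and `list_topics` stands for any caller-supplied function that returns the current topic names (for example, one that runs SHOW TOPICS; and parses the result). It does not use any ksql or Kafka API.

```python
import time


def wait_for_topic(list_topics, topic, timeout_s=30.0, interval_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll list_topics() until `topic` appears or timeout_s elapses.

    list_topics is supplied by the caller (hypothetical); sleep and clock
    are injectable so the helper can be exercised without real waiting.
    Returns True if the topic was seen, False if the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        if topic in list_topics():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated broker view: stale for the first two calls, updated afterwards.
    calls = {"n": 0}

    def fake_list_topics():
        calls["n"] += 1
        return ["EVENTS"] if calls["n"] < 3 else ["EVENTS", "ksqlTest"]

    found = wait_for_topic(fake_list_topics, "ksqlTest",
                           timeout_s=5, interval_s=0, sleep=lambda s: None)
    print(found, calls["n"])
```

If the helper returns False even with a generous timeout, the propagation-delay explanation is unlikely to be the whole story.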