confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Source names with '.' (dot) character should be allowed #4955

Open Perdjesk opened 4 years ago

Perdjesk commented 4 years ago

Describe the bug

The following PR https://github.com/confluentinc/ksql/pull/3695/files#diff-734b7c162230742e416f79fdf85d146dR41-R48 (/cc @agavra) introduces source-name validation that does not match the set of allowed Kafka topic names. The "." character is missing from the ParserUtil implementation.

See Kafka topic name validation:

https://github.com/apache/kafka/blob/90bbeedf52f4b6a411e9630dd132583afa4cd428/clients/src/main/java/org/apache/kafka/common/internals/Topic.java#L74-L88
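To make the mismatch concrete, here is a minimal sketch that compares the two rules. The Kafka side mirrors the linked Topic.java (ASCII alphanumerics plus '.', '_' and '-', at most 249 characters, and neither "." nor ".."). The ksql side is an assumption inferred from the error message shown further down, not a copy of ParserUtil. `NameRules` and its method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

// Illustrative only: NameRules is a hypothetical helper, not part of ksqlDB or Kafka.
final class NameRules {
    // Kafka topic rule: ASCII alphanumerics, '.', '_' and '-'.
    private static final Pattern KAFKA_TOPIC = Pattern.compile("[a-zA-Z0-9._-]+");
    // Kafka also caps topic names at 249 characters.
    private static final int KAFKA_MAX_LENGTH = 249;
    // Assumption based on the ksql error message: alphanumerics, '_' or '-'.
    private static final Pattern KSQL_SOURCE = Pattern.compile("[a-zA-Z0-9_-]+");

    static boolean isLegalKafkaTopic(String name) {
        // "." and ".." are rejected even though '.' is a legal character.
        return !name.equals(".")
            && !name.equals("..")
            && name.length() <= KAFKA_MAX_LENGTH
            && KAFKA_TOPIC.matcher(name).matches();
    }

    static boolean isLegalKsqlSourceName(String name) {
        return KSQL_SOURCE.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Print how each rule treats a few sample names.
        for (String name : new String[] {"test.s", "test_s", "server.db.table"}) {
            System.out.printf("%-16s kafka=%b ksql=%b%n", name,
                isLegalKafkaTopic(name), isLegalKsqlSourceName(name));
        }
    }
}
```

Under these assumptions, a name such as `test.s` passes the Kafka check but fails the ksql one, which is the gap this issue describes.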

To Reproduce

ksql> CREATE TABLE "test.s" AS SELECT * FROM ta;

Expected behavior

ksql> CREATE TABLE "test.s" AS SELECT * FROM ta;

does not throw an exception and creates the table.

Actual behaviour

ksql> CREATE TABLE "test.s" AS SELECT * FROM ta;
Illegal argument at Line: 1, Col: 14. Source names may only contain alphanumeric values, '_' or '-'. Got: 'test.s'

Workaround

Use the WITH clause to specify a Kafka topic which contains dots. See: https://github.com/confluentinc/ksql/issues/4955#issuecomment-1155479201

Additional context

agavra commented 4 years ago

Hello @Perdjesk, thanks for reporting this! I'm afraid this might not be a simple change (see https://github.com/confluentinc/ksql/pull/4965#pullrequestreview-385955315). Is this blocking you on something or is it just a nice-to-have?

Perdjesk commented 4 years ago

@agavra I thought this limitation also affected the source topic specified via kafka_topic in the WITH clause, but apparently it does not. This can be switched from a bug to an enhancement then.

cledesma commented 3 years ago

This is a problem for those consuming change data capture from Debezium connectors. By default, the topic names take the format "serverName.databaseName.tableName". I'm looking for a workaround to specify topic names, as "." is not allowed in ksql (though it is allowed in the broker).

Perdjesk commented 3 years ago

@cledesma If you want to create ksqlDB tables, you can use the WITH clause [1], which should work with Kafka topics that have dots in their names.

If you are looking to remove the dots from your source topics in the first place, then that is more a question of how you create the topics than of ksqlDB itself. If you are using Debezium, you can manipulate the topic names using Debezium's built-in topic routing Kafka Connect SMT [2] or create your own.

[1] https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/create-table/
[2] https://debezium.io/documentation/reference/configuration/topic-routing.html#_topic_names

bbondarets commented 3 years ago

Hi there @Perdjesk, I've faced a similar issue. However, we use the Confluent Cloud version of the Debezium connector. Is there a way to do such manipulations (changing how the connector names the source topics from CDC) with the cloud version of the connector?

Thanks in advance, Bohdan.

Perdjesk commented 3 years ago

@bbondarets I have no knowledge of Confluent Cloud. You might want to reach out to Confluent and/or investigate which version of the Debezium connector is used and whether the documentation linked in my previous answer is applicable.

jhonatanTeixeira commented 2 years ago

This made ksqlDB completely useless for me; my entire company uses dots in topic names to separate topics into namespaces. Such a great tool, and I won't be able to use it because of a validation that shouldn't be there. It's a great tool anyway. By the way, I also need to work with Debezium, and now I will have to work around this with Kafka Streams.

CiroDiMarzo commented 2 years ago

+1 on this case. It is a deal breaker for many use cases since we are already very much invested in Kafka+Debezium.

agavra commented 2 years ago

Just to clarify for people watching this ticket, ksqlDB supports working with topics that have a . in the name - you just need to declare the source name differently. For example:

CREATE STREAM foo_bar WITH(kafka_topic='foo.bar', ...);

As for allowing . in the stream/table name itself, that's still a TODO item. I'd love to understand why that's blocking you (cc @jhonatanTeixeira and @CiroDiMarzo) to help us prioritize that.

CiroDiMarzo commented 2 years ago

@agavra, ignorance of this possibility was blocking me, that is.

isbee commented 1 year ago

@agavra This is problematic when using CREATE STREAM/TABLE AS SELECT, because SELECT does not support WITH(kafka_topic='foo.bar', ...).

So if I want to create a materialized stream/table view that supports both push and pull queries, I need to CREATE STREAM/TABLE foo_bar WITH(kafka_topic='foo.bar', ...) and then CREATE STREAM/TABLE AS SELECT FROM foo_bar.

As I understand it, a stream/table is just a reference, so creating two streams/tables doesn't seem to be a big problem. But it's inconvenient.