confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
128 stars 1.04k forks source link

LIKE keyword doesn't work on strings containing newline character '\n' #6756

Open Jean-Jean-Grinbert opened 3 years ago

Jean-Jean-Grinbert commented 3 years ago

When a persistent query has a WHERE clause. And this WHERE clause uses the LIKE keyword for pattern matching on a string field. And the string field contains a newline character "\n". Then the LIKE pattern always returns false.

To Reproduce

  1. Use KSQLDB standalone version 0.9.0 or any subsequent version until current version 0.13.0.

  2. Create 3 Kafka topics: -InputTopic -AlertsTopic -EverythingElseTopic

  3. Run the following KSQLDB script:

CREATE STREAM InputTopic_STREAM (myfield VARCHAR) WITH (kafka_topic='InputTopic', value_format='JSON');

CREATE STREAM AlertsTopic_STREAM WITH (kafka_topic='AlertsTopic', value_format='JSON') AS SELECT * FROM InputTopic_STREAM WHERE myfield LIKE '%alert%' EMIT CHANGES;

CREATE STREAM EverythingElseTopic_STREAM WITH (kafka_topic='EverythingElseTopic', value_format='JSON') AS SELECT * FROM InputTopic_STREAM WHERE myfield NOT LIKE '%alert%' EMIT CHANGES;

  1. Send the following JSON message into topic InputTopic using the REST API (or any other way) { "myfield": "Some text. alert \n" }

Expected behavior The message should be streamed to AlertsTopic and NOT streamed to EverythingElseTopic.

Actual behaviour The message is streamed to EverythingElseTopic and NOT streamed to AlertsTopic .

Additional context If your remove the '\n' character then it works as expected. If you downgrade to ksqldb server version 0.8.0, then it works as expected.

andsty commented 1 year ago

do you have any update on this?

reneveyj commented 6 months ago

As a workaround, you can use something like this:

SELECT * FROM my_stream
WHERE REGEXP_REPLACE(my_string, '\\s*', '') LIKE '%foo%'
EMIT CHANGES;

It replaces all the whitespaces of the string before the LIKE. You may have to edit this if the value in your LIKE clause contains spaces though.