Is your feature request related to a problem? Please describe.
Suppose you have a stream of orders and need a "pending_orders" table that always contains all orders with the "pending" status. That means they are added when their status changes to "pending" (already possible) and removed when it changes to something else (not possible).
Describe the solution you'd like
Making producing tombstones the default behavior is not desirable, because the current behavior does have many legitimate use-cases.
Instead, it should probably be added to the query, e.g.:
CREATE TABLE ... AS SELECT ... EMIT TOMBSTONES
CREATE TABLE ... WITH (tombstones = true) AS SELECT ...
These queries would then produce tombstones for all records that are filtered out by some condition (JOIN/WHERE/HAVING), and optionally only for those with keys that are already in the result table.
Describe alternatives you've considered
Just filtering out records doesn't work, because it filters out the messages but doesn't produce tombstones for those that used to fulfill the filter condition. This means once they make it to the result table they stay there forever.
One simple workaround is to use a dedicated result column, so e.g. SELECT *, (status = "pending") AS result FROM .... The obvious problem here is that the result table always contains all records, regardless of whether they match the condition or not.
Additional context
1945 contains a proposed solution that supposedly does this already. I wasn't able to reproduce it with the current ksqlDB version though, so the behavior seems to have changed at some point.
Is your feature request related to a problem? Please describe. Suppose you have a stream of orders and need a "pending_orders" table that always contains all orders with the "pending" status. That means they are added when their status changes to "pending" (already possible) and removed when it changes to something else (not possible).
Describe the solution you'd like Making producing tombstones the default behavior is not desirable, because the current behavior does have many legitimate use-cases.
Instead, it should probably be added to the query, e.g.:
CREATE TABLE ... AS SELECT ... EMIT TOMBSTONES
CREATE TABLE ... WITH (tombstones = true) AS SELECT ...
These queries would then produce tombstones for all records that are filtered out by some condition (
JOIN
/WHERE
/HAVING
), and optionally only for those with keys that are already in the result table.Describe alternatives you've considered Just filtering out records doesn't work, because it filters out the messages but doesn't produce tombstones for those that used to fulfill the filter condition. This means once they make it to the result table they stay there forever.
One simple workaround is to use a dedicated
result
column, so e.g.SELECT *, (status = "pending") AS result FROM ...
. The obvious problem here is that the result table always contains all records, regardless of whether they match the condition or not.Additional context
1945 contains a proposed solution that supposedly does this already. I wasn't able to reproduce it with the current ksqlDB version though, so the behavior seems to have changed at some point.