confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
113 stars 1.04k forks source link

Blocker: compatibility break in joins #6406

Closed big-andy-coates closed 3 years ago

big-andy-coates commented 4 years ago

Description

AK commit apache/kafka#9156 introduced a compatibility breaking change: it's causing table source nodes to be materialized into a state store, even through the later mapValues node is already materialized. This MUST be fixed before we release off of master.

The following historic tests have been disabled:

ksqldb-functional-tests/src/test/resources/historicalplans/joins-_table_table_join_with_where_clause/6.0.0_1594233291874 ksqldb-functional-tests/src/test/resources/historicalplans/joins-_table_table_join_with_where_clause/6.1.0_1594164287151

The topology has changed due to the Apache Kafka change:

old topology

Topologies:
   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000001 (topics: [left_topic])
      --> KTABLE-SOURCE-0000000002
    Source: KSTREAM-SOURCE-0000000007 (topics: [right_topic])
      --> KTABLE-SOURCE-0000000008
    Processor: KTABLE-SOURCE-0000000002 (stores: [])
      --> KTABLE-MAPVALUES-0000000003
      <-- KSTREAM-SOURCE-0000000001
    Processor: KTABLE-SOURCE-0000000008 (stores: [])
      --> KTABLE-MAPVALUES-0000000009
      <-- KSTREAM-SOURCE-0000000007
    Processor: KTABLE-MAPVALUES-0000000003 (stores: [KafkaTopic_Left-Reduce])
      --> KTABLE-TRANSFORMVALUES-0000000004
      <-- KTABLE-SOURCE-0000000002
    Processor: KTABLE-MAPVALUES-0000000009 (stores: [KafkaTopic_Right-Reduce])
      --> KTABLE-TRANSFORMVALUES-0000000010
      <-- KTABLE-SOURCE-0000000008
    Processor: KTABLE-TRANSFORMVALUES-0000000004 (stores: [])
      --> PrependAliasLeft
      <-- KTABLE-MAPVALUES-0000000003
    Processor: KTABLE-TRANSFORMVALUES-0000000010 (stores: [])
      --> PrependAliasRight
      <-- KTABLE-MAPVALUES-0000000009
    Processor: PrependAliasLeft (stores: [])
      --> KTABLE-JOINTHIS-0000000013
      <-- KTABLE-TRANSFORMVALUES-0000000004
    Processor: PrependAliasRight (stores: [])
      --> KTABLE-JOINOTHER-0000000014
      <-- KTABLE-TRANSFORMVALUES-0000000010
    Processor: KTABLE-JOINOTHER-0000000014 (stores: [KafkaTopic_Left-Reduce])
      --> KTABLE-MERGE-0000000012
      <-- PrependAliasRight
    Processor: KTABLE-JOINTHIS-0000000013 (stores: [KafkaTopic_Right-Reduce])
      --> KTABLE-MERGE-0000000012
      <-- PrependAliasLeft
    Processor: KTABLE-MERGE-0000000012 (stores: [])
      --> WhereFilter-ApplyPredicate
      <-- KTABLE-JOINTHIS-0000000013, KTABLE-JOINOTHER-0000000014
    Processor: WhereFilter-ApplyPredicate (stores: [])
      --> WhereFilter-Filter
      <-- KTABLE-MERGE-0000000012
    Processor: WhereFilter-Filter (stores: [])
      --> WhereFilter-PostProcess
      <-- WhereFilter-ApplyPredicate
    Processor: WhereFilter-PostProcess (stores: [])
      --> Project
      <-- WhereFilter-Filter
    Processor: Project (stores: [])
      --> KTABLE-TOSTREAM-0000000019
      <-- WhereFilter-PostProcess
    Processor: KTABLE-TOSTREAM-0000000019 (stores: [])
      --> KSTREAM-SINK-0000000020
      <-- Project
    Sink: KSTREAM-SINK-0000000020 (topic: OUTPUT)
      <-- KTABLE-TOSTREAM-0000000019

new topology:

Note how the KSTREAM-SOURCE nodes have state stores now.

Topologies:
Sub-topology: 0
Source: KSTREAM-SOURCE-0000000001 (topics: [left_topic])
--> KTABLE-SOURCE-0000000002
Source: KSTREAM-SOURCE-0000000007 (topics: [right_topic])
--> KTABLE-SOURCE-0000000008
Processor: KTABLE-SOURCE-0000000002 (stores: [left_topic-STATE-STORE-0000000000])
--> KTABLE-MAPVALUES-0000000003
<-- KSTREAM-SOURCE-0000000001
Processor: KTABLE-SOURCE-0000000008 (stores: [right_topic-STATE-STORE-0000000006])
--> KTABLE-MAPVALUES-0000000009
<-- KSTREAM-SOURCE-0000000007
Processor: KTABLE-MAPVALUES-0000000003 (stores: [KafkaTopic_Left-Reduce])
--> KTABLE-TRANSFORMVALUES-0000000004
<-- KTABLE-SOURCE-0000000002
Processor: KTABLE-MAPVALUES-0000000009 (stores: [KafkaTopic_Right-Reduce])
--> KTABLE-TRANSFORMVALUES-0000000010
<-- KTABLE-SOURCE-0000000008
Processor: KTABLE-TRANSFORMVALUES-0000000004 (stores: [])
--> PrependAliasLeft
<-- KTABLE-MAPVALUES-0000000003
Processor: KTABLE-TRANSFORMVALUES-0000000010 (stores: [])
--> PrependAliasRight
<-- KTABLE-MAPVALUES-0000000009
Processor: PrependAliasLeft (stores: [])
--> KTABLE-JOINTHIS-0000000013
<-- KTABLE-TRANSFORMVALUES-0000000004
Processor: PrependAliasRight (stores: [])
--> KTABLE-JOINOTHER-0000000014
<-- KTABLE-TRANSFORMVALUES-0000000010
Processor: KTABLE-JOINOTHER-0000000014 (stores: [KafkaTopic_Left-Reduce])
--> KTABLE-MERGE-0000000012
<-- PrependAliasRight
Processor: KTABLE-JOINTHIS-0000000013 (stores: [KafkaTopic_Right-Reduce])
--> KTABLE-MERGE-0000000012
<-- PrependAliasLeft
Processor: KTABLE-MERGE-0000000012 (stores: [])
--> WhereFilter-ApplyPredicate
<-- KTABLE-JOINTHIS-0000000013, KTABLE-JOINOTHER-0000000014
Processor: WhereFilter-ApplyPredicate (stores: [])
--> WhereFilter-Filter
<-- KTABLE-MERGE-0000000012
Processor: WhereFilter-Filter (stores: [])
--> WhereFilter-PostProcess
<-- WhereFilter-ApplyPredicate
Processor: WhereFilter-PostProcess (stores: [])
--> Project
<-- WhereFilter-Filter
Processor: Project (stores: [])
--> KTABLE-TOSTREAM-0000000019
<-- WhereFilter-PostProcess
Processor: KTABLE-TOSTREAM-0000000019 (stores: [])
--> KSTREAM-SINK-0000000020
<-- Project
Sink: KSTREAM-SINK-0000000020 (topic: OUTPUT)
<-- KTABLE-TOSTREAM-0000000019
big-andy-coates commented 4 years ago

Needs this AK fix: https://issues.apache.org/jira/browse/KAFKA-10494

agavra commented 4 years ago

Adding needs-triage so that we will assign it in our weekly meeting.

big-andy-coates commented 4 years ago

Removed needs triage label as I'm already working on it :p

apurvam commented 3 years ago

@big-andy-coates is this on track to be fixed in the coming week. cc @vpapavas

apurvam commented 3 years ago

also cc @mjsax who may have context.

mjsax commented 3 years ago

The AK ticket linked above is already merged: https://github.com/confluentinc/ksql/issues/6406#issuecomment-707243636 -- not sure if there is any ksql follow up work to do or if this can be closed.