substrait-io / substrait

A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
https://substrait.io
Apache License 2.0

feat: normalize the join types #662

Closed. EpsilonPrime closed this 1 month ago

EpsilonPrime commented 2 months ago

There are two groups of join types in the definition with differing enums.

This PR leaves JoinRel's SEMI and ANTI as the canonical names for LEFT_SEMI and LEFT_ANTI. Aliases are not allowed due to JSON (and text) serialization behavior.
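To illustrate the alias problem, here is a minimal Python sketch rather than the actual protobuf machinery. It assumes a name-based serializer that looks up a single name per enum number. The class `JoinTypeWithAlias`, the helper `round_trip_name`, and the numeric values are all hypothetical and are not copied from the Substrait definitions.

```python
from enum import IntEnum


class JoinTypeWithAlias(IntEnum):
    """Hypothetical enum where two names share one number (an alias)."""
    SEMI = 5
    LEFT_SEMI = 5  # alias: same number as SEMI
    ANTI = 6
    LEFT_ANTI = 6  # alias: same number as ANTI


def round_trip_name(name: str) -> str:
    """Parse a name, keep only its number, then serialize back to a name."""
    number = JoinTypeWithAlias[name].value   # name -> number (both spellings work)
    return JoinTypeWithAlias(number).name    # number -> name (only one can win)


if __name__ == "__main__":
    # The alias parses fine but does not survive the round trip.
    print(round_trip_name("LEFT_SEMI"))  # SEMI
    print(round_trip_name("SEMI"))       # SEMI
```

Under that assumption, a plan written with `LEFT_SEMI` comes back as `SEMI`, so the JSON and text forms would not be stable. Picking one canonical name per value avoids this.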

This PR also adds RIGHT_SEMI and RIGHT_ANTI to JoinRel's JoinType.

RIGHT_SINGLE is added to all of the join type enums. The PR correspondingly renames SINGLE to LEFT_SINGLE, ANTI to LEFT_ANTI, and SEMI to LEFT_SEMI. Finally, this PR adds LEFT_SINGLE to all of the other join types.

BREAKING CHANGE: JoinRel's type enum now has LEFT_SINGLE instead of SINGLE. Similarly there is now LEFT_ANTI and LEFT_SEMI. Other values are available in all join type enums. This affects JSON and text formats only (binary plans -- the interoperable part of Substrait -- will still be compatible before and after this change).
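A rough sketch of why a rename like this is invisible to binary plans but breaks JSON and text: the binary encoding carries the enum number, while the JSON and text forms carry the name. The Python below only models that distinction. The enum classes, the helper functions, and the numeric values are hypothetical and do not come from the Substrait `.proto` files.

```python
from enum import IntEnum


class JoinTypeBefore(IntEnum):
    """Hypothetical pre-rename names (numbers are illustrative)."""
    SEMI = 5
    ANTI = 6
    SINGLE = 7


class JoinTypeAfter(IntEnum):
    """Hypothetical post-rename names, same numbers as before."""
    LEFT_SEMI = 5
    LEFT_ANTI = 6
    LEFT_SINGLE = 7


def encode_binary(member: IntEnum) -> int:
    # Binary-style encoding: only the number is written.
    return int(member)


def decode_binary(enum_cls, number: int):
    return enum_cls(number)


def encode_json(member: IntEnum) -> str:
    # JSON/text-style encoding: the name is written.
    return member.name


def decode_json(enum_cls, name: str):
    # Raises KeyError when the reader does not know the name.
    return enum_cls[name]


if __name__ == "__main__":
    old = JoinTypeBefore.SINGLE
    # Number-based: an old plan still decodes with the new definitions.
    print(decode_binary(JoinTypeAfter, encode_binary(old)).name)
    # Name-based: the old name is unknown to the new definitions.
    try:
        decode_json(JoinTypeAfter, encode_json(old))
    except KeyError as err:
        print("unknown enum name:", err)
```

In this model, existing binary plans keep working because the numbers are unchanged, while JSON or text plans that spell out `SINGLE`, `SEMI`, or `ANTI` would need to be rewritten.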

EpsilonPrime commented 1 month ago

I've updated the PR to rename SINGLE to LEFT_SINGLE and have removed primary/secondary from the text.

There's still a looming breaking change: replacing all of the enums with a single version that has the same values. That's best handled in a major release, alongside a number of other major breaking changes.