Open universalmind303 opened 1 month ago
Additionally, a cross join followed by a filter comparing columns between the two inputs should be optimized into an inner join.
example:
df1.join(df2, how='cross').where(df1['text'] == df2['name'])
this can be optimized to
df1.join(df2, left_on=col('text'), right_on=col('name'), how='inner')
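To illustrate why the rewrite is safe, here is a minimal sketch in plain Python (not Daft code, and not how the optimizer rule is implemented). The toy rows and the helper names `cross_then_filter` and `inner_hash_join` are made up for illustration; only the `text` and `name` columns come from the example above. It shows that filtering a cross product on `text == name` gives the same rows as an inner equi-join on those columns.

```python
from itertools import product

# Toy rows standing in for df1 and df2 (hypothetical data).
left = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}, {"id": 3, "text": "a"}]
right = [{"name": "a", "score": 10}, {"name": "c", "score": 30}]


def cross_then_filter(left, right):
    """Unoptimized plan: build every pair, then keep pairs where text == name."""
    return [{**l, **r} for l, r in product(left, right) if l["text"] == r["name"]]


def inner_hash_join(left, right):
    """Rewritten plan: index the right side by name, then probe with text."""
    index = {}
    for r in right:
        index.setdefault(r["name"], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l["text"], [])]


slow = cross_then_filter(left, right)
fast = inner_hash_join(left, right)
assert slow == fast  # same rows, without materializing len(left) * len(right) pairs
```

The cross-join version touches every pair of rows before filtering, while the inner-join version only visits matching pairs, which is the motivation for doing the rewrite in the optimizer.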
For reference, DataFusion has an eliminate_cross_join rule that rewrites cross joins to inner joins where possible.
created #3095 for the optimizer rule.
Is your feature request related to a problem? Please describe.
I want to perform cross joins using Daft.
Describe the solution you'd like
df1.join(df2, how='cross')
Describe alternatives you've considered
df1.join(df2, on=lit(1))
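For context on why the constant-key alternative behaves like a cross join, here is a rough plain-Python sketch (again not Daft code; `join_on_constant` and the sample rows are hypothetical). Every row is given the same join key, so every left row matches every right row.

```python
from itertools import product

# Hypothetical sample rows.
left = [{"x": 1}, {"x": 2}]
right = [{"y": "a"}, {"y": "b"}, {"y": "c"}]


def join_on_constant(left, right, key=1):
    """Equi-join where both sides use the same constant as the join key."""
    buckets = {}
    for r in right:
        # Every right row lands in the single bucket for `key`.
        buckets.setdefault(key, []).append(r)
    # Every left row probes that same bucket, so it pairs with all right rows.
    return [{**l, **r} for l in left for r in buckets.get(key, [])]


cross = [{**l, **r} for l, r in product(left, right)]
emulated = join_on_constant(left, right)
assert emulated == cross
```

In this sketch all rows end up in one bucket, so the workaround gives the right result but gains nothing from hashing. That is one reason a dedicated how='cross' seems preferable.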