Open mcrumiller opened 1 month ago
One small tweak to your workaround
df2 = df2.with_columns(
    expr1.alias("name_1"),
    expr2.alias("name_2"),
    expr3.alias("name_3"),
)
df1.join(
    df2,
    on=["name_1", "name_2", "name_3"],
    how="left",
    coalesce=True,
)
I'm guessing that it was an implementation difficulty that made it the way it is rather than a planned decision. Maybe to do with handling aliases.
@deanm0000 the problem is that name_1, name_2, and name_3 already exist in df2, but I need to do an operation on them, for example a .shift, so I cannot re-alias to the same name.
I updated my example above to make that a bit clearer.
It adds a lot of complexity, and I'm not sure it makes sense either.
Description
The documentation for join states the following for the coalesce parameter:
I'm not sure why joining on other expressions disables coalescing. To get around this behavior, I find myself constantly using the following patterns:
Instead, it would be preferable to be able to do: