Open linhr opened 1 month ago
Here are more examples.
>>> df.join(df2, 'name', 'outer').sort(desc(df.name)).show()
+-----+----+------+
| name| age|height|
+-----+----+------+
|  Bob|   5|    85|
|Alice|   2|  NULL|
|  Tom|NULL|    80|
+-----+----+------+
>>> df.join(df2, 'name', 'outer').sort(desc(df2.name)).show()
+-----+----+------+
| name| age|height|
+-----+----+------+
|  Tom|NULL|    80|
|  Bob|   5|    85|
|Alice|   2|  NULL|
+-----+----+------+
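The two outputs above can be reproduced with a small plain-Python model, which may help pin down the expected semantics. This is only a sketch, not Spark code: it assumes the input rows match the DataFrame.join() doctest data (Alice/Bob with ages, Tom/Bob with heights), that the displayed name column is the merged join key, that df.name and df2.name still refer to each side's own (possibly null) name, and that a descending sort places nulls last. The helper names and the left_name/right_name fields are hypothetical and exist only for illustration.

```python
# Plain-Python model of the two outer-join examples; this is not Spark code.
# Input rows are assumed to match the DataFrame.join() doctest data.
left = [{"name": "Alice", "age": 2}, {"name": "Bob", "age": 5}]         # df
right = [{"name": "Tom", "height": 80}, {"name": "Bob", "height": 85}]  # df2


def outer_join_using_name(left_rows, right_rows):
    """Full outer join on 'name' that also keeps each side's own name value."""
    left_by_name = {r["name"]: r for r in left_rows}
    right_by_name = {r["name"]: r for r in right_rows}
    joined = []
    for name in sorted(set(left_by_name) | set(right_by_name)):
        l = left_by_name.get(name)
        r = right_by_name.get(name)
        joined.append({
            "name": name,                            # merged column shown in the output
            "left_name": l["name"] if l else None,   # models what df.name refers to
            "right_name": r["name"] if r else None,  # models what df2.name refers to
            "age": l["age"] if l else None,
            "height": r["height"] if r else None,
        })
    return joined


def sort_desc_nulls_last(rows, key):
    """Descending sort that places None after all non-null values."""
    non_null = sorted(
        (r for r in rows if r[key] is not None),
        key=lambda r: r[key],
        reverse=True,
    )
    nulls = [r for r in rows if r[key] is None]
    return non_null + nulls


def visible(rows):
    """Project to the columns that show() prints: name, age, height."""
    return [(r["name"], r["age"], r["height"]) for r in rows]


joined = outer_join_using_name(left, right)
by_left = visible(sort_desc_nulls_last(joined, "left_name"))    # sort(desc(df.name))
by_right = visible(sort_desc_nulls_last(joined, "right_name"))  # sort(desc(df2.name))
```

Under these assumptions, by_left lists Bob, Alice, Tom and by_right lists Tom, Bob, Alice, matching the two tables. The point of the model is that the sort key is resolved against one side's original column even though only the merged name column is displayed.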
Spark handles projection differently for outer join outputs, based on how the join column is specified in .select(). Here is an example of the behavior that we should support. The example is extended from the DataFrame.join() doctest in Spark.