Explore swapping build table for left outer joins

Is your feature request related to a problem? Please describe.

When performing a left outer join on a small left input and a large right input, we use the large right input to build the hash table and the small left input to probe it. This could lead to poor performance on the GPU because there are potentially many collisions during the build and potentially lower parallelism on the probe side after the build.

Describe the solution you'd like

In this scenario, we could use the much smaller left input to build the hash table, but then we need to solve the problem of identifying rows in the left input that were not "hit" during the probe from the right input. Here we could take an approach similar to the one we use for full outer joins: start from a filter mask that is true for every left row, then use the left input gather map as a scatter map to write false at every row that was hit during the probe. The rows that remain after applying the filter are the ones not matched to any row in the right input, and they still need to be emitted with nulls for the right side.
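The idea above can be sketched on the CPU with plain Python lists. This is only an illustration of the gather-map and scatter-mask logic, not the plugin's or cudf's actual implementation: the function name `left_outer_join_build_left` is hypothetical, the join is on a single key column, and keys are assumed to be non-null (SQL null semantics are ignored).

```python
from collections import defaultdict


def left_outer_join_build_left(left_keys, right_keys):
    """Hypothetical sketch: left outer join that builds on the left input.

    Returns (left_map, right_map) gather maps. A right index of None
    marks a left row that matched nothing on the right.
    """
    # Build the hash table from the (small) left input:
    # key -> list of left row indices.
    table = defaultdict(list)
    for li, key in enumerate(left_keys):
        table[key].append(li)

    # Probe with the (large) right input, recording matching row pairs.
    left_map, right_map = [], []
    for ri, key in enumerate(right_keys):
        for li in table.get(key, ()):
            left_map.append(li)
            right_map.append(ri)

    # Start with every left row marked unmatched, then use the left
    # gather map as a scatter map to write False at every hit row.
    unmatched = [True] * len(left_keys)
    for li in left_map:
        unmatched[li] = False

    # Rows still True were never hit; emit them with a null right row.
    for li, flag in enumerate(unmatched):
        if flag:
            left_map.append(li)
            right_map.append(None)

    return left_map, right_map


left_map, right_map = left_outer_join_build_left([1, 2, 3], [3, 1, 1, 4])
print(left_map, right_map)
```

Note that the output rows come out in probe order followed by the unmatched left rows, not in left-input order, so anything that depends on the row order of the current build-right implementation would need to be checked.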

NVIDIA / spark-rapids #11234