Remove redundant RepartitionExec from plan

Is your feature request related to a problem or challenge? Please describe what you are trying to do. I was running benchmarks this morning and noticed this fragment of the query plan when running TPC-H query 5.

CoalesceBatchesExec: target_batch_size=4096
  RepartitionExec: partitioning=Hash([Column { name: "n_nationkey" }], 24)
    RepartitionExec: partitioning=RoundRobinBatch(24)
      ParquetExec: batch_size=8192,

It seems redundant to repartition with round-robin and then immediately repartition again using a hash.

A more complex example:

RepartitionExec: partitioning=Hash([Column { name: "r_regionkey" }], 24)
  CoalesceBatchesExec: target_batch_size=4096
    FilterExec: r_name = ASIA
      RepartitionExec: partitioning=RoundRobinBatch(24)
        ParquetExec: batch_size=8192,

In this example, we could push the repartition on r_regionkey down to the scan.

Describe the solution you'd like Implement new optimizer rule to remove redundant repartitions and/or push down repartitions.

Describe alternatives you've considered None

Additional context None

apache / datafusion

Remove redundant RepartitionExec from plan #384