rajasekarv / vega

A new arguably faster implementation of Apache Spark from scratch in Rust
Apache License 2.0
2.23k stars 206 forks

Sort by #70

Closed return02 closed 4 years ago

return02 commented 4 years ago

Implement the sort by transform using a very simple range_partitioner.

The algorithm is almost the same as Apache Spark's: split all the data into ordered partitions by key range, then sort each partition separately.
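A minimal, single-process sketch of that idea, for illustration only. It does not use vega's types, and `range_bounds`, `partition_of`, and `sort_by_range` are hypothetical names rather than the functions in this PR. It samples the whole input to pick the bounds, which a distributed implementation would likely not do.

```rust
/// Hypothetical helper: choose up to `num_partitions - 1` upper bounds
/// from the (sorted, de-duplicated) keys. Assumption: the whole input
/// is used as the sample.
fn range_bounds(mut sample: Vec<i32>, num_partitions: usize) -> Vec<i32> {
    sample.sort();
    sample.dedup();
    if num_partitions <= 1 || sample.is_empty() {
        return Vec::new();
    }
    let mut bounds = Vec::new();
    for i in 1..num_partitions {
        // Evenly spaced index into the sample; always < sample.len().
        let idx = i * sample.len() / num_partitions;
        let b = sample[idx];
        if bounds.last() != Some(&b) {
            bounds.push(b);
        }
    }
    bounds
}

/// Linear scan: the first bound that is >= key decides the partition;
/// keys above every bound go to the last partition.
fn partition_of(bounds: &[i32], key: i32) -> usize {
    bounds
        .iter()
        .position(|b| key <= *b)
        .unwrap_or(bounds.len())
}

/// Route every element to its range partition, then sort each
/// partition on its own. Concatenating the partitions in order
/// yields the fully sorted data.
fn sort_by_range(data: Vec<i32>, num_partitions: usize) -> Vec<Vec<i32>> {
    let bounds = range_bounds(data.clone(), num_partitions);
    let mut parts: Vec<Vec<i32>> = vec![Vec::new(); bounds.len() + 1];
    for x in data {
        parts[partition_of(&bounds, x)].push(x);
    }
    for p in parts.iter_mut() {
        p.sort();
    }
    parts
}
```

Because every key in partition `i` is no greater than every key in partition `i + 1`, no merge step is needed after the per-partition sorts.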

There is still some work to be done:

  1. Find a better algorithm for building range_bounds.
  2. Implement descending order.
  3. Use binary search in the method get_partitions().
  4. Perhaps F: SerFunc(&Self::Item) -> K + Clone is better than F: SerFunc(Self::Item) -> K.
  5. Test corner cases.
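A rough sketch of how items 2 to 4 might look, under stated assumptions: `RangePartitioner`, `get_partition`, and `sort_partition` here are hypothetical stand-ins, not the types in this PR, and a plain `Fn(&T) -> K` bound replaces vega's `SerFunc` so the snippet compiles on its own.

```rust
/// Hypothetical stand-in for the partitioner discussed above.
struct RangePartitioner<K> {
    /// Upper bounds of the partitions, sorted ascending.
    bounds: Vec<K>,
    /// When false, partition order is reversed (largest keys first).
    ascending: bool,
}

impl<K: Ord> RangePartitioner<K> {
    fn num_partitions(&self) -> usize {
        self.bounds.len() + 1
    }

    /// Binary search instead of a linear scan over the bounds.
    fn get_partition(&self, key: &K) -> usize {
        // `partition_point` returns how many bounds are strictly
        // less than `key`, i.e. the index of the first bound >= key.
        let idx = self.bounds.partition_point(|b| b < key);
        if self.ascending {
            idx
        } else {
            // Mirror the index so larger keys land in earlier partitions.
            self.bounds.len() - idx
        }
    }
}

/// Sort one partition with a key function that borrows the item,
/// so the item does not have to be cloned or moved to compute its key.
fn sort_partition<T, K: Ord, F: Fn(&T) -> K>(items: &mut [T], key_fn: F, ascending: bool) {
    items.sort_by(|a, b| {
        let ord = key_fn(a).cmp(&key_fn(b));
        if ascending {
            ord
        } else {
            ord.reverse()
        }
    });
}
```

With bounds `[3, 7]`, an ascending partitioner maps keys 3, 5, and 8 to partitions 0, 1, and 2; the descending variant maps key 8 to partition 0 and key 1 to partition 2.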
iduartgomez commented 4 years ago

Will review further later, but could you rebase onto master (so the Cargo.lock change does not show) and run cargo fmt?

return02 commented 4 years ago

> Will review further later but could you rebase onto master (so cargo.lock change does not show) and run cargo fmt?

Sorry. I'll rebase onto master and run cargo fmt later.