ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[Data] Support order preservation for operators #35724

Open amogkam opened 1 year ago

amogkam commented 1 year ago

Description

The streaming executor does not preserve order by default. Order preservation is currently controlled by a single global flag.

However, the decision to preserve order should be made at the operator level, not globally. Currently, if any operator in the plan is a zip or sort, ordering is preserved for the whole plan. But in cases like map -> map -> sort, the first two maps don't need to preserve order.

The following operators should have order preservation:

  1. zip
  2. sort and all downstream operations
  3. range and range_tensor
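
To make the proposal concrete, here is a minimal sketch of how a per-operator decision could be derived from the three rules above. It is not Ray's implementation: the function `needs_order`, the constant `ALWAYS_ORDERED`, and the use of plain strings for operators are all hypothetical, chosen only for illustration.

```python
from typing import List

# Hypothetical operator names taken from the list above; these are plain
# strings for illustration, not Ray's internal operator classes.
ALWAYS_ORDERED = {"zip", "range", "range_tensor"}


def needs_order(pipeline: List[str]) -> List[bool]:
    """Return, for each operator in a linear pipeline, whether it should
    preserve order under the rules proposed in this issue."""
    flags = []
    after_sort = False
    for op in pipeline:
        # A sort needs ordered output, and so does everything downstream of it.
        if op == "sort":
            after_sort = True
        flags.append(after_sort or op in ALWAYS_ORDERED)
    return flags


print(needs_order(["map", "map", "sort"]))  # [False, False, True]
```

In the map -> map -> sort case from the description, only the sort is flagged, so the two upstream maps would be free to emit blocks out of order.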

Use case

No response

anyscalesam commented 1 year ago

The map operation and stream.split() already support this. cc @raulchen