constellation-rs / amadeus

Harmonious distributed data analysis in Rust.
https://constellation.rs/amadeus
Apache License 2.0

Adding an operation #101

Open alecmocatta opened 3 years ago

alecmocatta commented 3 years ago

There are two kinds of operation: adapter [1] operations (like map, flat_map, filter, chain) and reducer operations (like sum, max, collect, fold).
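To make the distinction concrete, here is a minimal sketch using only the standard library's Iterator, which has operations with the same names. It is an analogy, not amadeus code: `evens_squared` and `total` are hypothetical names chosen for illustration. An adapter takes a stream and returns another stream; a reducer consumes a stream and returns a single value.

```rust
/// Adapter-style: takes a stream of items and returns another (lazy) stream.
/// Nothing is computed until a reducer consumes the result.
fn evens_squared(input: impl Iterator<Item = u32>) -> impl Iterator<Item = u32> {
    input.filter(|x| x % 2 == 0).map(|x| x * x)
}

/// Reducer-style: consumes the whole stream and produces one value.
fn total(input: impl Iterator<Item = u32>) -> u32 {
    input.sum()
}
```

Adapters compose with each other, and a pipeline ends in exactly one reducer, for example `total(evens_squared(1..=6))`.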

Adding a reducer operation

  1. Introduce a type that implements ParallelSink, probably in a new file in amadeus-core/src/par_sink (don't forget to add mod new_file; to amadeus-core/src/par_sink.rs).
    • amadeus-core/src/par_sink/count.rs and amadeus-core/src/par_sink/histogram.rs are good starting points for how to implement the FolderSync trait.
    • The FolderSync trait and the folder_par_sink macro are a convenience to minimise the boilerplate of implementing ParallelSink for operations that can be expressed as a synchronous fold, which is most of them.
  2. Add the user-facing methods that create this type in amadeus-core/src/par_pipe.rs and amadeus-core/src/par_stream.rs.
  3. (Ideally; I know some are missing!) Add a test in tests, probably in a new file with the same name as the file from step 1.
  4. Make clippy and rustfmt happy: run cargo clippy --all-targets and cargo fmt --all.
  5. Open a PR!
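As a rough illustration of what "implemented as a synchronous fold" means, here is a self-contained sketch of a count reducer. Everything in it is an assumption made for illustration: `SyncFold`, `CountFold`, `SumFold`, `fold_all` and `count_partitions` are hypothetical names, and the trait is a simplified stand-in rather than amadeus's actual FolderSync (see count.rs for the real signatures). The idea is that each partition is folded independently, and the partial results are then folded again into the final answer.

```rust
/// Hypothetical stand-in for a synchronous-fold trait; NOT amadeus's FolderSync.
trait SyncFold<Item> {
    type State;
    /// The initial accumulator.
    fn zero(&self) -> Self::State;
    /// Fold one item into the accumulator.
    fn push(&self, state: &mut Self::State, item: Item);
}

/// Counts items of any type.
struct CountFold;
impl<Item> SyncFold<Item> for CountFold {
    type State = usize;
    fn zero(&self) -> usize {
        0
    }
    fn push(&self, state: &mut usize, _item: Item) {
        *state += 1;
    }
}

/// Adds up partial counts.
struct SumFold;
impl SyncFold<usize> for SumFold {
    type State = usize;
    fn zero(&self) -> usize {
        0
    }
    fn push(&self, state: &mut usize, item: usize) {
        *state += item;
    }
}

/// Drive a folder over every item of an input.
fn fold_all<F, I>(folder: &F, items: I) -> F::State
where
    I: IntoIterator,
    F: SyncFold<I::Item>,
{
    let mut state = folder.zero();
    for item in items {
        folder.push(&mut state, item);
    }
    state
}

/// Count each partition on its own, then combine the partial counts.
fn count_partitions<T>(partitions: Vec<Vec<T>>) -> usize {
    let per_partition = partitions.into_iter().map(|part| fold_all(&CountFold, part));
    fold_all(&SumFold, per_partition)
}
```

The assertion you would put in the test from step 3 has the same shape: build a small input, run the operation, and compare against a value computed by hand.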

Adding an adapter operation

  1. Introduce a type that implements ParallelStream and ParallelPipe, probably in a new file in amadeus-core/src/par_stream (don't forget to add mod new_file; to amadeus-core/src/par_stream.rs).
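For orientation, the usual shape of an adapter is a small wrapper type that holds the inner stream plus any parameters, and implements the stream trait by delegating to the inner one. The sketch below shows that shape against the standard library's Iterator, not against ParallelStream or ParallelPipe, whose required methods are not reproduced here. `Scaled`, `ScaledExt` and `scaled` are hypothetical names.

```rust
/// Adapter type: wraps an inner stream and a parameter.
struct Scaled<I> {
    inner: I,
    factor: i64,
}

/// Implementing the stream trait makes the adapter itself a stream,
/// so further adapters or a reducer can be chained onto it.
impl<I: Iterator<Item = i64>> Iterator for Scaled<I> {
    type Item = i64;

    fn next(&mut self) -> Option<i64> {
        // Pull one item from the inner stream and transform it.
        self.inner.next().map(|x| x * self.factor)
    }
}

/// User-facing method that constructs the adapter.
trait ScaledExt: Iterator<Item = i64> + Sized {
    fn scaled(self, factor: i64) -> Scaled<Self> {
        Scaled { inner: self, factor }
    }
}

impl<I: Iterator<Item = i64>> ScaledExt for I {}
```

With the extension trait in scope, `vec![1i64, 2, 3].into_iter().scaled(10)` yields 10, 20, 30.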

[1] I'm sure there's a better name than "adapter operations"; please comment if you have one!