cda-group / arcon

State-first Streaming Applications in Rust
https://cda-group.github.io/arcon/
Apache License 2.0
175 stars 17 forks source link

Adding key_by() function to Stream<T> #299

Closed adamhass closed 2 years ago

adamhass commented 2 years ago

Adds the method key_by(...) to Stream

Type signature:

pub fn key_by<F, KEY>(mut self, key_extractor: F) -> Stream<T>
    where
        KEY: Hash + 'static,
        F: Fn(&T) -> &KEY + ArconFnBounds,

ChannelKind::Forward

Since keyed ChannelKind is actually meaningful now it is no longer the default ChannelStrategy of operators. Instead the "Forward" channel strategy is the default and it has a slightly modified behaviour. If Operator A with parallelism Pa (nodes A1, A2... APa) sends messages, using the forward strategy, to Operator B with parallelism Pb (nodes B1, B2... BPb) then node A1 sends to B1 and A2 to B2 etc. Note: The program will fail to build the graph at runtime if Pa>Pb>1. If Pb>Pa there will be unused nodes in B.

Refactoring

Also includes refactoring of the constructor.rs file and refactoring the way Application/AssembledApplication structs.

Testing

I suppose the key_by() feature should be more well tested eventually but I believe this is about as good of a place as any to say "good enough", as the API is done and basic functionality is tested in the new integration test file.

Max-Meldrum commented 2 years ago

Nice work! 😄 I have a few change requests for this PR, but then other things that we can create separate issues for.