TimelyDataflow / timely-dataflow

A modular implementation of timely dataflow in Rust
MIT License

[examples] using a struct type as input data #218

Open sclasen opened 5 years ago

sclasen commented 5 years ago

After the getting started materials, I attempted to start pushing a struct through a dataflow graph and ran into a few snags.

It's not super clear, and I didn't see any examples of, how to use non-primitive data, such as InputHandle::<_, MyStruct>::new();

Once I got past that and had a super basic dataflow, I wasn't able to call exchange with that type, and that led me into finding the Data and ExchangeData traits, trying to implement them, and into abomonation, etc.

Is there any example material like this that I missed/am I doing it wrong?

frankmcsherry commented 5 years ago

This is a really good point. It's not exactly non-primitive data, because things like tuples and vectors and such should work out just fine, but when you get to custom struct and enum types you'll have issues and I think we have approximately zero text talking you through that. It can be as simple as derive(Abomonation) but there is no way you could guess that.

If there were a section in the timely mdbook, would that likely have solved things for you? Or: from your experience, where would you have first liked to see this material?

sclasen commented 5 years ago

Yep, a section in the mdbook would have worked great for me.

Had guessed about the derive(Abomonation) bit, but didn’t have the crate in my project yet so opened this issue first, glad I was on the right track, will give that a go too. Thanks!

sclasen commented 5 years ago

Worked, once I pulled in abomonation_derive 🎉

frankmcsherry commented 5 years ago

That's great to hear! I'm on the road at the moment, but when I get an hour or so I'll lay down some text on this, and on the bincode flag that lets you use the Serde framework and its traits (which are more commonly implemented by default, but which require a bit more copying around at runtime).

frankmcsherry commented 5 years ago

I've just added this text on custom datatypes. It is in the "advanced timely dataflow" section, but it should show up on the table of contents and ideally cause people to read it if they stumble upon the issues.

Let me know if it looks like it is roughly the right material. It could for sure be longer with worked examples, which I'm up for digging in to (I always lose track of what is non-obvious to others, and have to ask, sorry!).

jesskfullwood commented 5 years ago

Drive-by comment: on a few occasions I have used f64 inside my graphs. This cannot be done 'natively' because f64 does not impl Hash, Ord, etc., but it works fine if you wrap it in OrderedFloat, like

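use abomonation::Abomonation;
use ordered_float::OrderedFloat;

// Newtype over OrderedFloat so f64 values get Eq, Ord and Hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct Float(OrderedFloat<f64>);
impl Abomonation for Float {}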

Would you consider this to be a Disgusting Hack or is it Just Fine?

frankmcsherry commented 5 years ago

It is probably just fine. Differential requires Ord so that it can put data in canonical forms, and Hash to distribute data among workers (if the float is a key). In neither case is there a required semantic interpretation, just a consistent ordering and hashing implementation, so it should all be good!
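To illustrate the "consistent ordering and hashing" point, here is a minimal std-only sketch of a float wrapper. `TotalF64` and `hash_of` are hypothetical names made up for this example; it is not the `ordered_float` crate, which may treat NaN and signed zeros differently. It assumes a Rust toolchain recent enough to provide `f64::total_cmp`.

```rust
use std::cmp::Ordering;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper (not `OrderedFloat`): gives `f64` a total order
/// and a hash that agrees with equality.
#[derive(Debug, Clone, Copy)]
pub struct TotalF64(pub f64);

impl Ord for TotalF64 {
    fn cmp(&self, other: &Self) -> Ordering {
        // IEEE 754 total order: every pair of values is comparable,
        // including NaN.
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for TotalF64 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for TotalF64 {
    fn eq(&self, other: &Self) -> bool {
        // Equality is defined through the total order so that Eq and Ord agree.
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for TotalF64 {}

impl Hash for TotalF64 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Two values are equal under `total_cmp` exactly when their bit
        // patterns match, so hashing the bits is consistent with `Eq`.
        self.0.to_bits().hash(state);
    }
}

/// Helper for the example: hash one value with the std default hasher.
pub fn hash_of(x: TotalF64) -> u64 {
    let mut hasher = DefaultHasher::new();
    x.hash(&mut hasher);
    hasher.finish()
}
```

With this in place, a `Vec<TotalF64>` can be sorted with `sort()` and the values can be used as `HashMap` keys. No numeric meaning is attached to the order beyond it being the same everywhere, which is all the comment above asks for.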

sclasen commented 5 years ago

Looking good, this would have helped me.

Perhaps a simple example that's quite close to the existing 'hello world' stuff, with a struct which is used in exchange like .exchange(|x| x.my_partition_field ) as a callout to some more detail around exchange would be good. I'm assuming that it is roughly a partitioning key, so please lmk if that's not in the ballpark :)
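For readers with the same question, the "partitioning key" reading can be sketched without timely at all. The code below is a std-only model, not timely's implementation: `route` is a hypothetical helper, `MyStruct` and its `payload` field are made up for illustration, and the rule that a record goes to worker `key % peers` is an assumption about how the `u64` returned by the closure is used.

```rust
use std::collections::BTreeMap;

/// Hypothetical record type; `my_partition_field` echoes the name used above.
#[derive(Debug, Clone, PartialEq)]
pub struct MyStruct {
    pub my_partition_field: u64,
    pub payload: String,
}

/// Model of an exchange step: group records by the worker that would
/// receive them. Assumption: the target worker is `key(record) % peers`.
pub fn route<T, F>(records: Vec<T>, peers: u64, key: F) -> BTreeMap<u64, Vec<T>>
where
    F: Fn(&T) -> u64,
{
    assert!(peers > 0, "need at least one worker");
    let mut per_worker: BTreeMap<u64, Vec<T>> = BTreeMap::new();
    for record in records {
        // Records with the same key always land on the same worker.
        let worker = key(&record) % peers;
        per_worker.entry(worker).or_default().push(record);
    }
    per_worker
}
```

Calling `route(records, 2, |x| x.my_partition_field)` with keys 0, 1, 2, 3 places the records with keys 0 and 2 under worker 0 and those with keys 1 and 3 under worker 1. The property that matters is that equal keys meet on one worker, so any per-key state can live there.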

frankmcsherry commented 5 years ago

I've got a simple example almost ready to land, except that the Abomonation flavor doesn't work. Apparently the derive macros balk at recursive types (I was trying a tree data structure). Going to sort that out, but the text is mostly ready to go once I can make the example work (it not working is also a bug that you would hit your head against for a while).