TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow on Rust.
MIT License
2.51k stars 182 forks source link

Suggestion: WebAssembly support #479

Open oli-w opened 2 months ago

oli-w commented 2 months ago

Hi, I was interested in seeing if I could run differential-dataflow on the web by compiling to WebAssembly using wasm-pack and it worked out really well! However it required some ~hacks~ changes to differential-dataflow and timely-dataflow, specifically:

The changes required are:

I was wondering if you would be interested in incorporating this or would prefer to keep this kind of change as a fork?

frankmcsherry commented 2 months ago

It sounds interesting! I'll need a bit to get my head around what it entails, but it seems quite reasonable to 1. limit the use of assumptions about time, and 2. be as serious as possible about usize sizes.

If it's alright, I'll take a few to read up on things. Mostly, I'd love to avoid having code in TD/DD that I don't understand, because it's largely on me if something is broken elsewhere. But I have to imagine if nothing else a feature flag for time would be really easy, as should be fixing DD's 64 bit assumption (which is probably a bug anyhow).

antiguru commented 2 months ago

I think all changes look fine from a high level. The time changes don't seem to change any behavior on non-wasm targets. The RHH code isn't used anywhere at this point. Before committing to supporting wasm, I'd like to know if there's a regression test we could have. It seems testing isn't as simple as it is for other targets, but if we can get it into CI then I don't think there's anything blocking this.

As @frankmcsherry points out, there are some parts of the code that might assume usize to have the same length as u64. Removing this assumption would be a win in itself.

frankmcsherry commented 2 months ago

I've started to take a look (sorry for the delay) and have some quick thoughts:

  1. The DD changes .. could probably just instead be deletion of the associated code. At least, the time changes are a. an unused _timer, and b. a YieldingIter that isn't used other than in some commented-out Kafka code (it could become commented out also, and place a similar obligation on uncommenting the code as the rdkafka connections). If we clean this all up, then no one needs timers at all in here.
  2. The DD changes miss an important use in the dogsdogsdogs project, which is the only load-bearing subproject (though I can understand that it isn't obvious). Specifically, half_join.rs uses Instant as an argument to the yield_function, allowing the user to call .elapsed() if they are interested and not otherwise. We could also just not have dogs3 compile on web assembly / delay that work until someone needs it.
  3. The TD changes are harder to avoid, but do seem some amount of harmless. There's some weirdly dead-ish code (viz Worker::timer()), but things like scheduling, sequencing, and activation seem to want to do things based on external notions of time, and logging just seems somewhat wedged without access to time.

I'm going to look into a DD PR that essentially tracks your changes (fixing RHH) but subtracts out the time-based code rather than modifies it. I'll report back here about that.

frankmcsherry commented 2 months ago

The RHH fix seems more subtle than at first glance. The code also has elsewhere this

        /// Indicates both the desired location and the hash signature of the key.
        fn desired_location<K: Hashable>(&self, key: &K) -> usize {
            let hash: usize = key.hashed().into().try_into().unwrap();
            hash / self.divisor
        }

where the try_into() is from u64 to usize.

The code is not currently active, though .. it could become so in the future when I get some time. Ideally it wouldn't silently break the WASM builds, though this is an example of where it could/would.

My understanding of WASM is that 64-bit types are fine, just that usize is 32 bits. So, probably the right thing here (and perhaps elsewhere) is to be more serious about either using u64 throughout, or pivoting to usize earlier (and using it throughout).

oli-w commented 1 month ago

Apologies for my lack of reply sooner - I've been away. Thanks for making progress on these changes 😃.

The DD changes miss an important use in the dogsdogsdogs project

Ah yes, I haven't used that before.

We could also just not have dogs3 compile on web assembly / delay that work until someone needs it.

I'd be very happy with that - I have had a lot of fun composing DD operators without having to reach for what dogsdogsdogs provides.

My understanding of WASM is that 64-bit types are fine, just that usize is 32 bits

Yep that's right, similar to using u128 on a 64-bit machine. There may be other issues that aren't exposed until runtime of some code paths - full disclosure: I tried out making the surgical changes required to get DD working in WASM, found issues, fixed, rinse-and-repeat, rather than reading through all the code to find where usize being 32 might be problematic. The left shift was picked up by the compiler - you can reproduce with cargo build --target wasm32-unknown-unknown.

Let me know if there's something I can help with that doesn't involve changing production "load-bearing" code. E.g. setting up some tests as suggested - details for WASM: https://rustwasm.github.io/wasm-bindgen/wasm-bindgen-test/index.html. Note this does run the compiled WASM using Node JS. Or I can provide a basic WASM example for motivation - play around with an interactive dataflow by visiting a web page and clicking on some buttons.