laysakura / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Feature Request]: Stop using `Any` (statically-typed Pipeline) #4

Closed laysakura closed 1 year ago

laysakura commented 1 year ago

Currently, `Any` is used everywhere:

e.g. https://github.com/laysakura/beam/blob/4bfdd2642ba5b88396190a06aee1fe0e22086d99/sdks/rust/src/internals/serialize.rs#L59
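To illustrate the concern, here is a minimal sketch of what an `Any`-based operator tends to look like. It is not the code in `serialize.rs`; `ErasedMap` and `erased_demo` are hypothetical names made up for this example. The point is that the element type is only checked when the operator runs, so a mismatch shows up as a runtime failure rather than a compile error.

```rust
use std::any::Any;

/// Hypothetical type-erased operator (illustration only, not the SDK's type).
/// The stored closure takes and returns `Box<dyn Any>`.
pub struct ErasedMap {
    func: Box<dyn Fn(Box<dyn Any>) -> Option<Box<dyn Any>>>,
}

impl ErasedMap {
    /// Wraps a typed closure, erasing its input and output types.
    pub fn new<I: 'static, O: 'static>(f: impl Fn(I) -> O + 'static) -> Self {
        ErasedMap {
            func: Box::new(move |input: Box<dyn Any>| {
                // The element type is only checked here, at runtime.
                let typed = input.downcast::<I>().ok()?;
                Some(Box::new(f(*typed)) as Box<dyn Any>)
            }),
        }
    }

    /// Returns `None` when the input is not of the type the closure expects.
    pub fn apply(&self, input: Box<dyn Any>) -> Option<Box<dyn Any>> {
        (self.func)(input)
    }
}

/// Runs the operator once with a matching input and once with a mismatched one.
pub fn erased_demo() -> (Option<i32>, bool) {
    let double = ErasedMap::new(|x: i32| x * 2);

    // Matching input: the caller still has to downcast the result by hand.
    let ok = double
        .apply(Box::new(21_i32))
        .and_then(|b| b.downcast::<i32>().ok())
        .map(|b| *b);

    // Mismatched input compiles fine and only fails when applied.
    let mismatch = double.apply(Box::new("oops")).is_none();

    (ok, mismatch)
}
```

Passing a `&str` to an operator built from an `i32` closure is accepted by the compiler here, which is the kind of error a statically-typed Pipeline would reject up front.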

IMO, one of the important features of the Beam Rust SDK should be a statically-typed Pipeline (with generics support).

I already did some work on that in 87b9939392, targeting nivaldoh's HEAD (almost the same diff as https://github.com/nivaldoh/beam/pull/26).

But we should continue that work on top of robertwb's work: https://github.com/nivaldoh/beam/pull/22 (I already hand-merged it in 9564b4eb60).

dahlbaek commented 1 year ago

While working on #3 I ran into https://github.com/laysakura/beam/blob/28b838ec51725d181fdbe9508b13448d4431fb7f/sdks/rust/src/internals/serialize.rs#L13-L29, which made debugging somewhat tedious. As far as I can tell, it's an unnecessary indirection 🤔 It seems to me a more idiomatic solution in Rust would be to make Transformations/Operators generic as needed so they can carry the closures directly, which should make it possible to remove much of the current usage of `Box`/`Any`. Do you agree @laysakura, or is there something I've misunderstood?
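A rough sketch of the generic approach described above, assuming a simplified in-memory model where an operator maps a `Vec<I>` to a `Vec<O>`. `Map` and `typed_demo` are hypothetical names for illustration, not types from the SDK. The operator is generic over its closure type `F`, so it stores the closure directly with no `Box` or `Any`, and the element types are checked at compile time.

```rust
use std::marker::PhantomData;

/// Hypothetical operator that is generic over the closure it carries.
/// `I` and `O` are the input and output element types.
pub struct Map<F, I, O>
where
    F: Fn(I) -> O,
{
    func: F,
    // Records `I` and `O` in the type without storing any values of them.
    _types: PhantomData<fn(I) -> O>,
}

impl<F, I, O> Map<F, I, O>
where
    F: Fn(I) -> O,
{
    /// Stores the closure as-is: no boxing, no type erasure.
    pub fn new(func: F) -> Self {
        Map { func, _types: PhantomData }
    }

    /// Applies the closure to every element; types are known statically.
    pub fn apply(&self, input: Vec<I>) -> Vec<O> {
        input.into_iter().map(|x| (self.func)(x)).collect()
    }
}

/// Chains two operators; the output type of the first must match
/// the input type of the second, or the code does not compile.
pub fn typed_demo() -> Vec<String> {
    let double = Map::new(|x: i32| x * 2);
    let show = Map::new(|x: i32| format!("n={}", x));
    show.apply(double.apply(vec![1, 2, 3]))
}
```

With this shape, feeding `double` a `Vec<&str>` is a compile error instead of a failed downcast. The trade-off is that every operator has its own concrete type, so storing heterogeneous operators in one collection would still need some form of erasure at that boundary.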

laysakura commented 1 year ago

@dahlbaek I definitely agree with you!

laysakura commented 1 year ago

Found out that statically-typed pipelines (including user-defined types) are unrealistic via: