ariskk / flink4s

Scala 3.x wrapper for Apache Flink
MIT License
49 stars, 10 forks

Flink Backend #21

Open adschwartz opened 2 years ago

adschwartz commented 2 years ago

This is not an issue but more a comment and question.

@ariskk, looking at your code I see that you are a Cats practitioner (as am I). I also use Cats Effect a lot with Blaze, e.g. http4s, fs2-kafka streams and other popular libraries. I have a good bit of pure functional code that I'd love to be able to use in Flink. As you well know, Flink is a Java backend that uses Akka and Futures; it is less strict about side effects and promotes different tooling than you'd use in a pure FP Scala approach. I find myself having to rewrite code and clients wrapped in effects back into "basic" Scala, which feels like a step backwards.

What makes Flink great is its ability to do (stateful) stream processing at large scale (I have professional experience with large-scale Flink clusters), but I have come to like writing pure functional code and, IMO, the ideal scenario would be to run Flink on something like a Blaze backend. Flink was never intended to have a swappable backend (as http4s was), so it's likely more intertwined with Akka and futures than is ideal, although I do see that the Flink developers have been diligent about using interfaces to create nice abstractions in the code base. I wonder if it might be possible to fork Flink and try to run it using Blaze and Cats Effect (or, even better, effect-agnostic to allow for ZIO etc.).

Have you ever thought about running Flink on a different backend?

ariskk commented 2 years ago

Stephan Ewen and his team are absolute experts in concurrency. Flink does an insane amount of work to very efficiently manage memory, buffers and network connections on top of a thread pool with multiple threads. I wouldn't try to mess with that.

What would be possible is to manage application-level effects (e.g. see orderedMapAsync in flink4s) using a different effect system (e.g. ZIO or Cats Effect). That could potentially enable nicer APIs, at the expense of a second thread pool competing with Flink's on the same physical infrastructure.
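
To make that concrete, here is a rough sketch of what bridging an application-level effect into Flink's async I/O could look like. It assumes Cats Effect 3 and Flink's Java `RichAsyncFunction`; `IOAsyncFunction` is a hypothetical name and is not something flink4s ships. It also assumes the function `f` and whatever it closes over are serializable, since Flink ships operators to the task managers.

```scala
import java.util.Collections

import cats.effect.IO
import cats.effect.unsafe.IORuntime
import org.apache.flink.streaming.api.functions.async.{ResultFuture, RichAsyncFunction}

// Hypothetical adapter (not part of flink4s): runs an effectful
// `A => IO[B]` for each element and reports the outcome back to Flink.
final class IOAsyncFunction[A, B](f: A => IO[B]) extends RichAsyncFunction[A, B] {

  override def asyncInvoke(input: A, resultFuture: ResultFuture[B]): Unit = {
    // Resolved on the task manager at call time, so the runtime itself is
    // never serialized. IORuntime.global owns its own compute pool, which
    // is the "second thread pool" competing with Flink's threads.
    implicit val runtime: IORuntime = IORuntime.global

    // Start the IO without blocking the Flink task thread; the callback
    // completes Flink's ResultFuture once the effect finishes.
    f(input).unsafeRunAsync {
      case Right(value) => resultFuture.complete(Collections.singletonList(value))
      case Left(error)  => resultFuture.completeExceptionally(error)
    }
  }
}
```

An instance of this could then be handed to Flink's `AsyncDataStream.orderedWait` along with a timeout and capacity. The effectful code stays pure up to the `unsafeRunAsync` boundary, and Flink keeps full control of its own scheduling, buffers and back-pressure.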