ariskk / flink4s

Scala 3.x wrapper for Apache Flink
MIT License
49 stars 10 forks source link

Overall design questions #17

Open salvalcantara opened 2 years ago

salvalcantara commented 2 years ago

Hi @ariskk,

I am learning Scala 3 and considering the transition from Scala 2 in the context of Flink. Since this is yet a nascent (but exciting!) project I would like to ask a few questions about the overall design at this early stage:

  1. Is it really necessary to wrap the Java classes? Why not using the Java API classes directly? For example, I took a quick look at [1] and overall it looks good (without any scala wrappers whatsoever if I'm not mistaken). I guess the idea is to enhance ergonomics/make things more Scala idiomatic (similar to [2]) but at first sight for this case in particular I cannot see a huge win. I think the advantages should be clearly stressed/demonstrated in the readme, so that justification is out of the way.

  2. Have you considered other design options (other than composition)? Namely [3]:

    • Inheritance/Subtyping
    • Extension Methods/Type Classes

REFERENCES:

ariskk commented 2 years ago

Hi @salvalcantara,

The motivations behind the very existence of flink4s are twofold:

[3] is the default option if you want to extend a third party library. biMapWithState and combine have been the most used methods in my past projects and were implemented using this pattern. It is true that this library could just provide an implicit class wrapper (or Scala 3.x extension methods) and be a lot more lightweight.

This brings us to your original question (1). It is true to you can use the Java classes directly. This has been possible for a while, but yet again most projects use the Scala wrapper. A couple of reasons off the top of my head:

So to summarize:

That said, very good food for thought! I will think about from your perspective and comment back!

Thank you!

salvalcantara commented 2 years ago

Hi @ariskk, thanks for your replies. I've got some more questions regarding the scope of this library:

ariskk commented 2 years ago

Hi @salvalcantara ! I haven't seen flink-type-info before. It seems to be focusing on deriving TypeInformation instances using shapeless (instead of the macro Flink provides). Shapeless doesn't support Scala 3.x currently. At a glance, I can't see any overlap between the two projects. flink-type-info seems to be focusing solely on generating serializers. In general, due to my experience here I am very very reluctant to use any "magic" in the TypeInformation generation process. For the last few years I have been using a TypeInfo trait in my projects.

Re Scala 2.13, there are two options:

salvalcantara commented 2 years ago

BTW @ariskk flink-type-info seems to have disappeared (???), but there is the magnolia-based equivalent: https://github.com/findify/flink-adt, which I guess would play the same role.

ariskk commented 2 years ago

Flink4s requires an in implicit TypeInformaton instance to work, but it has no opinion as to how this is generated. Flink4s and flink-adt should work together fine (in 2.13.4+ at least).

salvalcantara commented 2 years ago

Hey @ariskk, have you seen this: https://github.com/findify/flink-scala-api? It clearly overlaps with this project and seems way more advanced. I came across this here: https://lists.apache.org/thread/3ghw0cvkftv349j9x4j3h1twzwt75fwj.

ariskk commented 2 years ago

Hey @salvalcantara I was the first to star flink-scala-api. It is a pure port of the existing code, so likely easier to migrate. My aim for flink4s was to create native Scala APIs that enhance what's already there. That said, I no longer have a use case for it as I have left the Flink project I was working on. If there is no interest from contributors, I don't see myself maintaining this for long. I will likely archive the project and point people to flink-scala-api at some point.

salvalcantara commented 2 years ago

@ariskk I think it would make more sense to unite forces in flink-scala-api (and flink-adt). Wouldn't it be possible to add whatever extensions on top of flink-scala-api, which already covers all the basics, allowing flink4s to narrow down its focus?

salvalcantara commented 2 years ago

@ariskk Another thing I noticed is that you use case classes for everything (e.g., DataStream) while the official flink-scala (as well as flink-scala-api) uses regular classes. Using case classes might be problematic if you want to use inheritance (e.g., if you want KeyedStream to extend DataStream you cannot and would need to handle it in a different way).