Open salvalcantara opened 2 years ago
Hi @salvalcantara,
The motivations behind the very existence of flink4s
are twofold:
combine
, unorderdMapAsync
and others.[3] is the default option if you want to extend a third party library. biMapWithState
and combine
have been the most used methods in my past projects and were implemented using this pattern. It is true that this library could just provide an implicit class wrapper (or Scala 3.x extension methods) and be a lot more lightweight.
This brings us to your original question (1). It is true to you can use the Java classes directly. This has been possible for a while, but yet again most projects use the Scala wrapper. A couple of reasons off the top of my head:
fromCollection
, executeAndCollect
).def processMyDataStream[T](myStream: DataStream[T])
, and you do not implement ResultTypeQueryable
you will get a runtime exception. The Scala wrapper enforces that with an implicit typeInfo: TypeInformation[T]
and thus you don't risk runtime errors.So to summarize:
That said, very good food for thought! I will think about from your perspective and comment back!
Thank you!
Hi @ariskk, thanks for your replies. I've got some more questions regarding the scope of this library:
flink-streaming-scala
(DataStream API) and maybe also that in flink-scala
(in particular it contains type info stuff).flink-streaming-scala
. Which is the overlap between these two projects?flink-type-info
? It might make sense given the fact that the latest Scala version supported by Flink will be 2.12, so that might make the transition smoother (?).Hi @salvalcantara !
I haven't seen flink-type-info
before. It seems to be focusing on deriving TypeInformation instances using shapeless (instead of the macro Flink provides). Shapeless doesn't support Scala 3.x currently.
At a glance, I can't see any overlap between the two projects. flink-type-info
seems to be focusing solely on generating serializers. In general, due to my experience here I am very very reluctant to use any "magic" in the TypeInformation
generation process. For the last few years I have been using a TypeInfo trait in my projects.
Re Scala 2.13, there are two options:
-Ytasty-reader
flag a Scala 2.13.4+ project can depend on a Scala 3 artefact. For more info: https://docs.scala-lang.org/scala3/guides/migration/compatibility-classpath.htmlBTW @ariskk flink-type-info
seems to have disappeared (???), but there is the magnolia-based equivalent: https://github.com/findify/flink-adt, which I guess would play the same role.
Flink4s requires an in implicit TypeInformaton
instance to work, but it has no opinion as to how this is generated. Flink4s and flink-adt should work together fine (in 2.13.4+ at least).
Hey @ariskk, have you seen this: https://github.com/findify/flink-scala-api? It clearly overlaps with this project and seems way more advanced. I came across this here: https://lists.apache.org/thread/3ghw0cvkftv349j9x4j3h1twzwt75fwj.
Hey @salvalcantara
I was the first to star flink-scala-api
. It is a pure port of the existing code, so likely easier to migrate.
My aim for flink4s was to create native Scala APIs that enhance what's already there.
That said, I no longer have a use case for it as I have left the Flink project I was working on.
If there is no interest from contributors, I don't see myself maintaining this for long. I will likely archive the project and point people to flink-scala-api
at some point.
@ariskk I think it would make more sense to unite forces in flink-scala-api
(and flink-adt
). Wouldn't it be possible to add whatever extensions on top of flink-scala-api
, which already covers all the basics, allowing flink4s
to narrow down its focus?
@ariskk Another thing I noticed is that you use case classes for everything (e.g., DataStream
) while the official flink-scala
(as well as flink-scala-api
) uses regular classes. Using case classes might be problematic if you want to use inheritance (e.g., if you want KeyedStream
to extend DataStream
you cannot and would need to handle it in a different way).
Hi @ariskk,
I am learning Scala 3 and considering the transition from Scala 2 in the context of Flink. Since this is yet a nascent (but exciting!) project I would like to ask a few questions about the overall design at this early stage:
Is it really necessary to wrap the Java classes? Why not using the Java API classes directly? For example, I took a quick look at [1] and overall it looks good (without any scala wrappers whatsoever if I'm not mistaken). I guess the idea is to enhance ergonomics/make things more Scala idiomatic (similar to [2]) but at first sight for this case in particular I cannot see a huge win. I think the advantages should be clearly stressed/demonstrated in the readme, so that justification is out of the way.
Have you considered other design options (other than composition)? Namely [3]:
REFERENCES: