akka / alpakka

Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
https://doc.akka.io/docs/alpakka/current/

Sharing code between google cloud modules #773

Closed: francisdb closed this issue 3 years ago

francisdb commented 6 years ago

Something to handle after https://github.com/akka/alpakka/pull/764 and https://github.com/akka/alpakka/pull/650 have been merged

This would unify configuration and session handling across all Google Cloud related modules. For Amazon we use their SDK, but @tg44 pointed out that taking the same approach with the Google Cloud SDK would be a bad idea.

We should still keep the option to configure things manually, for example if you want to use multiple service accounts.
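
To illustrate the idea, a shared settings type might look roughly like this. This is only a sketch: `GoogleSettings`, `ServiceAccountCredentials`, and the `alpakka.google` config path are hypothetical names, not an existing Alpakka API.

```scala
import com.typesafe.config.Config

// Hypothetical credentials shared by all Google Cloud modules.
final case class ServiceAccountCredentials(clientEmail: String, privateKeyPem: String)

// Hypothetical shared settings: loaded once from configuration,
// but constructible by hand so that two streams can use two
// different service accounts if they need to.
final case class GoogleSettings(projectId: String, credentials: ServiceAccountCredentials)

object GoogleSettings {
  def apply(config: Config): GoogleSettings = {
    val c = config.getConfig("alpakka.google")
    GoogleSettings(
      projectId = c.getString("project-id"),
      credentials = ServiceAccountCredentials(
        clientEmail = c.getString("credentials.client-email"),
        privateKeyPem = c.getString("credentials.private-key")
      )
    )
  }
}
```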

Previous discussion: https://github.com/akka/alpakka/pull/650#issuecomment-364968964

tg44 commented 6 years ago

Related to this: should we use jwt-core, or do it by hand? (fcm uses jwt-core, pubsub does it by hand.) Pro: less code. Con: more dependencies (a signed jar because of the BouncyCastle dependency).
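
For reference, the jwt-core route is roughly the following. A sketch only, assuming the `pdi.jwt` API of jwt-core; key loading is elided, and the token endpoint audience is the one Google documents for service-account OAuth2 grants.

```scala
import java.security.PrivateKey
import pdi.jwt.{Jwt, JwtAlgorithm, JwtClaim}

// Sign the OAuth2 JWT grant for a Google service account with jwt-core.
// "scope" is a Google-specific claim, so it goes into the raw `content` JSON.
def signAssertion(clientEmail: String, key: PrivateKey, scope: String): String = {
  val now = System.currentTimeMillis / 1000
  val claim = JwtClaim(
    content = s"""{"scope":"$scope"}""",
    issuer = Some(clientEmail),
    audience = Some(Set("https://oauth2.googleapis.com/token")),
    issuedAt = Some(now),
    expiration = Some(now + 3600)
  )
  Jwt.encode(claim, key, JwtAlgorithm.RS256)
}
```

Doing the same by hand means base64url-encoding the header and claim and signing with `java.security.Signature` directly, which is more code but avoids the extra dependency.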

francisdb commented 5 years ago

@ennru would you be open to a shared (internal?) module google-common that is used by fcm/storage/pubsub? Otherwise we should close this.

2m commented 5 years ago

I think a shared module sounds good. It would still have to be published as a jar, but all of the classes in that jar can live in the impl package.

ennru commented 3 years ago

https://github.com/akka/alpakka/pull/2451 copied in code from another Google connector.

A shared module would make sense to avoid duplicating code, but on the other hand users would need to use the exact same Alpakka version for all the Google connectors, which has not been required so far.

tg44 commented 3 years ago

Do we need that? Couldn't we use some sbt magic to pull the shared module's files into the given submodules at compile time, as sketched below? There are pros and cons to this solution, but that way we could write and test the code as a module without needing to publish it, and every submodule would ship its own copy of the implementation.
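
One plain sbt way to get this effect without publishing anything is to compile a shared source directory into each Google module. A sketch; the `google-shared` directory and module name are hypothetical.

```scala
// build.sbt sketch: compile shared sources directly into each Google
// module instead of publishing them as a separate artifact.
lazy val googleCloudPubSub = project
  .in(file("google-cloud-pub-sub"))
  .settings(
    Compile / unmanagedSourceDirectories +=
      (ThisBuild / baseDirectory).value / "google-shared" / "src" / "main" / "scala"
  )
```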

gkatzioura commented 3 years ago

What seems very valuable would be to extract the HTTP code that happens behind the scenes, together with the re-authentication/token refresh, into a library or even an http-connector (one that handles the token refresh), so that if you cannot find a connector for a GCP component you can easily roll your own. For example, if I wanted to use a Stackdriver connector right now, I would copy the code from p:google-cloud-bigquery or p:google-cloud-storage, because they already have the code needed and the GCP APIs do not differ a lot. Having that HTTP base code available would make this easier, like a toolkit for GCP connectors. It should also apply to new GCP features, since you can expect them to use the usual gRPC and HTTP APIs.
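
As a rough illustration of the kind of toolkit meant here, the sketch below wraps an arbitrary GCP REST call with token handling. The akka-http calls are real; `TokenProvider` and its caching/refresh behaviour are hypothetical.

```scala
import scala.concurrent.{ExecutionContext, Future}
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import akka.http.scaladsl.model.headers.OAuth2BearerToken

// Hypothetical provider that caches a token and refreshes it before expiry.
trait TokenProvider {
  def accessToken(): Future[String]
}

// Wrap any GCP REST call: resolve a fresh token, attach it, fire the request.
def authenticatedSingleRequest(request: HttpRequest, tokens: TokenProvider)(
    implicit system: ActorSystem, ec: ExecutionContext): Future[HttpResponse] =
  tokens.accessToken().flatMap { token =>
    Http().singleRequest(request.addCredentials(OAuth2BearerToken(token)))
  }
```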

tg44 commented 3 years ago

I get it, but as @ennru said, at this point the GCP-authenticator version would force you to use the same version across your Google based Alpakka modules (which could be good or bad). I think if we decide that this is a problem, and also decide that rolling out a GCP-auth lib is not in scope for the Alpakka codebase, we have a second way to share the code between modules: we can write an sbt task that copies the compiled "shared" code into the given Google based Alpakka libs, so we don't need to publish a lib, because we glue everything together at compile time. If you start a new Google based Alpakka module, you just add the precompile-copy step to the sbt build.

I'm not convinced that the sbt hack is the way to go, but I also see that publishing a lib could be problematic in some cases.

Also, I think the GCP-auth code is not an easy connector to write. The backpressure and "waiting" branches could cause trouble. For example, if we have a Source[Endpoint], a Source[GoogleToken] and a Zip, we can end up with an expired token after the zip (but before the API call). At best we can write a Flow[Endpoint, Result, NotUsed], but then we need to handle the inner parallelism, and we need to make it configurable whether we use one token generator for all of our streams or not.
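
To make the failure mode concrete, compare the two stream shapes below. A sketch under the thread's own types; `fetchToken` and `callApi` are hypothetical stand-ins.

```scala
import scala.concurrent.{ExecutionContext, Future}
import akka.NotUsed
import akka.stream.scaladsl.{Flow, Source}

final case class Endpoint(uri: String)
final case class GoogleToken(value: String, expiresAt: Long)

def fetchToken(): Future[GoogleToken] = ???                     // hypothetical
def callApi(e: Endpoint, t: GoogleToken): Future[String] = ???  // hypothetical

// Problematic shape: the token is paired with the endpoint at zip time.
// If the stream backpressures between the zip and the API call, the
// token may already be expired when the request is finally made.
val zipped: Source[(Endpoint, GoogleToken), NotUsed] =
  Source.single(Endpoint("/v1/topics")).zip(Source.future(fetchToken()))

// Safer shape: resolve the token at call time, inside the flow, so
// waiting and inner parallelism cannot hand an expired token to the call.
def apiFlow(parallelism: Int)(
    implicit ec: ExecutionContext): Flow[Endpoint, String, NotUsed] =
  Flow[Endpoint].mapAsync(parallelism) { endpoint =>
    fetchToken().flatMap(token => callApi(endpoint, token))
  }
```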

ennru commented 3 years ago

Thank you @tg44 and @gkatzioura for your ideas around this. I thought about some code generation-like magic to paste the code into the different Google Cloud connectors, but making the build more involved seems dangerous from a maintenance point of view.

A separate library to plug in the authentication seems attractive, especially if it improves on what we have today.

seglo commented 3 years ago

I've thought about this a bit, and I think adding a new sub-project to Alpakka to host this shared code is good enough, even if it may slightly complicate the release process. The sub-project would be built and released along with Alpakka itself. We'll have to take care that the shared code works across all Google connectors every time it's updated.
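
In build terms that would be roughly the following. A sketch; the module names are illustrative.

```scala
// build.sbt sketch: a published google-common sub-project that the
// Google connectors depend on, released together with Alpakka itself.
lazy val googleCommon = project.in(file("google-common"))

lazy val googleCloudPubSub = project
  .in(file("google-cloud-pub-sub"))
  .dependsOn(googleCommon)
```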

seglo commented 3 years ago

If @tg44 or @armanbilge are interested feel free to reference this issue and put a PR up, but let's tackle it after the BigQuery connector PR is merged.

seglo commented 3 years ago

Implemented in #2613