apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0
3.65k stars 598 forks source link

Joshfischer/3774/pom dependencies #3778

Closed joshfischer1108 closed 2 years ago

joshfischer1108 commented 2 years ago

Does this list the necessary deps?

nicknezis commented 2 years ago

The pom.xml templates are used to generate pom files for various artifacts. Not sure if this scripts needs to be changed. For example, are the dependencies the same for heron-api and heron-spi? https://github.com/apache/incubator-heron/blob/master/release/maven/maven-pom-version.sh

Edit: This script calls the above script. But I think we can keep the changes in the above script.

nicknezis commented 2 years ago

Actually, we need to update this script to pull the correct artifacts. For example, I don't think it should be referencing api-shaded.jar.

nicknezis commented 2 years ago

Also, should these two lines be updated to reference the final Heron API dependency (and not the individual Topology/Streamlet jars that we don't publish)?

https://github.com/apache/incubator-heron/blob/ebd7ceaeb7cb4aeddf21e8a51a233d53e2afca0d/examples/src/java/BUILD#L10

nicknezis commented 2 years ago

Also, this insistence on random "copy" genrules is driving me crazy. Why are they there? It seemed 6 years ago it used to be a jar_jar shading process, but that was removed. But instead of removing the target, they changed it to a copy which renames the unshaded to shaded.

Just found this old PR using Git Blame. Happy to finally shed some light on this. But feels like we can clean this up. Doesn't have to be in this PR, but in a futture PR we should remove these random copies. https://github.com/apache/incubator-heron/blob/ebd7ceaeb7cb4aeddf21e8a51a233d53e2afca0d/examples/src/java/BUILD#L17-L22

joshfischer1108 commented 2 years ago

The pom.xml templates are used to generate pom files for various artifacts. Not sure if this scripts needs to be changed. For example, are the dependencies the same for heron-api and heron-spi? https://github.com/apache/incubator-heron/blob/master/release/maven/maven-pom-version.sh

Edit: This script calls the above script. But I think we can keep the changes in the above script.

I didn't even think about the heron-spi jar. Ok, thanks for the feedback. Let me do a bit of thinking on this.

joshfischer1108 commented 2 years ago

I still have some questions on this branch, but I think this is good enough to start conversations. Lots of a unsureness with in reference to the heron-functional-api vs the heron-api. But I think this will work for us to get passed a vote. LMK your thoughts and if anyone wants any changes.

nicknezis commented 2 years ago

Awesome, as I wrap up the Python UI/Tracker PR, I'll transition to reviewing this and testing it.

surahman commented 2 years ago

@joshfischer1108 I do not know enough about this area of the build scripts and dependency creation to be able to give any insights/reviews that you could comfortably lean on. The best people for this are @nicknezis and @nwangtw.

joshfischer1108 commented 2 years ago

Ping

nicknezis commented 2 years ago

I'm working it. Have a small commit to push back. But testing the resulting Heron API jar in a separate Heron analytic Gradle project to verify the build works. I haven't been able to run it yet, but that's my next step. Should hopefully have it done today.

huijunwu commented 2 years ago

In my opinion, release/maven/heron-[no/with]-kryo.template.pom DEPS was put to a minimum to avoid transitive dependency. Basically, those jars expose just heron-api/heron-spi. If users' job needs other dependencies, they should explicitly specify them in their job BUILD rather than hope heron to help them import dependencies.

nicknezis commented 2 years ago

But we were forcing API users to have to use the embedded classes (which could conflict). By removing the fat jar and providing the actual list of dependencies, this provides the ability to do proper dependency resolution, and provides the analytic developer more flexibility if they want to use different dependencies.

BTW, this PR had some issues found in tested, I have an updated PR that I'm testing that fixes things a bit. https://github.com/apache/incubator-heron/pull/3844