sbt / sbt-assembly

Deploy über-JARs. Restart processes. (port of codahale/assembly-sbt)
MIT License
1.94k stars 223 forks source link

Assembly - Run the keep rule on Uber jar... #488

Open shanielh opened 1 year ago

shanielh commented 1 year ago

Instead of running them jar by jar, this makes the keep rule much more usable.

I'm willing to make changes to this PR if needed. This should fix issue #265 and specifically this comment.

eed3si9n commented 1 year ago

@shanielh Thanks for the contribution. Do you think it would be possible to demonstrate the corner case as a test? Shading tests tends to be expressed as scripted test in this directory - https://github.com/sbt/sbt-assembly/tree/develop/src/sbt-test/shading

shanielh commented 1 year ago

Yes, I will try to do it

shanielh commented 1 year ago

Hi @eed3si9n , I've added test and modified the PR a bit because I didn't want to break compatibility with older tests and users of this plugin.

Note that the PR isn't working, the main code is in https://github.com/sbt/sbt-assembly/pull/488/files#diff-dcd028f99421b60751076b6597301090dcc7656497954ee32bf3e60b029258b0R575 and I'm not sure why, but I'm not getting items to exclude, I've noticed you're the maintainer of jarjar-abrams too, so you might have some helpful insights :)

eed3si9n commented 1 year ago

I've noticed you're the maintainer of jarjar-abrams too, so you might have some helpful insights :)

In theory I am, but note that my contribution to the project is coming up with the name and making the repo, and I don't really know much about jarjar or shading. Having said that, if there's an issue with keep rule, I wonder if the layer that we should fix is at jarjar-abrams level instead of adding a new setting.

Also sbt-assembly has been using jarjar (Jar Jar Links), which was originally designed for Java to handle shading, but call graph walking in Scala is not completely trivial, so I'm not surprised that "keep" in general never correctly functioned in real-life applications.

shanielh commented 1 year ago

I've noticed you're the maintainer of jarjar-abrams too, so you might have some helpful insights :)

In theory I am, but note that my contribution to the project is coming up with the name and making the repo, and I don't really know much about jarjar or shading. Having said that, if there's an issue with keep rule, I wonder if the layer that we should fix is at jarjar-abrams level instead of adding a new setting.

Also sbt-assembly has been using jarjar (Jar Jar Links), which was originally designed for Java to handle shading, but call graph walking in Scala is not completely trivial, so I'm not surprised that "keep" in general never correctly functioned in real-life applications.

So I tried to see what my class compiles into in the test, seems like the scala-compiler / java-compiler inlined the reference to a static final field:

public class keep.Keeped {
  public void main(java.lang.String[]);
    Code:
       0: ldc           #13                 // String GMT
       2: astore_2
       3: return

  public keep.Keeped();
    Code:
       0: aload_0
       1: invokespecial #22                 // Method java/lang/Object."<init>":()V
       4: return
}

My bad, I'll try to create an instance of a class 😄

shanielh commented 1 year ago

After debugging a bit, I got to an impression that the KeepProcessor doesn't do what I would like it to do, for instance, with the Above code, this is the roots and the depends:

[info] [info] kp.roots:Buffer(keep/Keeped)
[info] [info] kp.depends:Map(keep/Keeped -> Set(Bar, org/apache/commons/lang3/tuple/ImmutablePair, Foo, scala/Predef$, scala/reflect/ScalaSignature))

Let's leave the "Bar" and "Foo" strings out of the equation because I'm not sure how the class visitor works and how it figures references to other classes.

The code for getExcludes():

    public Set<String> getExcludes() {
        Set<String> closure = new HashSet<String>();
        closureHelper(closure, roots);
        Set<String> removable = new HashSet<String>(depend.keySet());
        removable.removeAll(closure);
        return removable;
    }

First, it would be much easier to change it to getIncludes, because what we would like to include is the roots and their (recursive) dependencies, the process method should be also changed to visit all classes, because you might walk over a dependency before walking over a root - that way we would have a map of all classes and their dependencies, and getting the recursive dependencies should be easy once you've reached a root - after doing that, we can also keep the method signature for getExcludes.

Would you like me to open a PR for jarjar-abrams? If you do, please let me know what changes you want me to do

UPDATE Opened a PR for jarjar-abrams https://github.com/eed3si9n/jarjar-abrams/pull/28

shanielh commented 1 year ago

Hi @eed3si9n , can you review this PR too? Thanks! 😄

Note that I had to publishLocal the jarjar-abrams library in order to test it, so you might want to publish jarjar-abrams with latest changes before merging this.

eed3si9n commented 1 year ago

Sorry about the late reply, and thanks for the test. In general, if we need to do anything at sbt-assembly level, doesn't that mean that there's something wrong with jarjar-abrams?

shanielh commented 1 year ago

Sorry about the late reply, and thanks for the test. In general, if we need to do anything at sbt-assembly level, doesn't that mean that there's something wrong with jarjar-abrams?

sbt-assembly uses jarjar on each dependency-jar separately, The only reason I would think that I would like to add a Keep rule to a specific class/es in a dependency jar, is that I would like to enforce usage of a specific package/sub-system inside that dependency. (e.g. add ShadeRule.keep(..).inLibrary()), I don't think I see a good reason for people to use ShadeRule.keep(..).inAll because that would run the same keep-rule on all dependencies and the src-code, so maybe I would remove that in future version (I don't have a way to know if people uses that and for what reason).

I think that using a keep rule on the final jar is much easier and less error prone, and that's the new option I've added to sbt-assembly.

shanielh commented 1 year ago

@eed3si9n Any chance you're going to review this?

eed3si9n commented 1 year ago

Yea. it's on my todo.

shanielh commented 1 year ago

@eed3si9n - So, what you would like me to do in order to merge this? To allow only "inAll" in Keep rule and remove the new definition? This will break compilation compatibility (ShadeRule.keep("...").inLibrary / .inProject will not compile), but it's much more logical.

eed3si9n commented 1 year ago

@eed3si9n - So, what you would like me to do in order to merge this? To allow only "inAll" in Keep rule and remove the new definition? This will break compilation compatibility (ShadeRule.keep("...").inLibrary / .inProject will not compile), but it's much more logical.

Yea. We can throw exception when we find bad shade rules with friendly error message explaining what's going on, and run the actual keep rule for the assembled JAR.