weavejester / lein-ring

Ring plugin for Leiningen
Eclipse Public License 1.0
501 stars 100 forks source link

Jar collision: duct/core VS ragtime/core #216

Closed dpiliouras closed 3 years ago

dpiliouras commented 3 years ago

Hi there,

Consider the following (minimal) project.clj:

(defproject duct-bare "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :min-lein-version "2.0.0"
  :dependencies [[org.clojure/clojure "1.10.1"]
                 [duct/core "0.8.0"]
                 [duct/module.sql "0.5.0"] ;; pulls in ragtime/core
                 [duct/module.logging "0.4.0"]]
  :plugins [[duct/lein-duct "0.12.1"]
            [lein-ring "0.12.5"]]
  :main ^:skip-aot duct-bare.main
  :resource-paths ["resources" "target/resources"]
  :prep-tasks     ["javac" "compile" ["run" ":duct/compiler"]]
  :middleware     [lein-duct.plugin/middleware]
  :ring {:handler duct-bare.main/dummy-handler ;; just a `def` in main.clj
         :init    duct-bare.main/-main
         :uberwar-name "foo"}
  :profiles
  {:dev  [:project/dev :profiles/dev]
   :repl {:prep-tasks   ^:replace ["javac" "compile"]
          :repl-options {:init-ns user}}
   :uberjar {:aot :all}
   :profiles/dev {}
   :project/dev  {:source-paths   ["dev/src"]
                  :resource-paths ["dev/resources"]
                  :dependencies   [[integrant/repl "0.3.1"]
                                   [eftest "0.5.7"]]}})

lein ring uberwar results in

java.util.zip.ZipException: duplicate entry: WEB-INF/lib/core-0.8.0.jar

At first I thought this was weird, but then I did lein classpath, and searched for core-0.8.0.jar. Turns out, it's not weird at all...

/.m2/repository/ragtime/core/0.8.0/core-0.8.0.jar /.m2/repository/duct/core/0.8.0/core-0.8.0.jar

I think the underlying issue is that neither of these projects specify a :jar-name, and so the default applies (project+version w/o group-id). Other similarly named projects don't have this issue (e.g. ring-core, buddy-core, hugsql-core etc). Those jars appear with their full names:

/.m2/repository/ring/ring-core/1.8.2/ring-core-1.8.2.jar /.m2/repository/buddy/buddy-core/1.9.0/buddy-core-1.9.0.jar /.m2/repository/com/layerware/hugsql-core/0.5.1/hugsql-core-0.5.1.jar

At this point you might be wondering, why I am reporting this here instead of duct-core and ragtime-core. First of all, I'm not 100% that my theory stands. It seems like a pretty big issue to me, and I would expect others to have tripped over this, but google is not exactly helpful here...I genuinely seem to be the first person facing this (in the context of duct), which usually is not a good sign. Secondly, even if lein-ring requires no change after all (very likely according to my theory), I feel others that may face this in the future will come looking for clues here first - at least that's what i did. Finally, you are the maintainer of all three projects in question here, so I thought grabbing your attention once is sensible. We can always tidy things up later.

I'd love to know what you think...

Many thanks in advance

weavejester commented 3 years ago

Thanks for the report. It sounds like the solution is to fix lein ring uberwar to include the group name of the artifact in the jar filename.

dpiliouras commented 3 years ago

That would be fantastic! 👍

dpiliouras commented 3 years ago

Hi again,

I attempted to pinpoint where the problem might occur, and I suspect it's this line. Given my original report, it would seem that dir-path ends up being nil, which comes straight from(.getParent jar-file). If that wasn't returning nil, then I expect the jar-entry to be WEB-INF/lib/Users/.../.m2/repository/ragtime/core/0.8.0core-0.8.0.jar. That's because of the following:

(-> "/Users/.../.m2/repository/ragtime/core/0.8.0/core-0.8.0.jar" io/file .getParent)
=> "/Users/.../.m2/repository/ragtime/core/0.8.0"
(str *1 "core-0.8.0.jar") ;; as shown on that line
=> "/Users/.../.m2/repository/ragtime/core/0.8.0core-0.8.0.jar"

What do you think?

weavejester commented 3 years ago

It's possible. The only way to be sure that's the problem would be to implement a fix and see if it works.

dpiliouras commented 3 years ago

Turns out the issue is with war/in-war-path, and in particular the call to .relativize(). Observe the following:

(let [jar-file (io/file "/Users/whatever/.m2/repository/ragtime/core/0.8.0/core-0.8.0.jar")
      dir-path (.getParent jar-file)]
  (war/in-war-path "WEB-INF/lib/" dir-path jar-file))
=> "WEB-INF/lib/core-0.8.0.jar"

From the JavaDocs:

  1. If either this URI or the given URI are opaque, or if the scheme and authority components of the two URIs are not identical, or if the path of this URI is not a prefix of the path of the given URI, then the given URI is returned.

  2. Otherwise a new relative hierarchical URI is constructed with query and fragment components taken from the given URI and with a path component computed by removing this URI's path from the beginning of the given URI's path.

It's obviously case 2 we're seeing here. I'll properly clone the project and try a few things tomorrow...

dpiliouras commented 3 years ago

Hi there,

I 've cloned the repo, and did a bit of research on WAR/MAINFEST files. Before attempting to write any code I want to make sure we're both on the same page. Please let me know if you disagree with any of the following:

So to summarise, we're trying to find the groupId, so that we can prepend it to the jar filename. We cannot rely on the path structure of each jar, because the :library-path hook allows for arbitrary paths to be provided, which could ultimately look very different that the local .m2 one. We also cannot rely on the jar file's manifest, because Implementation-Vendor-Id is not a mandatory attribute.

Given the above, I suspect that it will not be straight-forward to sort this out. I'd like to hear your thoughts before spending any more time on this, as I may be missing something big.

Many thanks in advance...

ps: purely an implementation question, but could you perhaps shed some light on the rationale behind the relativize() call? Given that it always receives the parent directory and the (full) jar-file path, the call to relativize() will always return the same thing File.getName would return. I'm just wondering...

dpiliouras commented 3 years ago

Hi again,

Assuming that all jars we're trying to package have a pom.properties inside them, here is an alternative uberwar/jar-entries impl:

(defn- find-pom-properties
  [jar-entries]
  (some
    (fn [^JarEntry je]
      (let [path-in-jar (.getName je)]
        (when (str/ends-with? path-in-jar "pom.properties")
          path-in-jar)))
    jar-entries))

(defn jar-entries* [war project]
  (doseq [^java.io.File jar-file* (jar-dependencies project)]
    (with-open [jar-file (JarFile. jar-file*)]
      (let [entries (enumeration-seq (.entries jar-file))
            pom-properties-jar-path (find-pom-properties entries)
            pom-properties-full-path (some->> pom-properties-jar-path
                                              (str "jar:file:" (.getAbsolutePath jar-file*) "!/"))
            {:strs [groupId artifactId version]} (when pom-properties-full-path
                                                   (with-open [in (.openStream (URL. pom-properties-full-path))]
                                                     (->> (doto (Properties.) (.load in))
                                                          (into {}))))
            war-path (->> (if (and groupId artifactId version)
                            [groupId \- artifactId \- version ".jar"] ;; found pom.properties - reconstruct the jar name to avoid collisions (issue #216)
                            [(.getName jar-file*)])                   ;; fallback to using the original jar name (best-effort approach)
                          (apply str "WEB-INF/lib/"))]
        (war/file-entry war project war-path jar-file*)))
    ))

Notice that it no longer calls war/in-war-path, and that it completely ignores the original jar-name (i.e. it rebuilds it from scratch). Something like this could work, but similarly to my earlier message, I'm not entirely sure that bundling a pom.properties file is mandatory. I suspect it isn't...

[EDIT]: I can confirm that the above generates war-paths like WEB-INF/lib/ragtime-core-0.8.0.jar - as opposed to WEB-INF/lib/core-0.8.0.jar. I can't emphasize enough that this won't work if there is no pom-properties in the JAR. I guess we can special case that.

dpiliouras commented 3 years ago

This is my refactored version of uberwar/jar-entries which makes contains-entry? redundant:

(defn jar-dependencies [project]
  (for [pathname (get-classpath project)
        :when (str/ends-with? pathname ".jar")]
    (io/file pathname)))

(def ^:private javax-servlet?  #{"javax/servlet/Servlet.class"})
(def ^:private pom-properties? #(str/ends-with? % "pom.properties"))

(defn- war-path-for-jar
  [^java.io.File jar]
  (with-open [jar-file (JarFile. jar)]
    (let [javax&pom (->> (.entries jar-file)
                         enumeration-seq
                         (map (fn [^JarEntry je] (.getName je)))
                         (filter (some-fn javax-servlet? pom-properties?)))] ;; 2 elements maximum
      ;; Servlet container will have it's own servlet-api impl
      (when-not (some javax-servlet? javax&pom)
        ;; we've confirmed that one of the two possible files is missing,
        ;; so if we find one (via `first`), it will be the pom.properties
        (let [pom-properties-full-path (some->> (first javax&pom)
                                                (str "jar:file:" (.getAbsolutePath jar) "!/"))
              {:strs [groupId artifactId version]} (when pom-properties-full-path
                                                     (with-open [in (.openStream (URL. pom-properties-full-path))]
                                                       (->> (doto (Properties.) (.load in))
                                                            (into {}))))]
          (->> (if (and groupId artifactId version)
                 [groupId \- artifactId \- version ".jar"] ;; found pom.properties - reconstruct the jar name to avoid collisions (issue #216)
                 [(.getName jar)])                         ;; fallback to using the original jar name (best-effort approach)
               (apply str "WEB-INF/lib/")))))))

(defn jar-entries
  "Populates the 'WEB-INF/lib/' directory of the given <war> with this project's dependencies (jars).
   If these contain a 'pom.properties' file, they will be copied into the war named as
   $GROUP-$ARTIFACT-$VERSION.jar (see `war-path-for-jar`), otherwise using their original filename."
  [war project]
  (doseq [jar-file (jar-dependencies project)]
    (war/file-entry war project (war-path-for-jar jar-file) jar-file)))

(comment
  ;; will only work under *NIX
  (let [clojure-jar (-> (System/getProperty "user.home")
                        (io/file ".m2/repository/org/clojure/clojure/1.10.1/clojure-1.10.1.jar"))]
    (= "WEB-INF/lib/org.clojure-clojure-1.10.1.jar" ;; <group-id>-<artifact-id>-<version>.jar
       (war-path-for-jar clojure-jar))
    )
  )

As you can see for jars without a pom.properties, behavior stays the same (i.e. the jar name stays as is). Let me know of your thoughts, and we can take it from there.

dpiliouras commented 3 years ago

For reference: https://github.com/dpiliouras/lein-ring/blob/master/src/leiningen/ring/uberwar.clj

dpiliouras commented 3 years ago

Hi again,

Hope you're staying sane & healthy in these crazy times.

Would you be interested in a PR for this, or should I look into alternatives for building our uberwars? I'm just asking because this has been kind of a blocker for us, and it's not the easiest thing to explain to my boss either. If you don't have the time to look at it, that's totally understandable, I will just need to figure something out. Similarly, if you think that the approach I'm taking won't work (or is sub-optimal), I would be happy to refactor, given some feedback.

Many thanks in advance 👍

weavejester commented 3 years ago

I'd certainly accept a PR, but unfortunately I haven't had time to look into this, so I don't know how long it'll take me to review. However, you can always deploy your fork to Clojars under a unique name (e.g. dipiliouras/lein-ring) to act as a stopgap until it's merged into the lein-ring package itself.

dpiliouras commented 3 years ago

Hi there,

I have deployed my own version of lein-ring (per your suggestion), which incorporates the fix(es) as seen in the PR above. The good news is that it appears to solve the issue. I am able to generate a 33mb uberwar containing bothduct.core and ragtime.core (same version). The unfortunate thing is that I haven't been able to successfully deploy that artifact on tomcat, but I am almost certain that the problem I'm having does NOT relate to my fixes.

I will create a separate issue/question on the duct page , but for the sake of completeness, here is the gist:

The only different between the first two VS the third, is that the duct-config for the third has no jetty component, and it's that component that references the top-level handler (via the :handler key). In the absence of a jetty component, the top-level handler remains un-initialised (its init-key method is never called). I've added log-statements in strategic places and I can see that the -main method (which as far as lein-ring is concerned is the servlet :init function) is indeed being called, but then the logging stops (no other component is initialised). So the situation begs the question:

Am I right in thinking that if a (non-deamon) duct-component is not referenced by any other components, duct will consider it unneeded, and not initialise it?

As always many thanks in advance...

weavejester commented 3 years ago

Am I right in thinking that if a (non-deamon) duct-component is not referenced by any other components, duct will consider it unneeded, and not initialise it?

By default, yes, because by default the -main function restricts execution to keys that derive from :duct/daemon:

(defn -main [& args]
  (let [keys     (or (duct/parse-keys args) [:duct/daemon])
        profiles [:duct.profile/prod]]
    (-> (duct/resource "foo/config.edn")
        (duct/read-config)
        (duct/exec-config profiles keys))))

If you want other keys to be initiated, then you should supply those keys instead.

dpiliouras commented 3 years ago

Yeah, that was it! All I had to do in the end, was to define a new :init function, which simply calls -main with a String argument (the top-level handler component). I may be wrong, but I wasn't able to find a single 'duct-on-tomcat' example doing this. I did find a slack discussion at some point where the -main was being shown as the :init fn, but that was the closest thing I could find. Honestly, I can't thank you enough for this...

Summing up, you can now consider/review my PR with a greater level of confidence than a few days ago, as I have confirmed that it works (in two different projects actually). 👍

weavejester commented 3 years ago

Duct was designed to be run in its own server, so you're breaking new ground by running it under Tomcat. Also, thanks for the PR - I'll try to find some time to look through it, but at a glance it looks like it'll need a fair bit of cleaning up, so the review will be lengthy.