trellis-ldp / trellis

Trellis is a platform for building scalable Linked Data applications
https://www.trellisldp.org
Apache License 2.0
105 stars 21 forks source link

Add support for JPMS #66

Closed christopher-johnson closed 5 years ago

christopher-johnson commented 6 years ago

I am experimenting with supporting the module system with trellis. I will use this as a tracking task. One can view the progress here

Here is what I have discovered so far: 1) Error occurred during initialization of boot layer java.lang.module.FindException: Unable to derive module descriptor for /home/christopher/.gradle/caches/modules-2/files-2.1/org.apache.geronimo.specs/geronimo-annotation_1.2_spec/1.0-alpha-1/804747c40f1145ae9cc13cb9e927fca82e6e3c1b/geronimo-annotation_1.2_spec-1.0-alpha-1.jar Caused by: java.lang.IllegalArgumentException: geronimo.annotation.1.2.spec: Invalid module name: '1' is not a Java identifier This seems to be because of an "illegal" artifactID ("geronimo-annotation_1.2_spec" has a "."). This is a dependency of apache tamaya. The module name is derived from the jar if there is not an automatic module name in the manifest. Not sure how to move forward with this...but it is a problem for all geronimo specs...maybe file an issue upstream?

2) gradle jar tasks seem to break module-info resolution. Removing them allows the build to proceed. 3) the servicemix bundle wrapper for javax.inject does not resolve as a module. Quick fix is to add javax.inject as a dependency and add it as a requirement instead. Not clear what this will do in OSGI.

christopher-johnson commented 6 years ago

1) is resolved by building fork with locally built geronimo dependency that includes this in pom.xml

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifestEntries>
                            <Automatic-Module-Name>org.apache.geronimo.specs.annotation</Automatic-Module-Name>
                        </manifestEntries>
                    </archive>
                </configuration>
            </plugin>

Will suggest this as a fix in geronimo JIRA.

  1. Found new issue with gradle ClassCastException relating to logback and slf4j. See https://github.com/gradle/gradle/issues/2657. This is a substantial blocker, so we will have to wait for this to be fixed to move forward I guess.

What is odd is that gradle does not choke on slf4j-api and logback-classic in ldp-client (???) with a single module build configuration.

acoburn commented 6 years ago

@christopher-johnson I ran into that same gradle issue a while ago when trying to build the various trellis modules. IIRC, this only affected the testing apparatus, which is where logback would come into play. At the time, I also tried using slf4j-api/1.8.0-alpha, though there have been some more recent releases; I basically came to the conclusion that the build tooling (both gradle and maven) is still catching up with the newer Java environments.

christopher-johnson commented 6 years ago

@acoburn thanks for the tip about the testing apparatus. I worked around the ClassCastException by excluding the tests for now.

$ gradle build -x test

and using this branch it is now possible to build trellis as JDK 9 modules!

Notes: 1) a test dependency on activemq-broker and a main on activemq-client is the perfect example of a split package conundrum. The solution that I opted for was to patch the activemq-broker, but that then removes the dependency for the TrellisEventTest. So, this test had to be removed from the build... activemq-all is a split package nightmare. 2) Including jars of tamaya 0.3-incubating is not sustainable. I tried to build master, but that did not work because of a Provider class in the new tamaya.spisupport module which cannot be seen by tamaya.core. (The command to check modules in a jar is java -p <file.jar> --list-modules). Also, you may notice that the version number 0.3-incubating will not work in bnd, so there seem to be a few problems with their OSGI build. Not clear if they intend on supporting module-info.java eventually ... 3) I will follow up on the geronimo issue

christopher-johnson commented 6 years ago
  1. tried running the trellis-app distribution of the JPMS build and discovered another geronimo issue, but with a transitive dependency of activemq-client : geronimo-j2ee-management_1.1_spec-1.0.1.jar. Forking and building a new activemq-client will be a pain...

Including a dependency on activemq-client is a consequence of jms support, no? I remember that it was discussed some time ago about including kafka alternative event systems by default. I am not so sure that activemq is maintainable going forward, and I would rather remove it from the kafka centric trellis-app build that I prefer.

acoburn commented 6 years ago

Including a dependency on activemq-client is a consequence of jms support, no?

That is correct.

I am not so sure that activemq is maintainable going forward, and I would rather remove it from the kafka centric trellis-app build that I prefer.

I think there are various opinions on this. The direction I'd like to see Trellis go involves much more CDI and more OSGi so someone's "application" can be easily assembled with very little code. For now, though, there is also a good case to be made for making the default deployable artifact simple, and that's something JMS has going for it.

It may be worth noting that, after the 0.6 release of the core artifacts, there will be a release of the Kafka-centric Trellis.

christopher-johnson commented 6 years ago

While I agree that CDI and OSGI are important, I have the feeling that developing support for the JPMS should be a first priority. I also prefer to find the libraries that have JPMS issues sooner than later (and notify the developers about them) ...

Note: Item 3) in first comment is resolved by changing:

api group: 'org.apache.servicemix.bundles', name: 'org.apache.servicemix.bundles.javax-inject', version: javaxInjectVersion

to

api group: 'org.glassfish.hk2.external', name: 'javax.inject', version: javaxInjectVersion

dropwizard.metrics transitively depends on multiple versions of javax.inject, so it is also necessary to exclude javax.inject:javax.inject like in trellis-app/build.gradle with this:

 configurations {
    compile {
        exclude group: 'javax.inject', module: 'javax.inject'
    }
}

and trellis-http/build.gradle with this:

configurations {
    api {
        exclude group: 'javax.inject', module: 'javax.inject'
    }
}
christopher-johnson commented 6 years ago

update on this: trellis-app with this JPMS build will run and accept requests. I will do a more complete functional test soon. Runtime access violations seem to lurk in the shadows, so I would not be surprised if the module-info.java files need further (provides .. with, uses, and opens) provisions

Notes: 1) the gradle application plugin start script does not work. See this for modifications 2) At least a couple of PRs are the result of this exercise.

Please review the changes and advise on next steps. Thanks!

christopher-johnson commented 6 years ago

Filed upstream Geronimo spec naming issue here: https://issues.apache.org/jira/browse/GERONIMO-6601

acoburn commented 6 years ago

@christopher-johnson this is great. I have changed the servicemix-wrapped javax-inject dependency to simply use the javax.inject:javax.inject:1 one, and that works just fine. See this commit: https://github.com/trellis-ldp/trellis/commit/e9917fcb95d98cffdef0326169814c6fad503e3d. That should address the issue of split packages, but if there is more to be done for this, I'd welcome suggestions or a PR.

Also, good catch on the package resources in trellis-app. I think that resource can simply be removed; as it is, it is clearly wrong. Would you be able to submit a PR for that? (That way you can verify that it all still works for the JPMS version). There is a similar resource in the trellis-http module, which also is (likely) not needed -- at some point that was moved into the trellis-io-jena module, so removing the template file from the http layer would be good, too. In passing, I'd mention that the HTML display needs a lot of work, and that all may end up being re-written (though it's a pretty low priority at the moment).

Also, I noticed that your changes fixed some typos, omissions and redundancies in the various build.gradle scripts. Feel free to send a PR with any of those fixes -- they can all go right into the codebase.

Ideally, if we can get a commit that only includes the module-info.java files and limited modifications to the build.gradle scripts, then we can push this to a jpms or jdk9 branch on master. That would make it easier to move to a jpms-enabled codebase once the next LTS Java version is released.

christopher-johnson commented 6 years ago

FYI: In testing the JPMS build, I have found a lurking (and silent) issue with org.trellisldp.event that prevented serialization of events. I added

opens org.trellisldp.event to com.fasterxml.jackson.databind;

and it works now.

In order to figure this out, I added a LOGGER to org.trellisldp.event and in the catch block before the return empty() logged the JsonProcessingException message. Might be a useful addition to that module.

Regarding a limited commit for a JPMS branch, we are stuck at the moment having to work around geronimo-spec, which required me to remove the activemq dependency. I submitted a PR to geronimo and it will probably get merged once a few names are changed/confirmed.

The gradle chainsaw "hacks" could be further minimized upstream. One bugaboo is the split packages introduced with the org.apache.httpcomponents and jena osgi components. I do not understand why there needs to be a "non-osgi" version of these libraries?

Otherwise, I feel like the JPMS build is working OK (all of the client tests pass). It might be interesting to do some jmeter tests comparing 0.6.0 configurations (on 1.8 and 1.9). BTW, there is a new 1.8 docker image with the 0.6.0 release artifact available.

christopher-johnson commented 6 years ago

one other thing. I merged the new dropwizard 1.3. For some reason, dropwizard.logging 1.3 now includes logback-core. This is a split package problem for just about everything. An issue should be raised with upstream about that.

christopher-johnson commented 6 years ago

dropwizard issue opened here: https://github.com/dropwizard/dropwizard/issues/2312

christopher-johnson commented 6 years ago

update on this: I have been working with jena upstream to see how to eliminate jena-osgi and its problematic dependencies from the jpms build. In evaluating the jena code base, I have learned a few things about modules, and also that the current structure of the jena "modules" will not work in a jpms build.

Andy has merged both PRs that poke at the problem, but do not solve it. The main issue is what one can call an "ambiguous module", and this is how most of the code in jena is organized. So, for example, org.apache.jena.arq does not actually contain a root package called org.apache.jena.arq but rather "subpackages". This is confusing. The module system (imo correctly) enforces the semantics of a jar/module name and its package root, so the expectation is that org.apache.jena.arq will have a root package org.apache.jena.arq and export org.apache.jena.arq.riot, etc. Unfortunately, changing the code structure breaks all existing imports on the ambiguous package definitions. However, this is what needs to be done to support JPMS, so I created a jena fork called "jena-redux" that has the required code structure (and is easy to build with gradle).

I will continue to work upstream and hope that a compromise can be reached (perhaps with a distinct branch). I also had to fork commons-rdf-jena to change the imports which is relatively easy. However, maintaining forks is not a sustainable direction, but just a short term fix until the larger upstream issues are addressed. JPMS is a big change, but I am convinced that it provides enormous clarity for developers by isolating modules into distinct classpaths. Also, the module-info.java provides an explicit and clear declaration of compilation requirements (that is often fuzzy and hard to interpret because of erroneous build scripts).

The builds are passing for the latest JPMS branch commits, so I guess I can now publish SNAPSHOT artifacts for use by other implementations, like the trellis-client.

acoburn commented 6 years ago

It is very unlikely that the jena-osgi dependency will be removed from Trellis. And having Trellis use a fork of an existing open source library (be it commons-rdf or jena) is a non-starter for reasons that should be obvious.

If that is an issue for the trellis-client, I would recommend that the client re-implement its own IOService using an appropriate RDF serialization backend.

christopher-johnson commented 6 years ago

That is disappointing. I could accept this if jena-osgi were used only in an OSGI environment, but it is not. JPMS, not OSGI, is the future, I am totally convinced. The JPMS build of Jena is totally possible and nearly within reach. The "fork" could actually be a major release, if there were a consensus that it is a good decision to make the break. The main work is with the Jena testing apparatus, which is in need of an update and general restructuring. You could actually do great work by helping Andy sort it out.

I guess what is obvious is relative. What is obvious to me is that implementing a brand new "long term archival" system using legacy APIs is a bad decision. Trellis is state of the art and you are a great programmer with a lot of insight. Trellis client is nothing compared to what you have done. My main concern remains that I need a system that is modular and provides SPARQL. This is the major value of Jena, and it took years of work to achieve the current functionality. I cannot just reinvent a new IOService. The path that makes the most sense to me is to make Jena work with my requirements, and this requires minor structural (not logical!) modification. I cannot see it any other way.

ajs6f commented 6 years ago

Regarding Jena:

The "fork" could actually be a major release, if there were a consensus that it is a good decision to make the break.

There is no such consensus, there is currently no progress towards such consensus, and there is a long way to go before Jena gets anywhere near such consensus. I would suggest you take this up on the dev@jena list. You could start by continuing the conversation with Andy and responding to some of his advice. (Hopefully, as he asks, with some working code that we can all discuss.)

Jena is old, and popular (well, popular for anything having to do with RDF, anyway) and has a (relatively) large install base. It's not going to make sudden sharp turns. It can make grand changes in direction, but that doesn't happen without a lot of preparatory work.

christopher-johnson commented 6 years ago

Right. I will do tomorrow. Today, I worked on the test compile in JPMS. Without the module definitions, everything compiles and the tests run (in jdk 10) but there are around 200 test failures Excluding the tests, the libs can be built. With the modules, I can compile and test compile iri, base and core, but not arq. There are tests that extend TestCase in the Junit3 style with parameter constructors and this does not work with the JUnitPlatform runner. I did some research on Parameter Resolution with JUnit5 extensions that could work for TestItem. The EarlReport might extend TestReport as well. The whole scenario requires a global overview and I am not qualified to grasp the nature of all the tests yet. I presume that the tests should stay with their packages, but perhaps a new test module (ala trellis-test) could contain the abstractions. JPMS also does not allow test visibility between modules, which makes sense to me. But this breaks some of the "BaseTest" inheritance patterns in Jena.

At any rate, I will respond to the list tomorrow even if it is just a rough first draft. The other possibly contentious aspect is adopting Gradle. I do not even know if there is a module builder for Maven yet... and Maven currently lacks the necessary dep management functionality (and maven test does not work in jdk 9+).

Here are a couple of JPMS related references that are interesting: https://mreinhold.org/blog/jigsaw-complete http://openjdk.java.net/projects/jigsaw/spec/sotms/

One other item of interest is the "expert group" in the jigsaw spec. Neil Bartlett (Paremus) Wayne Beaton (Eclipse) Hans Dockter (Gradleware) Tim Ellison (IBM) Rémi Forax Bob Lee David Lloyd (Red Hat) Mark Reinhold (Oracle) Robert Scholte

Neil Bartlett "wrote the book" on OSGI...

acoburn commented 6 years ago

The main reason I chose Java as a platform for implementing Trellis has to do with its mature ecosystem and wide adoption. This provide a huge degree of stability and reliability. So when a significant change comes along -- in this case JPMS -- it takes time for all of that to catch up, whether it is the build system, the testing framework or the explicit dependencies. I actually see this as a good thing.

I agree that JPMS is where Java is heading, but it will take time before the Java ecosystem catches up. And much of this change will need to start with widely used libraries, such as what are produced by Apache Commons, Guava, JUnit, etc. It will be a deliberate, step-by-step evolution. It will take time, but that change is happening. From what I have seen in the development priorities of these core libraries (e.g. commons-lang3, guava, jackson), the first step consists in modularizing their own code base and generating Automatic-Module-Name metadata. That allows the code to exist in a pre-JDK9 environment while also playing by the rules of a JDK9+ environment. I am pretty confident that this transition period will be a long transition period.

From the perspective of Trellis, the code is already fairly well modularized and AMN metadata is already present. So the main thing that needs to happen from my perspective is that we need to be patient while the core dependencies move in that same direction. It will happen; it won't happen overnight, and for some code bases, it may require a considerable amount of work.

For Trellis, I am very interested in the "long game", and much of that involves following best practices and established design patterns. It is a non-goal of Trellis to play a role in establishing these patterns, and so I am reluctant to be the first one to dive into the latest JDK features.

In a word: JPMS will happen in Trellis, but it may be some (considerable) time before it is fully integrated.

christopher-johnson commented 6 years ago

here is a dependency graph of trellis-app-triplestore for interest: trellis-app-triplestore-graph

BTW, I am done working on JPMS for a while and I have learned a lot with this exercise. You are right that it is a question of patience and time. But, a JPMS application does not have to wait for every dependency to be modularized, since even a normal jar can be used as a module (if it has a standard package structure). It is just unfortunate for trellis-app-triplestore that Jena is not ready yet.

acoburn commented 5 years ago

@christopher-johnson you may be interested in branch-trellis-66, which implements JPMS-modularity in Trellis. The key thing here is that the code in that branch still builds in a non-JPMS context: one must supply a -Pjpms flag to build the code as modules. The other key thing is that I don't repackage any of the upstream dependencies.

I would note that I will still be releasing Trellis artifacts as Java 8-compatible for the foreseeable future, but this at least gets us closer to having Java modules in Trellis. The other item I'd mention is that I'm not using any plugin for this, which is mostly because a) I wanted complete control over the build commands and b) I wanted to better understand what was going on "under the hood". That said, I'm not against using a plugin for this, provided that the plugin is maintained.

I would also add that your jpms branch was very useful in my work on this.

acoburn commented 5 years ago

Update: all of the "modularizable" subprojects have been modularized. This excludes:

Everything else can now be built as a java 11 module with the -Pjpms flag.

christopher-johnson commented 5 years ago

nice work. only noticed a couple of issues. it looks like qpid-broker is split, so I had to exclude trellis-amqp from the build. also, I had to do a global exclusion for javax.inject to avoid an ambiguous module reference error in my IDE. everything else compiles and tests fine.

acoburn commented 5 years ago

Yes, the qpid-broker has a split-package issue. I thought I handled that with these acrobatics, but maybe there's a better way to do that.

christopher-johnson commented 5 years ago

btw, I also see that there are several "Package 'x is declared in module 'x', but module 'org.trellisldp...' does not read it" errors reported in my IDE. If I remember, these missing declarations in the module-info.java may cause issues at runtime, that might not be noticeable in the tests.

acoburn commented 5 years ago

I've just been using the output from gradle, so if your IDE has extra information about additional modules that should be declared, feel free to make a PR targeting branch-trellis-66. Or if you could add that information in a comment, that would be great.