MontrealCorpusTools / PolyglotDB

Language data store and linguistic query API
MIT License

Compatibility with recent OpenJDK & Neo4j versions #185

Closed james-tanner closed 2 months ago

james-tanner commented 6 months ago

Hi Michael,

I've just been going through the process of installing Polyglot on a new server, and I think some issues with outdated dependencies are making installation/maintenance quite difficult.

OpenJDK: the install guide requires running sudo apt-get install openjdk-11-jdk-headless, but OpenJDK 11 is no longer actively maintained or supported by Oracle, and attempts to install it via the openjdk repository are a dead end:

sudo add-apt-repository ppa:webupd8team/java

The Oracle JDK License has changed for releases starting April 16, 2019.

The new Oracle Technology Network License Agreement for Oracle Java SE is substantially different from prior Oracle JDK licenses. The new license permits certain uses, such as personal use and development use, at no cost -- but other uses authorized under prior Oracle JDK licenses may no longer be available. Please review the terms carefully before downloading and using this product. An FAQ is available here: https://www.oracle.com/technetwork/java/javase/overview/oracle-jdk-faqs.html

Oracle Java downloads now require logging in to an Oracle account to download Java updates, like the latest Oracle Java 8u211 / Java SE 8u212. Because of this I cannot update the PPA with the latest Java (and the old links were broken by Oracle).

For this reason, THIS PPA IS DISCONTINUED.

It's possible to still find openjdk-11 and install manually, but I'm not sure it's something that normal users/admins should be doing.

Why not just use a modern OpenJDK (e.g. 17+)? The version of Neo4j used in Polyglot conflicts with any version of OpenJDK newer than 11, which causes the pgdb server to fail on startup:

2024-03-22 15:57:43.900+0000 INFO  Starting...
Exception in thread "main" java.lang.LinkageError: Cannot to link java.nio.DirectByteBuffer
        at org.neo4j.internal.unsafe.UnsafeUtil.<clinit>(UnsafeUtil.java:124)
        at org.neo4j.memory.RuntimeInternals.guessHeaderSize(RuntimeInternals.java:158)
        at org.neo4j.memory.RuntimeInternals.<clinit>(RuntimeInternals.java:53)
        at org.neo4j.memory.HeapEstimator.<clinit>(HeapEstimator.java:103)
        at org.neo4j.internal.collector.RecentQueryBuffer.<clinit>(RecentQueryBuffer.java:37)
        at org.neo4j.graphdb.factory.module.GlobalModule.<init>(GlobalModule.java:211)
        at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.createGlobalModule(DatabaseManagementServiceFactory.java:259)
        at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:126)
        at org.neo4j.server.CommunityBootstrapper.createNeo(CommunityBootstrapper.java:36)
        at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:134)
        at org.neo4j.server.NeoBootstrapper.start(NeoBootstrapper.java:90)
        at org.neo4j.server.CommunityEntryPoint.main(CommunityEntryPoint.java:34)
Caused by: java.lang.IllegalAccessException: module java.base does not open java.nio to unnamed module @11b03c1f
        at java.base/java.lang.invoke.MethodHandles.privateLookupIn(MethodHandles.java:259)
        at org.neo4j.internal.unsafe.UnsafeUtil.<clinit>(UnsafeUtil.java:107)
        ... 11 more
2024-03-22 15:57:44.097+0000 INFO  Neo4j Server shutdown initiated by request
2024-03-22 15:57:44.098+0000 INFO  Stopped.
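A pre-flight check along the following lines could catch this before Neo4j is started. This is only a sketch: the function names parse_java_major and installed_java_major are hypothetical and not part of PolyglotDB, and the assumption that exactly Java 11 is required is based on the failure above rather than on anything in the install guide.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    major = int(match.group(1))
    # Java 8 and earlier use the legacy 1.x scheme (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def installed_java_major() -> int:
    """Run `java -version` and return the major version of the JVM on PATH."""
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr)


if __name__ == "__main__":
    major = installed_java_major()
    if major != 11:
        # Assumption: the bundled Neo4j only starts cleanly on Java 11.
        print(f"Java {major} found; the bundled Neo4j appears to need Java 11")
    else:
        print("Java 11 found")
```

Keeping the parsing separate from the subprocess call means it can be exercised without a JVM installed.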

This could be solved by using a newer version of Neo4j (e.g. 5+). However, that results in an Invalid Constraint Syntax error from Neo4j when importing, the same as in this thread, which attributes it to discrepancies between the 4.x and 5+ versions.
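To illustrate the kind of discrepancy involved: Neo4j 5 dropped the older ON ... ASSERT form of CREATE CONSTRAINT in favour of FOR ... REQUIRE. I have not checked which statements PolyglotDB actually issues, so the helper name, the label, and the property below are all hypothetical; this is a rough sketch of how the syntax could be chosen per version, not the project's code.

```python
def uniqueness_constraint(label: str, prop: str, neo4j_major: int) -> str:
    """Build a Cypher uniqueness constraint suited to the given Neo4j major version."""
    if neo4j_major >= 5:
        # Neo4j 5 only accepts the FOR ... REQUIRE form.
        return f"CREATE CONSTRAINT FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
    # Older releases use the ON ... ASSERT form, which Neo4j 5 rejects.
    return f"CREATE CONSTRAINT ON (n:{label}) ASSERT n.{prop} IS UNIQUE"


if __name__ == "__main__":
    # "Corpus" and "name" are placeholder values for illustration only.
    print(uniqueness_constraint("Corpus", "name", 4))
    print(uniqueness_constraint("Corpus", "name", 5))
```

If the constraint statements turn out to be the only incompatibility, a switch like this might be enough to support both versions, but other Cypher changes between 4.x and 5+ may also need attention.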

This makes me wonder whether Polyglot should be updated to use currently supported versions of its dependencies. As noted in my PR, the version of Neo4j used in the pgdb installation also installs a vulnerable version of log4j. I'm not sure exactly how simple this would be (I guess it would involve updating the Cypher queries within pgdb), but it might be something I can help with if you think this is a good idea and worth doing.

Thanks!

James