Closed ethanhkim closed 1 month ago
I think it's a change to log4j since this was originally written (e.g. similar to https://stackoverflow.com/questions/66086690). We'll get on it. Apologies as there is likely to be a short delay.
We should also double-check that Java compilation is checked in the workflow (inc. for Docker), though I thought it was already. Perhaps we have a Java version pinned before this change.
Thanks for the quick response! Really appreciate it.
Our Docker image (based on python:3.8-slim-buster, i.e. Debian 10) uses GATE 8.6.1, which ships with log4j-1.2.17.jar. The Dockerfile runs crate_nlp_build_gate_java_interface, and we test this in at least one of our workflows.
GATE 9.0.1 appears to be the latest version (albeit from March 2021) and looking at the bundled libraries, log4j has been replaced with log4j-over-slf4j. My hunch is that with log4j gone from the GATE lib directory, the build script is trying to use whatever version it can find on the system, which for @ethanhkim is a 2.x version with the API change.
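One way to check that hunch would be to ask the JVM which logging classes it can actually see and where they come from. The following is only a rough diagnostic sketch, not part of CRATE: the class name LoggingClasspathProbe and the helper locate() are made up for illustration, and it assumes you run it with the same classpath that the build script passes to javac (e.g. the GATE lib directory).

```java
import java.security.CodeSource;

public class LoggingClasspathProbe {
    /**
     * Returns the location a class is loaded from, or null if the class
     * cannot be found on the current classpath. (Hypothetical helper.)
     */
    static String locate(String className) {
        try {
            // initialize=false: look the class up without running its static initialisers.
            Class<?> cls = Class.forName(
                className, false, LoggingClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself usually have no code source.
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.log4j.Logger",             // log4j 1.x API (also supplied by log4j-over-slf4j)
            "org.apache.logging.log4j.LogManager", // log4j 2.x API
            "org.slf4j.LoggerFactory",             // SLF4J
            "ch.qos.logback.classic.Logger",       // Logback
        };
        for (String name : candidates) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "not found" : where));
        }
    }
}
```

If org.apache.log4j.Logger turns out to come from a log4j-over-slf4j jar, or from somewhere outside the GATE lib directory, that would be consistent with the explanation above; if it comes from log4j-1.2.17.jar, the cause is probably something else.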
The short-term fix is to use GATE 8.6.1, or to run CRATE under Docker (https://crateanon.readthedocs.io/en/latest/installation/docker.html).
In the longer term, we shouldn't be using log4j 1.x as it isn't supported any more.
We could drop support for all but the latest release of GATE (9.0.1). This version has 46 vulnerabilities, compared with 115 in 8.6.1.
We could make CrateGatePipeline.java work with both versions of GATE by making sure the correct version of whatever dependencies we use are pulled in (possibly with Maven or similar).
Thanks for the detailed response! I'll give installing GATE 8.6.1 a go and see if the issue resolves in the short term.
I guess it'd be good to separate whatever GATE wants from what we want, e.g. by pinning a log4j (or similar module) version. Our code imports modules from core Java, log4j, and GATE. I'm not sure whether this can all co-exist happily, if e.g. we specified a URL to the class loader (old example at https://stackoverflow.com/questions/6105124/); this (https://boyl.es/post/two-versions-same-library/) suggests Java doesn't support loading two versions of one class (e.g. if we loaded one logger but GATE wants another), but can Maven get round this problem (same link)?
I think we can fix this by moving the log4j configuration to a file. This should remove the incompatible code. With GATE 9.x it appears that the calls to log4j will be routed through SLF4J to Logback, so if we provide an equivalent Logback configuration, it should all work. I'm trying this out on the later-gate-dev branch.
Hello,
I am trying to build the GATE interface through
crate_nlp_build_gate_java_interface
and I am consistently running into this error message:
Currently running Python 3.9.18, crate-anon 0.20.3, Java 1.8.0.402, GATE v9.0 on AlmaLinux Release 9.3. Any help would be greatly appreciated!