GateNLP / gate-core

The GATE Embedded core API and GATE Developer application
GNU Lesser General Public License v3.0

Export for cloud can still produce a zip that won't work when fully offline #55

Closed ianroberts closed 6 years ago

ianroberts commented 6 years ago

When different paths through the dependency tree lead to different versions of the same transitive dependency, Aether requires access to the POM files of all the candidate versions at resolution time, but only to the JAR of the version that is finally selected. The SimpleMavenCache mechanism only caches the final selected version. It is therefore possible for loading the exported app to require files that are not in the maven-cache.gate, in which case loading fails when there is no access to Maven Central.
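To make the gap concrete, here is a minimal sketch (not GATE or Aether code) that lists the files resolution would touch for one artifact and subtracts what a "selected version only" cache would hold. The class and method names are hypothetical, and it assumes the cache stores both the POM and the JAR of the selected version. The only thing taken as given is the standard Maven 2 repository path layout.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: files an offline resolve needs vs. what a selected-version-only cache holds. */
final class OfflineCacheGap {

    /** Relative path of an artifact file in the standard Maven 2 repository layout. */
    static String repoPath(String groupId, String artifactId, String version, String extension) {
        return groupId.replace('.', '/') + "/" + artifactId + "/" + version + "/"
                + artifactId + "-" + version + "." + extension;
    }

    /** Files resolution touches: one POM per candidate version, plus the JAR of the selected one. */
    static List<String> neededForResolution(String groupId, String artifactId,
                                            List<String> candidates, String selected) {
        List<String> needed = new ArrayList<>();
        for (String version : candidates) {
            needed.add(repoPath(groupId, artifactId, version, "pom"));
        }
        needed.add(repoPath(groupId, artifactId, selected, "jar"));
        return needed;
    }

    /** Assumption: the cache holds only the POM and JAR of the selected version. */
    static List<String> cachedSelectedOnly(String groupId, String artifactId, String selected) {
        List<String> cached = new ArrayList<>();
        cached.add(repoPath(groupId, artifactId, selected, "pom"));
        cached.add(repoPath(groupId, artifactId, selected, "jar"));
        return cached;
    }

    /** Needed files that the cache does not contain. */
    static List<String> missing(List<String> needed, List<String> cached) {
        List<String> result = new ArrayList<>(needed);
        result.removeAll(cached);
        return result;
    }
}
```

With candidate versions 8.0.1 and 7.1.2 of org.eclipse.collections:eclipse-collections and 7.1.2 selected, the only missing file in this sketch is the 8.0.1 POM, which is the kind of file an offline load would then try to fetch from central.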

The specific example I tripped on is the jdbclookup plugin (0.3-SNAPSHOT, as used by Bio-YODIE), which has a direct dependency on eclipse-collections version 8.0.1, but also depends on mapdb version 3.0.2, which in turn has a version range dependency on eclipse-collections-forkjoin version [7.0.0,7.20.0) (and the selected version of that in turn depends on the matching eclipse-collections). For some reason Aether chooses version 7.1.2 of eclipse-collections to actually load (even though we directly depend on 8.0.1!), and this is the version that gets packed into the zip file when you export for cloud. But if you then try to load the app on a disconnected machine you get:

org.eclipse.aether.collection.DependencyCollectionException: Failed to collect dependencies at uk.ac.gate.plugins:jdbclookup:jar:0.3-SNAPSHOT -> org.eclipse.collections:eclipse-collections:jar:8.0.1
...
Caused by: org.eclipse.aether.resolution.ArtifactResolutionException: Could not transfer artifact org.eclipse.collections:eclipse-collections:pom:8.0.1 from/to central (http://repo1.maven.org/maven2/): Connection refused (Connection refused)

That is, it is still trying to resolve 8.0.1 from central. In this particular case it's worse because of the version range: if I load the app on a connected machine with an empty ~/.m2/repository, I find it has had to download from central all the POMs for 7.0.0, 7.0.1, 7.0.2, 7.0.3, 7.1.0, 7.1.1 and 8.0.1 (plus 7.1.2, which is in the maven-cache.gate).
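As a rough illustration of why the range drags in so many POMs, the sketch below filters a list of versions against the half-open range [7.0.0,7.20.0). It is a simplification and not Aether's actual version ordering: it only handles purely numeric dotted versions, the class name is hypothetical, and the input list is just the set of versions mentioned above (a real resolver would read the available versions from repository metadata).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: which versions fall inside a half-open range such as [7.0.0,7.20.0). */
final class RangeCandidates {

    /** Compares purely numeric dotted versions such as "7.1.2"; qualifiers are not handled. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components are treated as 0, so "7.1" equals "7.1.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the versions v with lower <= v < upper, preserving input order. */
    static List<String> inRange(List<String> available, String lower, String upper) {
        List<String> matches = new ArrayList<>();
        for (String version : available) {
            if (compare(version, lower) >= 0 && compare(version, upper) < 0) {
                matches.add(version);
            }
        }
        return matches;
    }
}
```

Every 7.x version in the list matches the range, so each one is a candidate whose POM has to be readable during resolution, while 8.0.1 falls outside the range and is only pulled in by the direct dependency.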

I'm honestly not sure what the best approach would be here. Rather than caching only the files we know we need, it seems we really need to cache every file Aether touches during the resolution process. But we can't simply re-run resolution with a different local repository folder, because then you wouldn't be able to cache JARs you had built locally (which exist only in your ~/.m2 and not in any remote repo), unless there's a way to treat ~/.m2/repository as if it were a remote repo for the purpose of this re-resolve.