matsim-org / matsim-libs

Multi-Agent Transport Simulation
www.matsim.org
461 stars 436 forks source link

update to geotools 31 (resolve transitive dependency conflicts) #3272

Closed nkuehnel closed 1 month ago

nkuehnel commented 1 month ago

This PR updates the geotools version to 31.0

and sharing a little post mortem of a few lost hours on our side :)

It took @mfrawley-moia and me quite a while to pinpoint the cause of an issue where our CI on github wouldn't run through with the following exception:

2024-05-15T16:10:31,111  INFO ControlerListenerManagerImpl:269 calling notifyAfterMobsim on org.matsim.core.scoring.EventsToActivities$1 with priority 0.0
2024-05-15T16:10:31,120  INFO ControlerListenerManagerImpl:272 [it.0] all ControlerAfterMobsimListeners called.
2024-05-15T16:10:31,120 ERROR MatsimRuntimeModifications:76 Getting uncaught Exception in Thread main
java.lang.NoClassDefFoundError: org/apache/commons/io/function/IOStream
2024-05-15T16:10:31,124  INFO MatsimRuntimeModifications:80 S H U T D O W N   ---   start shutdown.
    at org.apache.commons.csv.CSVPrinter.printRecord(CSVPrinter.java:267) ~[commons-csv-1.11.0.jar:1.11.0]
    at org.apache.commons.csv.CSVPrinter.printRecord(CSVPrinter.java:285) ~[commons-csv-1.11.0.jar:1.11.0]
    at org.apache.commons.csv.CSVPrinter.<init>(CSVPrinter.java:117) ~[commons-csv-1.11.0.jar:1.11.0]
    at org.matsim.contrib.ev.stats.EnergyConsumptionCollector.notifyMobsimBeforeCleanup(EnergyConsumptionCollector.java:70) ~[ev-moia-latest-SNAPSHOT.jar:?]
    at org.matsim.core.mobsim.qsim.MobsimListenerManager.fireQueueSimulationBeforeCleanupEvent(MobsimListenerManager.java:99) ~[matsim-moia-latest-SNAPSHOT.jar:?]
    at org.matsim.core.mobsim.qsim.QSim.cleanupSim(QSim.java:351) ~[matsim-moia-latest-SNAPSHOT.jar:?] 

Of course all tests in matsim-org and our local versions of the repo went through :D It seemed like the CSVPrinter in the org.apache.commons-commons-csv dependency cannot find the IOStream import. This import statement was added in a recent release of commons-csv that was updated in MATSim by dependabot in https://github.com/matsim-org/matsim-libs/commit/61c6a35e6a2da8dd92681012e101034aa91c87e4 a few weeks ago.

The IOStream itself lives in the commons-io:commons-io package, with the latest release of 2.16.1, which is correctly referred to by the latest commons-csv dependency.

However - after a while we figured that multiple versions of commons-io were present in our project. After a while we looked at the output of mvn dependey:tree -verbose and saw that there were at least 3 or 4 conflicting versions of commons-io, most of them defined in the geotools sphere, which still pointed to version 2.10 instead 2.16.1.

So, here we are, geotools 31.0 correctly points to 2.16.1 of commons-io. Not sure how we could prevent these annoying issues in the future...?

No wonder I'm going bald!

kt86 commented 1 month ago

I had the same issue in an sub-repo) a few days ago. There it was solvable by manually setting the version for commons-ioas follows: `

commons-io commons-io 2.16.1
</dependencyManagement>

`

This helped, so maven choose the "correct" version.

Since this is already part of the MATSim's pom.xml: https://github.com/matsim-org/matsim-libs/blob/1a75b4d9995a32bc67ad671c333e156c7c214b57/pom.xml#L138-L142

I was able to avoid this issue, by declaring MATSim as "parent" project in the projects pom.xml (and remove there the lines mentioned above)

nkuehnel commented 1 month ago

Yea that's as a quick fix we also just set the version directly in our project. But I guess we should also at least try to fix it upstream :)

kt86 commented 1 month ago

The usual problem one has with the huge number of (sub) dependencies, and then it is a bit unclear, for which maven decides if not said explicitly.

nkuehnel commented 1 month ago

Yes it doesn't seem very transparent which version maven picks in the end...

kt86 commented 1 month ago

You are right, @nkuehnel . Have you looked, whether this is the only dependency using this old version? As far as I have seen there where still some other dependencies also referring to commons-io somewhere in their dependency graph.

As said before: For me personally, it seems the best way to use MATSim as parent because then one has all these settings in line with MATSim. (And at least dependabot will take care of a lot of them.) And in this case, here it is well-defined in matsim-libs' pom :)

nkuehnel commented 1 month ago

Not sure if the parent dependency thing would work for us as we deploy matsim as a library on github packages and then use it in our private project through maven

kt86 commented 1 month ago

Nico, I wrote you an Email to avoid spamming here :)

nkuehnel commented 1 month ago

There were quite a few api changes in version 30.x https://docs.geotools.org/stable/userguide/welcome/upgrade.html#update30

it's mostly updating import statements and the gt-opengis module is discontinued https://docs.geotools.org/stable/userguide/library/api/faq.html

nkuehnel commented 1 month ago

For any downstream projects, there is a transition script available to deal with the imports:


The package org.opengis has changed org.geotools.api.

To aid in this transition an [Apache Ant](https://ant.apache.org/) script is provided:

Download Ant script [remove-opengis.xml](https://docs.geotools.org/stable/userguide/_downloads/91205c1a7cfd0822e9c33ee986f77987/remove-opengis.xml) included with this documentation.

The default update target will update your Java code and META-INF/services:

Use the absolute path for project.dir:

ant -f remove-opengis.xml -Dproject.dir=(absolute path to your project directory)
Or copy remove-opengis.xml file into your project directory and run:

ant -f remove-opengis.xml
We have done our best to with this update script but it is not perfect - known issues:

Double check use of geometry Position and temporal Position which have the same classname in different packages

Clean up unused or duplicate imports

You may need to re-run code formatters
marecabo commented 1 month ago

Dependency issues can be real pain, thanks a lot for going through it till the end and the great write-up of what happened and how to fix it properly!

LGTM! 🚀

michalmac commented 1 month ago

My suggestion would be to specify dependency versions in dependencyManagement in the top pom.xml. Dependabot can read them and will try to keep them up to date. In addition, we require maven enforcer to check for the upper bound dependencies, which reduces chances of running into such issues: https://github.com/matsim-org/matsim-libs/blob/21a4ff0a9fe373969de91aaa28b510573e4aeae0/pom.xml#L388C25-L388C49

Otherwise, we would need to specify all dependencies for each contrib, because even if we specify all dependencies for one contrib, another contrib that depends on this one may run into similar issues.

nkuehnel commented 1 month ago

@michalmac geotools was fixed to 29.5 and commons-io was fixed to 2.16.1 in the dependency management section. They are conflicting in terms of the commons-io package... not sure if dependabout should have done something there?

michalmac commented 1 month ago

@michalmac geotools was fixed to 29.5 and commons-io was fixed to 2.16.1 in the dependency management section. They are conflicting in terms of the commons-io package... not sure if dependabout should have done something there?

Ah, I see. The problem was not the dependency version resolution (this is how I interpreted the title of this PR), but the fact that we have two dependencies A and B and each of them requires a different version of dependency C. The main problem is that the dependency C introduced some backward incompatibilities, so pointing to the latest version of C does not work.

I do not think there is a clean solution to resolve this issue. Here's one option: https://boyl.es/post/two-versions-same-library/. AFAIK Java platform module system offers versioning of modules, but I do not think there is a way to point to specific version of the module from another module. (I have never used JPMS in practice, therefore I am a bit guessing here).