librariesio / libraries.io

:books: The Open Source Discovery Service
https://libraries.io
GNU Affero General Public License v3.0

MavenCentral: if latest_version returns ${}, scrape the HTML for the package to find the latest_version instead. #3363

Closed tiegz closed 2 months ago

tiegz commented 2 months ago

When we update Maven packages, the PackageManager::Maven.project() method looks up the package's latest version so that it can fetch the correct POM.

When fetching that latest version, a Maven Central package's maven-metadata.xml sometimes contains an unresolved interpolation string in place of the version, e.g. ${revision} in https://repo1.maven.org/maven2/io/github/caffetteria/data-service-opencmis/maven-metadata.xml .
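As a rough illustration of the detection step, here is a minimal sketch that is not the code in this PR. The helper names `unresolved_version?` and `latest_from_metadata` are hypothetical, and the regex-based extraction assumes the metadata exposes the version in a `<latest>` or `<release>` element.

```ruby
# Hypothetical helpers for illustration only; not the Libraries.io implementation.

# Matches an unresolved Maven property such as ${revision}, or an empty ${}.
PLACEHOLDER = /\$\{[^}]*\}/

# True when the version string is missing or still contains a ${...} placeholder.
def unresolved_version?(version)
  version.nil? || version.strip.empty? || version.match?(PLACEHOLDER)
end

# Pulls the text of <latest>, falling back to <release>, out of the raw
# maven-metadata.xml text. Returns nil when neither element is present.
def latest_from_metadata(xml)
  xml[%r{<latest>\s*([^<]*?)\s*</latest>}, 1] ||
    xml[%r{<release>\s*([^<]*?)\s*</release>}, 1]
end
```

A caller could check `unresolved_version?(latest_from_metadata(xml))` and only take a fallback path when it returns true.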

Libraries has no general way of resolving this placeholder to the value of the revision variable. When we run into this case, we can instead scrape the HTML directory listing page for the package and use the highest version folder as latest_version.
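The fallback could look roughly like the sketch below. It is not the code in this PR: `version_folders` and `highest_version` are hypothetical names, and it assumes the listing page links each version folder as `href="<version>/"`. It also uses `Gem::Version` for ordering, which only approximates Maven's own version comparison rules, so folder names that RubyGems cannot parse are skipped.

```ruby
require "rubygems" # provides Gem::Version; normally loaded by default

# Hypothetical helper: collect folder names from a directory listing page.
# Only relative links that end in "/" are kept; the parent link ("../") is dropped.
def version_folders(html)
  html.scan(/href="([^"\/]+)\/"/).flatten.reject { |name| name.start_with?(".") }
end

# Hypothetical helper: pick the highest folder name that parses as a version.
# Returns nil when the page lists no usable version folders.
def highest_version(html)
  version_folders(html)
    .select { |name| Gem::Version.correct?(name) }
    .max_by { |name| Gem::Version.new(name) }
end
```

The HTML itself could be fetched with something like `Net::HTTP.get(URI(listing_url))` from Ruby's `net/http`, where `listing_url` is the package's folder under https://repo1.maven.org/maven2/ .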