CompEvol / beast2

Bayesian Evolutionary Analysis by Sampling Trees
www.beast2.org
GNU Lesser General Public License v2.1

Package management broken #948

Open rbouckaert opened 3 years ago

rbouckaert commented 3 years ago

packagemanager -add ORC gives a javax.net.ssl.SSLHandshakeException.

The package manager in BEAUti still lists all packages, but when you try to install one, a message appears saying:

"Install failed because: java.security.cert.CertificateException: No subject alternative DNS name matching github-releases.githubusercontent.com found"

rbouckaert commented 3 years ago

It looks like github releases are redirected in such a way that some security condition is not met when the package manager attempts to download a package. Moving the package files elsewhere (e.g. as part of a github repository instead of the release area) allows downloading as before. So, a workaround for existing releases is to create a package repository containing the releases and have CBAN point to those zip files instead of to the github release areas. I created a repo for this purpose here: https://github.com/CompEvol/packages

walterxie commented 3 years ago

I suggest considering Maven, so you can publish packages to the Maven Central Repository. But it will be a lot of work to migrate the package manager to Maven.

https://docs.github.com/en/actions/guides/publishing-java-packages-with-maven

https://stackoverflow.com/questions/15530886/mvn-install-or-mvn-package

tgvaughan commented 3 years ago

How widespread is this? I'm not seeing the error (openjdk 15, adoptopenjdk build on macos mojave). Are we sure this isn't just a missing ca cert or something?

rbouckaert commented 3 years ago

@tgvaughan It stopped working under OS X Sierra (admittedly quite old) with Java 1.8.0_37, and also on an up-to-date Ubuntu Linux 18.04 with openjdk 1.8.0_275, where previously (weeks ago?) there was no problem installing packages via the package manager.

By the way, I copied all the latest packages hosted on github release areas to a separate github repo (CompEvol/packages) and pointed CBAN to there, so it should not be a problem any more. Did you test this on a package that still is in a github release area?

It looks like a security issue caused by some change in how github hosts releases. Running curl on the original URLs in CBAN lets you trace where each one is redirected, but I am not sure where things go wrong just yet.

tgvaughan commented 3 years ago

@rbouckaert argh, stupid me. Yes, you're right - I tried using a package belonging to a 3rd party repo and it failed as you described. Sorry for the noise.

tgvaughan commented 3 years ago

Comparing the verbose output of curl (with --location --verbose) with the output generated by PackageManager with -Djavax.net.debug=all, it seems as though the server certificate the java methods are retrieving following the redirect is different from the one that curl is retrieving. In particular, the one curl gets has an expiry date of April 14 2022 and contains the SAN *.githubusercontent.com, while the one java retrieves has an expiry of Nov 10 2021 and does not contain the required SAN. Very strange!

Disabling hostname verification entirely by adding

HttpsURLConnection.setDefaultHostnameVerifier((hostname, session) -> true);

to PackageManager.installPackages() (or, equivalently, doing the same thing on the HttpsURLConnection instance) causes the problems to go away, of course, but I doubt this is something we want to do.
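
For readers unfamiliar with the SAN check behind the error message, here is a minimal sketch of the comparison involved. It is not the JDK's verifier and not PackageManager code: the class name SanMatch and both helper methods are invented for illustration, and the wildcard rule is a simplified reading of the usual convention (a leading "*." stands for exactly one label). The dnsNames helper uses the standard X509Certificate.getSubjectAlternativeNames() call, where entry type 2 is a DNS name.

```java
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class SanMatch {

    /** Collects the DNS-type subject alternative names (type 2) of a certificate. */
    static List<String> dnsNames(X509Certificate cert) throws CertificateParsingException {
        List<String> names = new ArrayList<>();
        Collection<List<?>> sans = cert.getSubjectAlternativeNames();
        if (sans == null) {
            return names; // certificate carries no SAN extension
        }
        for (List<?> san : sans) {
            if (Integer.valueOf(2).equals(san.get(0))) {
                names.add((String) san.get(1));
            }
        }
        return names;
    }

    /** Simplified match: exact name, or "*." standing for exactly one leftmost label. */
    static boolean matches(String host, String pattern) {
        String h = host.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            return h.equals(p);
        }
        int dot = h.indexOf('.');
        // Compare everything after the first label, including the dot.
        return dot > 0 && h.substring(dot).equals(p.substring(1));
    }

    public static void main(String[] args) {
        String host = "github-releases.githubusercontent.com";
        System.out.println(matches(host, "*.githubusercontent.com"));
        System.out.println(matches(host, "github.com"));
    }
}
```

Under this rule the certificate curl received (with SAN *.githubusercontent.com) would cover the redirect host, while a certificate lacking that SAN would not, which is consistent with the exception quoted at the top of the thread.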

tgvaughan commented 3 years ago

@rbouckaert the problem is line 161 of PackageManager.java in the getRepositoryURLs() method:

        // Java 7 introduced SNI support which is enabled by default.
        // http://stackoverflow.com/questions/7615645/ssl-handshake-alert-unrecognized-name-error-since-upgrade-to-java-1-7-0
        System.setProperty("jsse.enableSNIExtension", "false");

It must have been the case at some point that the server hosting the package list was misconfigured, or that Java's TLS implementation was somehow broken, because this shouldn't be necessary. Further, SNI is sometimes actually required in order to properly verify the hostname, which is why things are breaking for us now. (Github is providing the wrong certificate because we're not using SNI to indicate which hostname we're attempting to connect to.)

Removing this line (or setting the property to "true") fixes the problem.
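
As a rough illustration of what the property controls, the sketch below reports the current value of jsse.enableSNIExtension and builds SSLParameters that name the server explicitly. This is not the fix that went into PackageManager (which, per the comment above, is simply to stop setting the property to "false"). The class name SniCheck and its two methods are made up for this example, and the helper only reads the property's current value; it cannot tell whether the TLS implementation has already picked up an earlier value.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLParameters;

// Hypothetical helper class, for illustration only.
class SniCheck {

    /** Reports the property's current value; an unset property counts as enabled. */
    static boolean sniEnabled() {
        String value = System.getProperty("jsse.enableSNIExtension", "true");
        return !"false".equalsIgnoreCase(value);
    }

    /** Builds TLS parameters that carry the given host name in the SNI extension. */
    static SSLParameters withServerName(String host) {
        SSLParameters params = new SSLParameters();
        params.setServerNames(
                Collections.<SNIServerName>singletonList(new SNIHostName(host)));
        return params;
    }

    public static void main(String[] args) {
        System.out.println("SNI property enabled: " + sniEnabled());
        SSLParameters params = withServerName("github-releases.githubusercontent.com");
        System.out.println("Server names to send: " + params.getServerNames());
    }
}
```

The resulting parameters could be applied to an SSLSocket or SSLEngine with setSSLParameters, although with the property left at its default a plain HttpsURLConnection should normally send the server name without any such help.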

rbouckaert commented 3 years ago

@tgvaughan Thanks for diving into this, Tim! Unfortunately, it looks like regardless of what we do, we are still at the mercy of the platform that hosts releases. One option may be a backup server that makes daily copies of the latest releases, which the package manager falls back to if the default location fails. Anyway, I am prepping the next release and this fix should definitely go in there. But v2.6.3 and earlier releases won't be able to access packages hosted in github release areas.

tgvaughan commented 3 years ago

Can we perhaps again look at directly hosting the packages on our own server, similar to CRAN? This would (a) give us control over these things in future, (b) allow us to ensure old versions of packages remain available, and (c) make backing up everything a cinch. I honestly don't think it'd be too hard to do, and might save us a lot of work going forward.

rbouckaert commented 3 years ago

I like that idea, but would like to see some redundancy built in: if everything is hosted on a single server and that server goes down, having a second place to go to might save the day (the same argument applies to the package repository files, though so far they have proven to be hosted robustly). Up to now, github hosting of releases was no problem, but now that it fails, having a second URL to go to would have been good -- perhaps as a second URL in the package repo files. One benefit of github is that it is fast world-wide, while my experience from Auckland accessing sites that are privately hosted in, for example, Europe is not that good. Anyway, do you have concrete suggestions for hosting such a site? I can inquire with UoA about what they can offer.
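
The "second URL" idea could look roughly like the sketch below. It is only an assumption about how such a fallback might be structured, not existing BEAST 2 code: the class MirrorFallback, the Opener interface and the example.org URLs are all invented for illustration, and the primary failure in main is simulated so the example runs offline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical helper class, for illustration only.
class MirrorFallback {

    /** Hypothetical abstraction over "open this URL", so the logic can be tried offline. */
    interface Opener {
        InputStream open(String url) throws IOException;
    }

    /** Tries each location in order and returns the first stream that opens. */
    static InputStream openFirstAvailable(List<String> urls, Opener opener) throws IOException {
        IOException last = null;
        for (String url : urls) {
            try {
                return opener.open(url);
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next location
            }
        }
        throw new IOException("All " + urls.size() + " download locations failed", last);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URLs; the primary is made to fail the way the release area did.
        List<String> urls = Arrays.asList(
                "https://example.org/primary/MyPackage.zip",
                "https://example.org/mirror/MyPackage.zip");
        Opener simulated = url -> {
            if (url.contains("/primary/")) {
                throw new SSLHandshakeException("simulated handshake failure");
            }
            return new ByteArrayInputStream(new byte[] {42});
        };
        try (InputStream in = openFirstAvailable(urls, simulated)) {
            System.out.println("Read from fallback: " + in.read());
        }
    }
}
```

In real use the opener would be something like url -> new URL(url).openStream(). Because SSLHandshakeException is a subclass of IOException, a handshake failure on the first location would fall through to the second in the same way as any other I/O error.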

tgvaughan commented 3 years ago

No concrete suggestions, but well-known VPS providers (eg digital ocean) have extremely good uptimes. And it's possible we could speed up retrieval of packages.xml by sticking cloudflare in front and asking it to cache this file specifically. (The downside of this is that updated packages won't be seen until the cache is either automatically or manually invalidated.)

In terms of redundancy, we could have a weekly backup of the package list and binaries on beast2.org.

tgvaughan commented 3 years ago

It seems lots of people are still running up against this issue, even with updated beast versions. I investigated one such case in the lab here, and it turns out that if your original BEAST installation was <=v2.6.3, you'll still encounter the issue even if the beast core package is current. This leads to confusing situations, where beast -version reports v2.6.6 but downloading packages from github still doesn't work.

TL;DR 2.6.4 doesn't solve this issue unless the BEAST 2 application is downloaded and installed from scratch.

rbouckaert commented 3 years ago

This should only be a problem with packages that are hosted on github in the release area. Since most packages should be moved to a more robust place I was hoping this would not be a problem any more, even for <=v2.6.3. Can you remember which package was causing this problem? If so, we can copy that package to a more robust place as well.

tgvaughan commented 3 years ago

I use third-party repositories a lot to distribute unpublished packages which are thus not allowed in the main repository. (This is an edge case, sure.) And sure, moving these out of github releases is a workaround, but I think it's important to remember that the problem here is not with any lack of robustness on github's side, but with beast disabling SNI. We're just lucky that, for the moment at least, checked-in binaries aren't served in a way that requires SNI to identify the name of the server. (Which can happen whenever one physical server is responsible for multiple virtual servers.)

I just wanted to point out to anybody coming across this problem that the commits added to address this issue don't address the issue at all unless beast is completely reinstalled.