wch / webshot

Take screenshots of web pages from R
http://wch.github.io/webshot/
228 stars 40 forks source link

Set OPENSSL_CONF to use the config from OpenSSL 1.1.1 if the OpenSSL … #93

Open gterziysky opened 4 years ago

gterziysky commented 4 years ago

…version is higher than 1.1.1 to fix PhantomJS

wch commented 4 years ago

This doesn't compile on my Mac. Is it necessary to include compiled code?

clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I"/Users/winston/R/3.6/Rcpp/include" -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include  -fPIC  -Wall -g -O2  -g -O0 -Wall -std=gnu++0x -c get_ssl_version.cpp -o get_ssl_version.o
get_ssl_version.cpp:2:10: fatal error: 'openssl/opensslv.h' file not found
#include <openssl/opensslv.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [get_ssl_version.o] Error 1
make: *** Waiting for unfinished jobs....
ERROR: compilation failed for package ‘webshot’
* removing ‘/Users/winston/R/3.6/webshot’
* restoring previous ‘/Users/winston/R/3.6/webshot’

Also, regarding your comment in https://github.com/wch/webshot/issues/90#issuecomment-594640607, I don't think it makes sense to run PhantomJS with the new config on all platforms, if it's not needed. That just increases the risk of problems.

Just a note: I'm occupied with other projects these days and don't have much time to test this stuff out on all relevant platforms, so it will be hard for me to evaluate big changes like this.

gterziysky commented 4 years ago

The error arises because the build needs openssl headers to be installed on the machine and visible to the linker. (Could be fixed by modifying the appveyor.yml to pre-install openssl for Mac OS and Windows.)

The change which breaks PhantomJS is the removal of RANDFILE from the config. It is easily visible in this commit.

The workaround is:

  1. add the OpenSSL 1.1.1 config to /inst, i.e. before RANDFILE was removed
  2. in phantom_run(), check if the current OPENSSL_VERSION_NUMBER is higher than the version number for OpenSSL 1.1.1 and only in this case use the config in 1.

Alternatively, I could avoid using compiled code if I used openssl::openssl_config()$version to get the OPENSSL_VERSION_TEXT. However, version comparison then becomes harder as the variable is a string (1.1.1-preX vs 1.1.1d vs 1.1.1). OPENSSL_VERSION_NUMBER on the other hand is a strictly increasing integer and easier to compare.

wch commented 4 years ago

I realize that one can install the OpenSSL headers on their own system to get this to compile. The issue is that most people don't have them installed, and so this won't build on their systems.

Most people on Mac and Windows install binary packages from CRAN, so they wouldn't be directly affected by this. However, I don't believe that CRAN's Windows and Mac build systems have those headers installed, because I have experience with other packages that use OpenSSL and have to download the OpenSSL sources as part of the install process. (It's possible I'm wrong about the CRAN build machines, though.)

gterziysky commented 4 years ago

I do agree that adding external compiled dependencies introduces more complexity, hence my proposal from above:

Alternatively, I could avoid using compiled code if I used openssl::openssl_config()$version to get the OPENSSL_VERSION_TEXT.

If you are fine with that, I can close this PR and open up a new one which does not use compiled code: https://github.com/gterziysky/webshot/commit/9317e77bdef7b96d2c7a9dfd2e378a2246920e23.

As for the CRAN build machines I am not sure either. However, openssl also depends on OpenSSL (see openssl/src/info.c) and it is on CRAN already.

wch commented 4 years ago

Note that the openssl package:

On Unix-like platforms, the configure script for openssl takes the src/Makevars.in file, and generates src/Makevars. On Windows, the src/Makevars.win file is used.

We basically used the same strategy for the websocket package.

One thing I wonder about using openssl::openssl_config() is this: can we be certain that it's the same copy of OpenSSL that's used by PhantomJS? In the case of Windows, I'm pretty sure that it is not. On the other platforms, it might be, but I'm not sure. I'm also a bit wary of requiring the openssl package, because it brings in the same system dependency issues discussed above, in that it requires the user to install a system-level package just to install webshot, and this might not be possible for all users. The one plus of using the openssl package is that the difficult stuff with configure scripts and Makevars.in/Makevars.win has already been solved in that package.

wch commented 4 years ago

Just to clarify things, my concern is this: most people currently can just do install.packages('webshot') and have it work. I'm reluctant to take a change that will make this no longer possible. Or, when they run update.packages(), they find that they can't upgrade webshot. If fixing the issue for Debian Buster results in breakage for all these users, that's not great.

Can you just check the version by running openssl version? Bear in mind that we need to be sure that the copy of openssl that is referenced with the openssl command is the same one that PhantomJS is trying to use.

gterziysky commented 4 years ago

This is for anyone who encounters the issue and also to say why the openssl.cnf should be edited instead of introducing a workaround in your package. @wch , feel free to close this PR.

Exact cause of the issue

It appears the reason is in the following patch to openssl on Debian: https://salsa.debian.org/debian/openssl/-/commit/67b9ae2850a4207addf920250794aaa20011ab97, which enforces TLSv1.2 as the minimum protocol.

PhantomJS is compiled statically against OpenSSL 1.0.2e (more details on this below), but the OpenSSL lib still tries to load the openssl.cnf from its default location (for windows and other OSes) at runtime (which is good because it allows the users to choose their own settings). Unfortunately, the OpenSSL built with PhantomJS is too old and cannot work with the new config.

OpenSSL version used by PhantomJS 2.1.1

PhantomJS 2.1.1 actually compiles against a static version of the QT libs which in turn compile against a static version of the OpenSSL libs (https://github.com/ariya/phantomjs/blob/2.1.1/build.py#L161).

For the Windows build, OpenSSL version 1.0.2e is used (see here https://github.com/vitallium/phantomjs-3rdparty-win/blob/19051aa97cecdcd3ef8c8862e36a3cb4cd3471fc/openssl/include/openssl/opensslv.h#L33). For Mac OS and other Unix based OSes, it depends on the OpenSSL version which was installed on the machine used to build PhantomJS. My guess is that those builds should also use OpenSSL 1.0.2e, but there is no easy way to tell as (AFAICT) PhantomJS does not expose an API to the OpenSSL version it uses.

It boils down to the OpenSSL config

Just to clarify things, my concern is this: most people currently can just do install.packages('webshot') and have it work.

Indeed, the openssl.cnf (supplied with OpenSSL 1.1.1d which is the latest stable version at the time of writing) I have on my Ubuntu machine does not enforce TLSv1.2 as a minimum and thus PhantomJS runs just fine. That said, I can easily reproduce the error by simply applying the changes from the patch.

As for the PR

Bear in mind that we need to be sure that the copy of openssl that is referenced with the openssl command is the same one that PhantomJS is trying to use.

To clarify, given that PhantomJS is not actively maintained anymore and it uses OpenSSL 1.0.2e, it is unrealistic to expect that the user's system openssl will also be at the same version.

I think that it makes more sense to change the openssl.cnf locally than to provide any workarounds in your package. It is likely that a future version of OpenSSL will alter the default config beyond what PhantomJS 2.1.1 (the last release before project maintenance ended) can work with, but it should be up to the end users to decide what they should do in that case:

That said, you could add a warning just before calling the phantom process to let the user know:

warnAboutNewerSSLConf <- function() {
  ssl_executable <- Sys.which("openssl")
  if (ssl_executable == "") {
    # check SSL default installation dirs to determine the path to the ssl binary/exe:
    # Mac OS: see phantomjs build script: https://github.com/ariya/phantomjs/blob/2.1.1/build.py#L216
    # Win: https://github.com/openssl/openssl/blob/c08dea30d4d127412097b39d9974ba6090041a7c/NOTES.WIN
    # Unix: https://github.com/openssl/openssl/blob/c08dea30d4d127412097b39d9974ba6090041a7c/NOTES.UNIX
    ssl_executable <- "..."
  }

  ssl_ver <- system(sprintf("%s version -v", ssl_executable), intern = T)
  if (grepl(pattern = "^OpenSSL", x = ssl_ver, perl = T)) {
    match <- regexpr(pattern = "\\d(\\.\\d){2}", text = ssl_ver, perl = T)
    ssl_ver_num <- substring(ssl_ver, match, match + attr(match, "match.length") - 1)
    if (compareVersion(ssl_ver_num, "1.0.2") == 1L) warning(sprintf("Your current OpenSSL version (%s) is higher than the one used to build PhantomJS.", ssl_ver))
  }
}