StatCan / aaw-kubeflow-containers

Containers built to be used with Kubeflow for Data Science

Investigate curl and rstudio #583

Closed: bryanpaget closed this issue 9 months ago

bryanpaget commented 9 months ago

There appears to be a conflict between the installed versions of OpenSSL. From what I can tell, RStudio uses the OS copy of OpenSSL, while the R packages we installed with conda expect the OpenSSL that conda installed. The error below shows conda's libcurl.so.4 asking for the OPENSSL_3.2.0 symbol version, which the system libssl.so.3 does not provide.

  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/home/jovyan/R/r-packages-4.3.2/curl/libs/curl.so':
  /lib/x86_64-linux-gnu/libssl.so.3: version `OPENSSL_3.2.0' not found (required by /opt/conda/lib/libcurl.so.4)
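One quick way to see which copy of libssl could satisfy the loader is to look for the version string inside each file. The sketch below is only a rough heuristic: the helper name `lib_mentions_version` is made up for illustration, and the conda libssl path is an assumption (the error only confirms `/opt/conda/lib/libcurl.so.4`).

```shell
#!/bin/sh
# Crude check: does a shared library contain a given symbol-version string?
# Version names are stored as plain strings in the file, so a fixed-string
# grep finds them. A hit means the file defines OR requires that version;
# this check cannot tell which.
# $1: path to the library, $2: version tag to look for
lib_mentions_version() {
  grep -aqF -- "$2" "$1"
}

tag="OPENSSL_3.2.0"   # version named in the loader error above

# The first path comes from the error message. The second is an assumed
# location for conda's own libssl and may differ in your image.
for lib in /lib/x86_64-linux-gnu/libssl.so.3 /opt/conda/lib/libssl.so.3; do
  [ -e "$lib" ] || continue   # skip copies that are not present
  if lib_mentions_version "$lib" "$tag"; then
    echo "$lib mentions $tag"
  else
    echo "$lib does not mention $tag"
  fi
done
```

For an authoritative answer, `readelf -V` or `objdump -T` on the same files lists the version definitions and requirements separately.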

References:

Someone else reported the same issue about a year ago (as of this writing) and was unable to resolve it.

bryanpaget commented 9 months ago

What version of openssl does RStudio use?

(base) jovyan@bryan-rstudio-o-0:~$ ldd /usr/lib/rstudio-server/bin/rserver | grep ssl   
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f2943e2d000)
(base) jovyan@bryan-rstudio-o-0:~$

What version of openssl does the system have installed?

(base) jovyan@bryan-rstudio-o-0:~$  dpkg -l | grep openssl
ii    openssl    3.0.2-0ubuntu1.13    amd64    Secure Sockets Layer toolkit - cryptographic utility
(base) jovyan@bryan-rstudio-o-0:~$

What version of openssl does conda have?

(base) jovyan@bryan-rstudio-o-0:~$ conda list openssl
# packages in environment at /opt/conda:
#
# Name                    Version                   Build  Channel
openssl                   3.2.0                hd590300_1    conda-forge
pyopenssl                 23.3.0             pyhd8ed1ab_0    conda-forge
r-openssl                 2.1.1             r43hb353fa6_0    conda-forge
(base) jovyan@bryan-rstudio-o-0:~$ 
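The three checks above come down to one comparison: the system OpenSSL is older than the one conda ships. A minimal sketch of that comparison follows, with the version strings copied from the output above. The helper names `version_lt` and `upstream_version` are illustrative, and the comparison relies on GNU `sort -V`.

```shell
#!/bin/sh
# Succeed (exit 0) when version $1 is strictly older than version $2.
# Uses GNU sort's -V (version sort) option.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Drop the distro revision suffix, e.g. 3.0.2-0ubuntu1.13 -> 3.0.2.
# Epoch prefixes such as "1:" are not handled in this sketch.
upstream_version() {
  printf '%s\n' "${1%%-*}"
}

system_ssl="$(upstream_version "3.0.2-0ubuntu1.13")"   # from dpkg -l above
conda_ssl="3.2.0"                                      # from conda list above

if version_lt "$system_ssl" "$conda_ssl"; then
  echo "system OpenSSL $system_ssl is older than conda OpenSSL $conda_ssl"
fi
```

Any library built against conda's 3.2.0 can therefore ask for symbol versions that the 3.0.2 system copy does not provide, which matches the loader error.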

bryanpaget commented 9 months ago

Background

When I first encountered this issue with RStudio and curl, I worked around it by telling RStudio to use wget instead of curl for downloading packages. That workaround only covers RStudio's package installer. The issue has now resurfaced with another package, SASR, because it calls curl.
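The thread does not say how the wget workaround was applied. One common way, sketched here as an assumption, is to set R's `download.file.method` option in a profile file that R reads at startup. The helper name `set_wget_download_method` and the choice of profile file are illustrative.

```shell
#!/bin/sh
# Append the wget download setting to an R profile file, at most once.
# $1: path to the profile file (for example "$HOME/.Rprofile"; which file
#     is appropriate in this image is an assumption, not from the thread).
set_wget_download_method() {
  profile="$1"
  line='options(download.file.method = "wget")'
  # -x matches whole lines only, -F treats the pattern as a fixed string
  if ! grep -qxF -- "$line" "$profile" 2>/dev/null; then
    printf '%s\n' "$line" >> "$profile"
  fi
}
```

Calling `set_wget_download_method "$HOME/.Rprofile"` twice leaves a single copy of the line. As noted above, this only changes how packages are downloaded; it does not help packages that load curl themselves.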

Problems loading SASR

When you install SASR and try to load it, you'll get this error:

Error: package or namespace load failed for ‘SASR’:
 .onAttach failed in attachNamespace() for 'SASR', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/home/jovyan/R/r-packages-4.3.2/curl/libs/curl.so':
  /lib/x86_64-linux-gnu/libssl.so.3: version `OPENSSL_3.2.0' not found (required by /opt/conda/lib/libcurl.so.4)


SASR seems to load

Despite the error, SASR's functions still appear to be available after loading.

Use of curl in SASR

I isolated the code that uses curl. It is in the package's utils.R file, in the .onAttach hook, which calls check_package_version():

#' Check for updates when loading
#'
#' @keywords internal
.onAttach <- function(...) {
  check_package_version(
    description_url = "https://gitlab.k8s.cloud.statcan.ca/EDLP/r-packages/packages/SASR/-/raw/main/DESCRIPTION?ref_type=heads",
    readme_url = "https://gitlab.k8s.cloud.statcan.ca/EDLP/r-packages/packages/SASR/-/raw/main/README.md?ref_type=heads",
    name = "SASR"
  )
}

Testing removing curl calls

I forked the package and removed the calls to curl. With that change, SASR loads without emitting any errors.

Recommendations / Conclusion

Because the underlying bug is hard to fix, I recommend guarding the call to check_package_version() so that a failure to load curl does not surface as an error when SASR is attached. The guard can stay in place until the RStudio and curl compatibility issue is resolved.

bryanpaget commented 9 months ago

Thanks @vexingly for finding the following solution: