UCSF-HPC / pilot-testing

NOTES: Missing software and libraries #5

Open HenrikBengtsson opened 7 years ago

HenrikBengtsson commented 7 years ago

I'll keep this issue open and use it to add core software and libraries that I find to be missing. Feel free to mention more in comments and I'll add them to the list here in this top post.

Compute hosts:

Critical / show stoppers

Non-critical

Build hosts:

Critical / show stoppers

Non-critical

Reference:

I was told about http://statmath.wu.ac.at/AASC/debian/binary-amd64/Packages, which shows a list of Debian libraries used by the CRAN maintainers (CRAN is the main R package repository) and also https://github.com/r-hub/sysreqsdb.

HenrikBengtsson commented 7 years ago

jlbl wrote on Feb 14:

I assume the pdf stuff is only needed on the interactive node, not the compute nodes, right?

Although I discovered this by running R CMD check, pdflatex and all of LaTeX can certainly be needed by other software for generating PDF reports on the fly. So, I think they can be useful also on the nodes.

For now, although it's useful if you add them, you can also just keep an eye out here, and I can reach out to you whenever I / we hit real show stoppers.

HenrikBengtsson commented 7 years ago

@jlbl, could you please install libxml2-devel? It's needed in order to install R package xml2.

HenrikBengtsson commented 7 years ago

@jlbl, another one: libcurl-devel. For the record, it's needed to install R package RCurl - if it's missing, one gets a configure error:

checking for curl-config... no
Cannot find curl-config
ERROR: configuration failed for package 'RCurl'

jlbl commented 7 years ago

On Tue, 21 Mar 2017 at 6:22pm, Henrik Bengtsson wrote

@jlbl, another one: libcurl-devel. For the record, it's needed to install R package RCurl - if it's missing, one gets a configure error:


checking for curl-config... no
Cannot find curl-config
ERROR: configuration failed for package 'RCurl'

Done on login node.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF
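The curl-config failure above suggests a quick pre-check before installing R packages from source: see whether the helper tools that configure scripts look for are on the PATH. The following is a minimal sketch, not part of the original exchange; the function name `missing_tools` is hypothetical.

```shell
# Print each named helper tool that is NOT found on PATH, one per line.
# 'missing_tools' is a hypothetical helper name used for illustration.
missing_tools() {
  local tool
  for tool in "$@"; do
    # 'command -v' succeeds only if the shell can resolve the name
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, `missing_tools curl-config xml2-config` prints whichever of the two helpers is absent. On CentOS these two normally ship with libcurl-devel and libxml2-devel, so an empty result hints that both devel packages are in place.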

jlbl commented 7 years ago

On Tue, 21 Mar 2017 at 4:39pm, Henrik Bengtsson wrote

@jlbl, could you please install libxml2-devel? It's needed in order to install R package xml2.

Done on login node.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

jlbl commented 7 years ago

On Tue, 21 Mar 2017 at 4:37pm, Henrik Bengtsson wrote

I'll keep this issue open and use it to add core software and libraries that I find to be missing. Feel free to mention more in comments and I'll add them to the list here in this top post.

  • [ ] qpdf (needed by R CMD check)
  • [ ] pdflatex (needed by R CMD check)

Done on all the hosts.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

Thxs. @jlbl, another one for R:

> install.packages("openssl")
[...]
Configuration failed because openssl was not found. Try installing:
[...]
    openssl-devel (package on e.g. Fedora, CentOS and RHEL)

HenrikBengtsson commented 7 years ago

There'll be quite a few more, but I'll post them as I run into them:

> install.packages("magick")
Configuration failed because Magick++ was not found. Try installing:
[...]
 * rpm: ImageMagick-c++-devel (Fedora, CentOS, RHEL)

jlbl commented 7 years ago

On Wed, 22 Mar 2017 at 12:09pm, Henrik Bengtsson wrote

Thxs. @jlbl, another one for R:


> install.packages("openssl")
[...]
Configuration failed because openssl was not found. Try installing:
[...]
   openssl-devel (package on e.g. Fedora, CentOS and RHEL)

Done on the login host.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

jlbl commented 7 years ago

On Wed, 22 Mar 2017 at 12:11pm, Henrik Bengtsson wrote

There'll be quite a few more, but I'll post them as I run into them:


> install.packages("magick")
Configuration failed because Magick++ was not found. Try installing:
[...]
* rpm: ImageMagick-c++-devel (Fedora, CentOS, RHEL)

Done on the login host.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

Since you're installing *-devel libraries on build ("login") hosts only, does that mean you're also making sure the corresponding run-time libraries are available on the compute hosts?

It looks like Rscript -e "library(openssl)" works on compute hosts, but Rscript -e "library(magick)" gives "Error in library(magick) : there is no package called 'magick'. Execution halted" on compute host, but works on build host.

The above got me to try to inspect what libraries are available on the compute hosts, by submitting:

$ qsub -j yes -b yes ldconfig -p

but it comes back with empty output:

$ cat ldconfig.o25290
$ ll ldconfig.o25290
-rw-r--r--. 1 hb lsd 0 Mar 22 12:54 ldconfig.o25290

Any ideas?
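Once the job output is no longer empty, the two `ldconfig -p` listings can be compared to see which shared libraries exist on the build host but not on a compute host. This is a sketch under the assumption that both listings have been saved to files; the function name and file names are hypothetical.

```shell
# Print sonames listed in the first saved 'ldconfig -p' output but not in
# the second. Library lines look like:
#   "<tab>libz.so.1 (libc6,x86-64) => /lib64/libz.so.1"
# 'libs_only_in_first' is a hypothetical helper name.
libs_only_in_first() {
  # awk reads the second file first and records its sonames, then prints
  # sonames from the first file that were not recorded.
  awk '/=>/ {
         if (FILENAME == ARGV[1]) seen[$1] = 1
         else if (!($1 in seen)) print $1
       }' "$2" "$1" | sort -u
}
```

Usage would be along the lines of `ldconfig -p > build-host.txt` on the build host, followed by `libs_only_in_first build-host.txt compute-host.txt`, where `compute-host.txt` is the output file of the submitted job.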

HenrikBengtsson commented 7 years ago

Another one:

> install.packages("git2r")
[...]
   To build with SSH support, please install:
[...]
     libssh2-devel (package on e.g. Fedora, CentOS and RHEL)

jlbl commented 7 years ago

On Wed, 22 Mar 2017 at 1:06pm, Henrik Bengtsson wrote

Another one:


> install.packages("git2r")
[...]
  To build with SSH support, please install:
[...]
    libssh2-devel (package on e.g. Fedora, CentOS and RHEL)

Do you have a suite of packages you'd like to install? If so, could you try them all and submit all the package requests at once? The drip-drip is a bit torturous...

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

FYI, there are 12,000+ R packages on CRAN and Bioconductor these days. There is a set of core libraries that covers most use cases, but I've always discovered them as I needed them.

I'm now trying to identify a "set of libraries that covers 95% of all R packages" so that you can install them all in one go. I've already got some pointers from different R folks (unfortunately mostly for Debian, so I need to identify the corresponding library names for CentOS).

PS. I've got about ~2,000 R packages in my current setup. I don't think you wanna install all those, eh?
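For translating a Debian-oriented list into CentOS names, a first pass could simply swap the suffix. This is only a naive heuristic sketched for illustration (the function name is hypothetical), and many packages are named differently across the two distributions, so each guess still needs to be checked against the yum repositories.

```shell
# Guess a CentOS/RHEL package name from a Debian one by swapping a trailing
# '-dev' for '-devel'. Heuristic only: names often differ between
# distributions, so verify every result before relying on it.
# 'deb_to_rpm_guess' is a hypothetical helper name.
deb_to_rpm_guess() {
  printf '%s\n' "$1" | sed 's/-dev$/-devel/'
}
```

For instance, `deb_to_rpm_guess libxml2-dev` prints `libxml2-devel`, which matches the package requested earlier in this thread. Names without a `-dev` suffix pass through unchanged.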

jlbl commented 7 years ago

On Wed, 22 Mar 2017 at 1:00pm, Henrik Bengtsson wrote

Since you're installing *-devel libraries on build ("login") hosts only, does that mean you're also making sure the corresponding run-time libraries are available on the compute hosts?

It looks like Rscript -e "library(openssl)" works on compute hosts, but Rscript -e "library(magick)" gives "Error in library(magick) : there is no package called 'magick'. Execution halted" on compute host, but works on build host.

Fixed.

The above got me to try to inspect what libraries are available on the compute hosts, by submitting:

$ qsub -j yes -b yes ldconfig -p

but it comes back with empty output:

$ cat ldconfig.o25290
$ ll ldconfig.o25290
-rw-r--r--. 1 hb lsd 0 Mar 22 12:54 ldconfig.o25290

Any ideas?

Well, this is odd. I've been trying some simple scripts, and some commands (e.g. hostname) just aren't generating output. Now why is that...

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

jlbl commented 7 years ago

On Thu, 23 Mar 2017 at 3:37pm, Henrik Bengtsson wrote

FYI, there are 12,000+ R packages on CRAN and Bioconductor these days. There is a set of core libraries that covers most use cases, but I've always discovered them as I needed them.

I'm now trying to identify a "set of libraries that covers 95% of all R packages" so that you can install them all in one go. I've already got some pointers from different R folks (unfortunately mostly for Debian, so I need to identify the corresponding library names for CentOS).

PS. I've got about ~2,000 R packages in my current setup. I don't think you wanna install all those, eh?

Sorry if I wasn't clear. I'm not trying to get things set up so that anyone can install almost anything from CRAN and/or Bioconductor without problems (and I don't think that should be our goal). I thought that you had a stable of packages you were trying to get up and running for your needs, and that you could try installing them and seeing which devel packages were missing. Given your installed base (ouch), how about picking a subset you need for any immediate testing and letting me know if anything is missing for them?

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

jlbl commented 7 years ago

On Thu, 23 Mar 2017 at 4:16pm, Joshua Baker-LePain wrote

On Wed, 22 Mar 2017 at 1:00pm, Henrik Bengtsson wrote

The above got me to try to inspect what libraries are available on the compute hosts, by submitting:

 $ qsub -j yes -b yes ldconfig -p

but it comes back with empty output:

 $ cat ldconfig.o25290
 $ ll ldconfig.o25290
 -rw-r--r--. 1 hb lsd 0 Mar 22 12:54 ldconfig.o25290

Any ideas?

Well, this is odd. I've been trying some simple scripts, and some commands (e.g. hostname) just aren't generating output. Now why is that...

It's not clear why this started happening, but it's now fixed. For now. I think.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

Here's a first list of installs for CentOS that are needed, which I identified from a fresh Docker CentOS 7 image.
Some of them are probably already installed on wynton, but yum install -y should just skip those if so. So far I've only worried about build-time dependencies, i.e. I'm not yet worrying about run-time deps needed on the compute nodes:

bzip2-devel
cairo pango-devel
curl-devel
gcc-c++
gcc-gfortran
java-1.8.0-openjdk-*
libicu-devel
libjpeg-devel
libpng-devel
libtiff-devel
libX11-devel libXt-devel
pcre-devel
readline-devel
texinfo
texlive-latex-bin-bin
valgrind
xz-devel
zlib-devel
epel-release
pandoc pandoc-citeproc
gdal-devel
ImageMagick-c++-devel
libGLU-devel
libssh2-devel
libwebp-devel
libxml2-devel
mariadb-devel
openmpi-devel
openssl-devel
protobuf-devel
v8-314-devel

HenrikBengtsson commented 7 years ago

@jlbl, did you have time to install the new set of build-time libraries I listed in https://github.com/UCSF-HPC/pilot-testing/issues/5#issuecomment-292670935?

jlbl commented 7 years ago

On Fri, 7 Apr 2017 at 3:56pm, Henrik Bengtsson wrote

Here's a first list of installs for CentOS that are needed, which I identified from a fresh Docker CentOS 7 image. Some of them are probably already installed on wynton, but yum install -y should just skip those if so. So far I've only worried about build-time dependencies, i.e. I'm not yet worrying about run-time deps needed on the compute nodes:

I installed (or made sure they were already installed) all of the libraries on the login host, and made sure the relevant non-devel packages were installed on the nodes. Let me know what doesn't work.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF
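To find out "what doesn't work" for a list like the one above without a trial install, each package name can be queried against the RPM database. A minimal sketch, assuming an RPM-based host. The function name and the `PKG_QUERY` override are hypothetical conveniences, not existing tools.

```shell
# Print each package named in the arguments that is reported as not installed.
# By default this runs 'rpm -q <name>', which exits non-zero for packages
# that are not installed. PKG_QUERY is a hypothetical override hook so the
# function can be tried on systems without rpm.
not_installed() {
  local pkg
  for pkg in "$@"; do
    # Intentionally unquoted so that the default 'rpm -q' splits into two words
    ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
  done
}
```

For example, `not_installed bzip2-devel gdal-devel libwebp-devel` lists those of the three that are still missing. Wildcard entries such as java-1.8.0-openjdk-* would need to be expanded to concrete package names first.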

HenrikBengtsson commented 7 years ago

@jlbl, here's another set of libraries missing, which are needed in order to install essential R packages from source:

gdal-devel
proj-devel
proj-epsg
proj-nad
gsl-devel
netcdf-devel
libsndfile-devel
fftw3-devel
gtk2-devel
gmp-devel

There is also:

jags4-devel

which I could only locate for Centos 7 on http://download.opensuse.org/repositories/home:/cornell_vrdc/ (source http://mcmc-jags.sourceforge.net/), i.e.

yum install -y http://download.opensuse.org/repositories/home:/cornell_vrdc/CentOS_7/x86_64/jags4-devel-4.1.0-65.2.x86_64.rpm

jlbl commented 7 years ago

On Sat, 22 Apr 2017 at 8:06pm, Henrik Bengtsson wrote

@jlbl, here's another set of libraries missing, which are needed in order to install essential R packages from source:

gdal-devel
proj-devel
proj-epsg
proj-nad
gsl-devel
netcdf-devel
libsndfile-devel
fftw3-devel
gtk2-devel
gmp-devel

Done, plus the related non -devel packages on the compute nodes.

There is also:

jags4-devel

which I could only locate for Centos 7 on http://download.opensuse.org/repositories/home:/cornell_vrdc/ (source http://mcmc-jags.sourceforge.net/), i.e.

Yeah, I'm not a huge fan of installing random RPMs found about the web system-wide. For now, could you use a version compiled in your $HOME? Down the road, this looks like a good candidate for a shared local repository of custom-compiled libraries and apps...?

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF
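For the $HOME route suggested above, software built with a private install prefix also has to be made visible to later builds and to the dynamic linker. The sketch below prints the usual environment settings for such a prefix. The function name is hypothetical, and the bin/, lib/ and lib/pkgconfig/ layout is the common autotools default, assumed here rather than confirmed for any particular package.

```shell
# Print environment settings that expose software installed under a private
# prefix. Assumes the common autotools layout: bin/, lib/, lib/pkgconfig/.
# 'prefix_env' is a hypothetical helper name.
prefix_env() {
  local prefix="$1"
  printf 'export PATH="%s/bin:$PATH"\n' "$prefix"
  # The ${VAR:+:$VAR} form avoids a trailing ':' when the variable is unset
  printf 'export PKG_CONFIG_PATH="%s/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"\n' "$prefix"
  printf 'export LD_LIBRARY_PATH="%s/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"\n' "$prefix"
}
```

After a build configured with something like `--prefix="$HOME/software/jags"` (a hypothetical path), running `eval "$(prefix_env "$HOME/software/jags")"` in the shell or job script would apply the settings.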

HenrikBengtsson commented 7 years ago

Lmod - missing dependencies

I am attempting to set up Lmod (Lua Environment Modules) and am missing a few dependencies. I'm following the instructions at http://lmod.readthedocs.io/en/latest/030_installing.html.

The following software / libraries are needed:

$ rpm -qa | grep lua
lua-posix-5.1.7-1.el6.x86_64
lua-5.1.4-4.1.el6.x86_64
lua-filesystem-1.4.2-1.el6.x86_64
lua-devel-5.1.4-4.1.el6.x86_64

You will also need the libtcl and tcl packages as well.

@jlbl, could you please install the above?

Validation / knowing when req'd libraries are available

Running the configure of Lmod currently gives:

$ /netapp/home/hb/repositories/ucsf-wynton/apps/manual/lmod/Lmod-7.4/configure --prefix=/tmp/${USER}/lmod
[...]
checking for valid Lua version... 5.1
checking for lua modules: posix

Error: The follow lua module(s) are missing:  posix

You can not run Lmod without:  posix

jlbl commented 7 years ago

On Sat, 3 Jun 2017 at 10:02pm, Henrik Bengtsson wrote

Lmod - missing dependencies

I am attempting to set up Lmod (Lua Environment Modules) and am missing a few dependencies. I'm following the instructions at http://lmod.readthedocs.io/en/latest/030_installing.html.

The following software / libraries are needed:

$ rpm -qa | grep lua
lua-posix-5.1.7-1.el6.x86_64
lua-5.1.4-4.1.el6.x86_64
lua-filesystem-1.4.2-1.el6.x86_64
lua-devel-5.1.4-4.1.el6.x86_64

You will also need the libtcl and tcl packages as well.

@jlbl, could you please install the above?

Are all the packages necessary on all the nodes? Or can we leave some (e.g. lua-devel) off of the compute nodes and put them only on the interactive nodes?

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

The lua-devel package should only be needed to build Lmod, so it should not be needed on the compute nodes. But I suspect the others will be used by Lmod at run time, which will be called from within job scripts as:

module load my_favorite_software/version

jlbl commented 7 years ago

On Mon, 5 Jun 2017 at 9:47pm, Henrik Bengtsson wrote

The lua-devel package should only be needed to build Lmod, so it should not be needed on the compute nodes. But I suspect the others will be used by Lmod at run time, which will be called from within job scripts as:

module load my_favorite_software/version

Alright, I've installed the packages on all the nodes and qb3-login1. Let me know if anything is still missing.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF
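Whether anything is still missing can be checked on any host without re-running Lmod's configure, by asking the Lua interpreter to load the modules directly. A minimal sketch, assuming a `lua` executable is on the PATH; the function name is hypothetical.

```shell
# Print each named Lua module that the 'lua' interpreter fails to load.
# If 'lua' itself is not installed, every module is reported as missing.
# 'missing_lua_modules' is a hypothetical helper name.
missing_lua_modules() {
  local mod
  for mod in "$@"; do
    lua -e "require('$mod')" >/dev/null 2>&1 || printf '%s\n' "$mod"
  done
}
```

For example, `missing_lua_modules posix lfs` checks the modules provided by the lua-posix and lua-filesystem packages (the latter is loaded under the name `lfs`). Submitting the same check as a job would confirm it for a compute node.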

HenrikBengtsson commented 7 years ago

Here's another suite of devel libraries needed for some key R packages:

cyrus-sasl-devel
geos-devel
gpgme-devel
hiredis-devel
leptonica-devel
librsvg2-devel
libsodium-devel
libudunits2-devel   ## hmm... not sure if this exists (error is missing "udunits2")
libxslt-devel
mpfr-devel
openssl-devel
poppler-cpp-devel
postgresql-devel
redland-devel
tcl-devel
tk-devel
tesseract-devel
unixodbc-devel
zeromq3-devel

For now I don't know exactly which of the R packages that need those devel libraries also need the corresponding run-time libraries to run. To be on the safe side, I guess the compute nodes should have ditto without -devel installed (which is why I asked in Issue #7).
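A candidate list of run-time packages for the compute nodes can be derived mechanically from a devel list by dropping the suffix. This is only a rough sketch (the function name is hypothetical). Some run-time libraries are packaged under different names, so the result is a starting point to verify, not a definitive list.

```shell
# Read package names on stdin, one per line, and print the likely run-time
# counterpart of each by dropping a trailing '-devel'. Names without that
# suffix pass through unchanged.
# 'runtime_names' is a hypothetical helper name.
runtime_names() {
  sed 's/-devel$//'
}
```

For instance, `printf '%s\n' geos-devel mpfr-devel | runtime_names` prints `geos` and `mpfr`, one per line.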

PS. I got Lmod to install, so it seems we've got all the required lua-* modules - thxs for that.

HenrikBengtsson commented 7 years ago

UPDATE 2017-06-08: I've added tcl-devel and tk-devel to the 2017-06-07 list of missing libraries.

jlbl commented 7 years ago

On Thu, 8 Jun 2017 at 4:56pm, Henrik Bengtsson wrote

UPDATE 2017-06-08: I've added tcl-devel and tk-devel to the 2017-06-07 list of missing libraries.

Done.

-- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF

HenrikBengtsson commented 7 years ago

Thxs; confirming that these latest libraries made it possible to install another 150 (1%) of the 11,000 R packages on CRAN - all in all I've got ~10,500 of CRAN installed. I still haven't validated run-time on the nodes, but I think we have a pretty decent base of commonly-needed libraries (for R and hopefully Python too) now.

UPDATE: 941 (97.6%) out of 964 Bioconductor packages could now be installed "out-of-the-box".

HenrikBengtsson commented 6 years ago

Here is another set of core software that I think should be available to all users by default:

All of the above exist via yum install.