Bioconductor / Rhtslib

HTSlib high-throughput sequencing library as an R package
https://bioconductor.org/packages/Rhtslib

Rhtslib 1.16.2 cannot find libraries (R 3.6.1, Spack installation) #12

Closed: jessepharrison closed this issue 5 years ago

jessepharrison commented 5 years ago

Hello,

I am running R 3.6.1 installed via spack on Linux, with both zlib and bzip2 installed (the necessary header files are present, although in non-standard locations). There have been some reports of missing files / libraries for previous Rhtslib versions and I am running into similar problems when trying to install Rhtslib 1.16.2 or the devel version (1.17.6).

If I run the regular BiocManager installation command:

BiocManager::install("Rhtslib", lib = libpath, update = FALSE)

... I get an error concerning bzlib.h:

cram/cram_io.c:57:10: fatal error: bzlib.h: No such file or directory

This is not fixed by passing CPPFLAGS, CFLAGS or LDFLAGS to the above installation command via configure.vars=...

If I manually change the flags in Makefile and Makefile.Rhtslib, repackage the source files and install using R CMD INSTALL, that fixes the bzlib.h issue but I still run into a further missing library problem. The changes I made consisted of:

The build is then able to find bzlib.h but fails with the following linker error: cannot find -lbz2

So far I've found no solution to this problem. I've also tried setting the flags directly as part of the R CMD INSTALL command. Would anyone have suggestions for fixing the issue? The relevant bzip2 directories are also included in my LD_LIBRARY_PATH and PATH.

Aside from the problems with the R package, I am able to install HTSlib itself as follows, so the issue must be specific to the R package installation:

CFLAGS="-I/path/to/include/" LIBS="-L/path/to/lib -lz -lm -lbz2 -llzma -lcurl" ./configure --prefix=/destination/directory --enable-libcurl

... followed by make && make install.

Best wishes, Jesse Harrison

hpages commented 5 years ago

Hi,

The Rhtslib package doesn't use a configure script because it is assumed that R "knows" where to find the libbz2 library and its header files. This is why using configure.vars=... when installing the package has no effect.

More precisely, the Makefile used to compile Rhtslib on Linux or Mac (Rhtslib/src/htslib-1.7/Makefile.Rhtslib) re-uses the LDFLAGS and CPPFLAGS values set by R itself. This is why they are commented out in Makefile.Rhtslib. You can see the values set by R for these variables by running R CMD config LDFLAGS and R CMD config CPPFLAGS on your system. Here is what I get on my system (64-bit Ubuntu Linux 16.04):

hpages@spectre:~$ R CMD config LDFLAGS
-L/usr/local/lib
hpages@spectre:~$ R CMD config CPPFLAGS
-I/usr/local/include

If R was configured properly, you should be able to use these flags to compile/link the following simple program that uses libbz2:

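test.c file:

#include <stdio.h>
#include <bzlib.h>

/* Minimal program to check that bzlib.h and libbz2 can be found. */
int main(void) {
    char buf[5];
    BZFILE *file = BZ2_bzopen("some/path", "rb");
    if (file == NULL) {
        fprintf(stderr, "could not open some/path\n");
        return 1;
    }
    BZ2_bzread(file, buf, 1);
    BZ2_bzclose(file);
    return 0;
}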

Try to compile/link test.c using the LDFLAGS and CPPFLAGS set by R:

gcc -Wall `R CMD config CPPFLAGS` test.c `R CMD config LDFLAGS` -lbz2

On my system the a.out executable produced by the above command is linked to the following dynamic libraries:

hpages@spectre:~$ ldd a.out 
    linux-vdso.so.1 =>  (0x00007fffdafad000)
    libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f200ec6f000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f200e8a5000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f200ee7f000)

As you can see, in my case libbz2 is in a standard location so adding the CPPFLAGS and LDFLAGS to the command I used to compile/link test.c was not necessary (i.e. gcc -Wall test.c -lbz2 also works fine). However, on systems where libbz2 is NOT in a standard location, R should "know" where to find the library and its header files and this should be reflected in the output of R CMD config LDFLAGS and R CMD config CPPFLAGS. If trying to compile/link test.c using the LDFLAGS and CPPFLAGS set by R doesn't work for you, it probably means that R was not configured properly, or that libbz2 was installed to a non-standard location after R was configured.
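As a rough sketch of the check described above, the following shell snippet tests whether the flags reported by R mention the directories where bzlib.h and libbz2 live. The helper name has_flag and the /path/to/... locations are placeholders made up for illustration; they are not part of R or Rhtslib. The helper only recognizes the exact form -I<dir> or -L<dir>, so a flag written differently (e.g. with a trailing slash) would not be matched.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the flag string $1 contains the
# option formed by prefix $2 and directory $3 (e.g. -I/opt/include).
has_flag() {
    for word in $1; do
        [ "$word" = "$2$3" ] && return 0
    done
    return 1
}

# Placeholder locations: replace with the directories that actually
# contain bzlib.h and libbz2 on your system.
BZ2_INC="/path/to/include"
BZ2_LIB="/path/to/lib"

# Only query R if it is on the PATH.
if command -v R >/dev/null 2>&1; then
    has_flag "$(R CMD config CPPFLAGS)" -I "$BZ2_INC" \
        || echo "R's CPPFLAGS do not mention $BZ2_INC"
    has_flag "$(R CMD config LDFLAGS)" -L "$BZ2_LIB" \
        || echo "R's LDFLAGS do not mention $BZ2_LIB"
fi
```

If either message is printed, R was most likely configured before the library was installed at that location, which matches the situation described above.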

Could the problem be related to the way R got installed on your system? I know nothing about Spack or installing R via Spack.

H.

jessepharrison commented 5 years ago

Hi,

Thanks for the detailed answer, which helped me troubleshoot this. The R installation I'm using relies on several separately installed libraries (some of which were added only after R itself). I'm working in a cluster environment without root access and, in the end, requesting RPM installations of bzip2-devel and zlib-devel made it possible to go ahead with the installation.

For future development work, it would be great if there were a version of Rhtslib that allowed pointing to different locations using configure.vars. This would make it easier to handle situations where dependencies are installed after R itself (and where RPM installations aren't straightforward due to restrictions on user rights).

Best wishes, Jesse

hpages commented 5 years ago

Glad you were able to compile Rhtslib on your cluster.

Having an R installation that relies on libraries that were added only after R itself sounds like a recipe for some serious headaches down the road. I'm not sure I understand how R could be configured and work properly if it doesn't "know" where to find the bz2 or lzma libs. Some basic R functionality depends on these libraries.

Since Rhtslib doesn't use a configure script and I don't plan on adding one for the moment, it's not possible to point to alternative bz2 or lzma locations via configure.vars=... for now. A simple workaround would be to rely on some user-defined environment variables for that and use them in Rhtslib/src/htslib-1.7/Makefile.Rhtslib:

hpages@spectre:~/git.bioconductor.org/software/Rhtslib/src/htslib-1.7$ git diff
diff --git a/src/htslib-1.7/Makefile.Rhtslib b/src/htslib-1.7/Makefile.Rhtslib
index b4b78fb..5a6501f 100644
--- a/src/htslib-1.7/Makefile.Rhtslib
+++ b/src/htslib-1.7/Makefile.Rhtslib
@@ -37,13 +37,13 @@
 # Default libraries to link if configure is not used
 htslib_default_libs = -lz -lm -lbz2 -llzma -lcurl

-CPPFLAGS += -D_FILE_OFFSET_BITS=64
+CPPFLAGS += -D_FILE_OFFSET_BITS=64 $(BZ2_INCLUDE_DIR) ${LZMA_INCLUDE_DIR}
 # TODO: probably update cram code to make it compile cleanly with -Wc++-compat
 # For testing strict C99 support add -std=c99 -D_XOPEN_SOURCE=600
 #CFLAGS   = -g -Wall -O2 -pedantic -std=c99 -D_XOPEN_SOURCE=600 -D__FUNCTION__=__func__
 CFLAGS += -fpic
 EXTRA_CFLAGS_PIC =
-#LDFLAGS  =
+LDFLAGS += $(BZ2_LIB_DIR) ${LZMA_LIB_DIR}
 LIBS     = $(htslib_default_libs)

 prefix      = /usr/local
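
With a patch along these lines, the four variables would be picked up from the environment by make. Since the Makefile appends them verbatim to CPPFLAGS and LDFLAGS, they presumably need to carry the -I and -L prefixes themselves. A minimal sketch, assuming the libraries live under the placeholder prefixes /path/to/bzip2 and /path/to/xz (BZ2_PREFIX and LZMA_PREFIX are names invented for this example):

```shell
#!/bin/sh
# Placeholder installation prefixes: adjust to your system.
BZ2_PREFIX="/path/to/bzip2"
LZMA_PREFIX="/path/to/xz"

# The patched Makefile.Rhtslib adds these values as-is, so each one
# must already be a complete compiler/linker option.
export BZ2_INCLUDE_DIR="-I$BZ2_PREFIX/include"
export BZ2_LIB_DIR="-L$BZ2_PREFIX/lib"
export LZMA_INCLUDE_DIR="-I$LZMA_PREFIX/include"
export LZMA_LIB_DIR="-L$LZMA_PREFIX/lib"

echo "CPPFLAGS additions: $BZ2_INCLUDE_DIR $LZMA_INCLUDE_DIR"
echo "LDFLAGS additions:  $BZ2_LIB_DIR $LZMA_LIB_DIR"

# Then, from the directory that contains the patched source tree:
# R CMD INSTALL Rhtslib
```

Variables that are left unset simply expand to nothing in the Makefile, so the default behaviour is unchanged for users who don't need them.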

Would that work?

H.

jessepharrison commented 5 years ago

Yes! That also works (and should also help other users with libraries installed in odd locations...). Many thanks for the help and quick answers; we can close this issue. - Jesse