Bioconductor / Rhtslib

HTSlib high-throughput sequencing library as an R package
https://bioconductor.org/packages/Rhtslib
11 stars 12 forks source link

Failure to link to Rhtslib when library path contains spaces #13

Closed grimbough closed 5 years ago

grimbough commented 5 years ago

I'm encountering issues with linking to Rhtslib when the library path has spaces in it. It seems like wrapping the library and include paths in " when calling pkgconfig() is not sufficient and the space is still interpreted as a break when linking. The example below should allow you to reproduce this by installing a temporary version of Rhtslib and then Rsamtools:

dir.create('/tmp/r-lib-with space')
.libPaths('/tmp/r-lib-with space/')

install.packages('BiocManager', lib = '/tmp/r-lib-with space/')
BiocManager::install('Rsamtools', lib = '/tmp/r-lib-with space/')

The final step eventually fails with:

gcc -I"/home/msmith/Applications/R/R-3.6.1/include" -DNDEBUG `echo 'Rhtslib::pkgconfig("PKG_CPPFLAGS")'| "/home/msmith/Applications/R/R-3.6.1/bin/R" --vanilla --slave` -I"/tmp/r-lib-with space/Rhtslib/include" -I"/tmp/r-lib-with space/S4Vectors/include" -I"/tmp/r-lib-with space/IRanges/include" -I"/tmp/r-lib-with space/XVector/include" -I"/tmp/r-lib-with space/Biostrings/include" -I/usr/local/include  -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno -c Biostrings_stubs.c -o Biostrings_stubs.o
gcc: error: space/Rhtslib/include": No such file or directory

I'm seeing the same problem with rhdf5/Rhdf5lib (https://github.com/grimbough/rhdf5/issues/45) and can't figure the solution.

It seems like something gets interpreted differently when passing this via the Makevars file, because the same path is used during the installation of Rhtslib itself without issue e.g.

gcc -I"/home/msmith/Applications/R/R-3.6.1/include" -DNDEBUG -I"/tmp/r-lib-with space/Rhtslib/include" -I"/tmp/r-lib-with space/zlibbioc/include" -I/usr/local/include  -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno -c R_init_Rhtslib.c -o R_init_Rhtslib.o

I'd be grateful for any insights!

hpages commented 5 years ago

I think it boils down to this: how do you define shell variable aa for this to work?

hpages@spectre:~$ mkdir "my dir"
hpages@spectre:~$ ls -l -d "my dir"
drwxrwxr-x 2 hpages hpages 4096 Oct  9 09:32 my dir
hpages@spectre:~$ aa='-d "my dir"'
hpages@spectre:~$ ls -l $aa
ls: cannot access '"my': No such file or directory
ls: cannot access 'dir"': No such file or directory
hpages@spectre:~$ ls -l -d my\ dir
drwxrwxr-x 2 hpages hpages 4096 Oct  9 09:36 my dir
hpages@spectre:~$ aa='-d my\ dir'
hpages@spectre:~$ ls -l $aa
ls: cannot access 'my\': No such file or directory
ls: cannot access 'dir': No such file or directory

I'm trying to refresh my memory about some important subtleties of shell parameter expansion and word splitting by reading man bash but I have a bad feeling about this.

The other direction I want to explore is to see if there is a way to have PKG_CPPFLAGS get evaluated before being injected in the gcc command. Right now it's replaced literally with

`echo 'Rhtslib::pkgconfig("PKG_CPPFLAGS")'| "/home/hpages/R/R-3.6.r76454/bin/R" --vanilla --slave`

and this gets evaluated later by the shell where the gcc command is executed. If PKG_CPPFLAGS could somehow be evaluated before the injection in the gcc command, I think that would solve the problem.

hpages commented 5 years ago

I think evaluating PKG_CPPFLAGS early can be done with something like:

RHTSLIB_LIBS=$(shell echo 'Rhtslib::pkgconfig("PKG_LIBS")'|\
    "${R_HOME}/bin/R" --vanilla --slave)

That's actually what we do in Rsamtools on Windows (Rsamtools/src/Makevars.win). Trying this on Linux and Mac now...

grimbough commented 5 years ago

Thanks. I've spent the afternoon trying to read up on how Make files variables are evaluated and passed to shells. I got to the same conclusion as you regarding the literal insertion of'Rhtslib::pkgconfig("PKG_CPPFLAGS")'...

grimbough commented 5 years ago

Neat, I think that works.

I added $(info $$PKG_LIBS is [${PKG_LIBS}]) to Makevars to print the value during package installation, and before it gave the literal command, but now it's:

$PKG_LIBS is ["-L/tmp/R-lib temp/Rhdf5lib/lib" -lhdf5_cpp -lhdf5 -lszip -lz]

hpages commented 5 years ago

yeah, just tried this on Linux and it works. The neat thing about this is that Rsamtools/src/Makevars.win is no longer needed (the only difference between Rsamtools/src/Makevars and Rsamtools/src/Makevars.win was exactly that: literal substitution vs early evaluation). I think R CMD check will complain that $(shell ...) is not portable but AFAIK it has never caused problems in practice. I'd be interesting to know what's the portable alternative.

About to commit this change to Rsmatools and VariantAnnotation. There is about a dozen packages that link to Rhtslib and all of them would benefit from this simple change.

mtmorgan commented 5 years ago

Back-ticks for (non-Windows) portability? I guess shell is sanctioned on windows (and Makevars.win) because there is only one supported tool chain. Only the messenger.

hpages commented 5 years ago

@mtmorgan The problem is that back-ticks and $(shell ...) are not equivalent. Back-ticks is what we've been using so far but the problem with them is that the evaluation is delayed which is what is causing problems when paths in PKG_LIBS or PKG_CPPFLAGS contain whitespaces.

So the question is: what's the portable alternative of $(shell ...) that provides the same behavior as $(shell ...), i.e. that provides early evaluation i.e. evaluation of PKG_LIBS and PKG_CPPFLAGS before their injection in the gcc command?

grimbough commented 5 years ago

On the top of $(shell ...) Writing R Extensions says:

If you really must require GNU make, declare it in the DESCRIPTION file by SystemRequirements: GNU make

which I already have in Rhdf5lib and I don't think I've ever encountered an issue where that requirement turned out to be the culprit, so perhaps that's sufficient.

hpages commented 5 years ago

Good! A small price to pay to be considered a good citizen. Just one more thing that all packages that compile and link against Rhtslib will also be supposed to do (I need to update Rhtslib's vignette to reflect that). Most of them are outlaws at the moment: they use $(shell ...) on Windows but don't have SystemRequirements: GNU make.

hpages commented 5 years ago

Unfortunately SystemRequirements: GNU make does not make the R CMD check warning go away. It's unfortunate that packages that compile and link against Rhtslib have to choose between not supporting whitespaces in Rhtslib's installation paths vs using $(shell ...) and getting a spurious warning.

hpages commented 5 years ago

Bug submitted: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17626