databio / bulker

Manager for multi-container computing environments
https://bulker.io
BSD 2-Clause "Simplified" License

"/bin/bash: clang: command not found" on OSX 11 #71

Open lwaldron opened 3 years ago

lwaldron commented 3 years ago

I'm currently unable to install packages requiring compiled C code when using my bulker (0.6.0) scripts on macOS Big Sur 11.1. I noticed the problem just after my recent upgrade to macOS 11, but can't guarantee that's when the problem started. Here is an example:

% R -e 'BiocManager::install("S4Vectors")'
WARNING: Published ports are discarded when using host network mode

R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
> BiocManager::install("S4Vectors")
Bioconductor version 3.12 (BiocManager 1.30.10), R 4.0.3 (2020-10-10)
Installing package(s) 'S4Vectors'
trying URL 'https://bioconductor.org/packages/3.12/bioc/src/contrib/S4Vectors_0.28.1.tar.gz'
Content type 'application/x-gzip' length 660079 bytes (644 KB)
==================================================
downloaded 644 KB

Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
* installing *source* package ‘S4Vectors’ ...
** using staged installation
** libs
clang -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c AEbufs.c -o AEbufs.o
/bin/bash: clang: command not found
make: *** [/usr/local/lib/R/etc/Makeconf:172: AEbufs.o] Error 127
ERROR: compilation failed for package ‘S4Vectors’
* removing ‘/usr/local/lib/R/host-site-library/S4Vectors’
* restoring previous ‘/usr/local/lib/R/host-site-library/S4Vectors’

The downloaded source packages are in
    ‘/tmp/RtmpnLQS50/downloaded_packages’
Old packages: 'Biobase', 'GenomicRanges', 'IRanges', 'multtest', 'Rhtslib',
  'XVector', 'zlibbioc', 'rlang'
Warning message:
In install.packages(...) :
  installation of package ‘S4Vectors’ had non-zero exit status
> 

I can fix the problem by getting rid of the " --user=$(id -u):$(id -g)" line in my bulker script, e.g. the following completes normally:

grep -v user `which R` > Rtest 
./Rtest -e 'BiocManager::install("S4Vectors")'

Here is my bulker R script (from the waldronlab/bioconductor bulker crate):

% cat `which R`                                 
#!/bin/sh

docker run --rm --init \
  -it --volume=/Users/lwaldron/R/bioc-release:/usr/local/lib/R/host-site-library -e DISABLE_AUTH=true -p 8787:8787 -v /Users/lwaldron:/home/rstudio \
  --user=$(id -u):$(id -g) \
  --network="host" \
  --env "DISPLAY" \
  --volume "$HOME:$HOME" \
  --volume="/etc/group:/etc/group:ro" \
  --volume="/Users/lwaldron/templates/mac_passwd:/etc/passwd:ro" \
  --volume="/etc/shadow:/etc/shadow:ro"  \
  --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --workdir="`pwd`" \
  waldronlab/bioconductor:release R "$@"%    

FWIW, there is a host clang installation, which isn't in the PATH after entering a bulker shell:

waldronlab/metagenomics|~ % which clang
/usr/local/opt/llvm/bin/clang
waldronlab/metagenomics|~ % _R
Starting interactive docker shell for image 'waldronlab/bioconductor:release' and command 'R'
WARNING: Published ports are discarded when using host network mode
lwaldron@docker-desktop:~$ which clang
lwaldron@docker-desktop:~$ 

clang still isn't in the PATH when removing the --user flag, even though I can now compile packages, which I don't understand. It's also not in the path when repeating on my Linux machine, even though I have no problems there.

waldronlab/metagenomics|~ %  grep -v user `which _R` > _Rtest 
waldronlab/metagenomics|~ % ./_Rtest                         
Starting interactive docker shell for image 'waldronlab/bioconductor:release' and command 'R'
WARNING: Published ports are discarded when using host network mode
root@docker-desktop:/Users/lwaldron# which clang
root@docker-desktop:/Users/lwaldron# R -e 'BiocManager::install("S4Vectors")'

R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
> BiocManager::install("S4Vectors")
Bioconductor version 3.12 (BiocManager 1.30.10), R 4.0.3 (2020-10-10)
Installing package(s) 'S4Vectors'
trying URL 'https://bioconductor.org/packages/3.12/bioc/src/contrib/S4Vectors_0.28.1.tar.gz'
Content type 'application/x-gzip' length 660079 bytes (644 KB)
==================================================
downloaded 644 KB

Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
* installing *source* package ‘S4Vectors’ ...
** using staged installation
** libs
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c AEbufs.c -o AEbufs.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c DataFrame_class.c -o DataFrame_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Hits_class.c -o Hits_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c LLint_class.c -o LLint_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c List_class.c -o List_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c R_init_S4Vectors.c -o R_init_S4Vectors.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Rle_class.c -o Rle_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c Rle_utils.c -o Rle_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c SEXP_utils.c -o SEXP_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c SimpleList_class.c -o SimpleList_class.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c anyMissing.c -o anyMissing.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c character_utils.c -o character_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c eval_utils.c -o eval_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c hash_utils.c -o hash_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c integer_utils.c -o integer_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c logical_utils.c -o logical_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c map_ranges_to_runs.c -o map_ranges_to_runs.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c raw_utils.c -o raw_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c safe_arithm.c -o safe_arithm.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c sort_utils.c -o sort_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c subsetting_utils.c -o subsetting_utils.o
gcc -I"/usr/local/lib/R/include" -DNDEBUG   -I/usr/local/include   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c vector_utils.c -o vector_utils.o
gcc -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o S4Vectors.so AEbufs.o DataFrame_class.o Hits_class.o LLint_class.o List_class.o R_init_S4Vectors.o Rle_class.o Rle_utils.o SEXP_utils.o SimpleList_class.o anyMissing.o character_utils.o eval_utils.o hash_utils.o integer_utils.o logical_utils.o map_ranges_to_runs.o raw_utils.o safe_arithm.o sort_utils.o subsetting_utils.o vector_utils.o -L/usr/local/lib/R/lib -lR
installing to /usr/local/lib/R/host-site-library/00LOCK-S4Vectors/00new/S4Vectors/libs
** R
** inst
** byte-compile and prepare package for lazy loading
Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
in method for ‘normalizeSingleBracketReplacementValue’ with signature ‘"List"’: no definition for class “List”
Creating a new generic function for ‘expand.grid’ in package ‘S4Vectors’
Creating a new generic function for ‘findMatches’ in package ‘S4Vectors’
Creating a generic function for ‘setequal’ from package ‘base’ in package ‘S4Vectors’
in method for ‘coerce’ with signature ‘"Hits","DFrame"’: no definition for class “DFrame”
Creating a generic function for ‘as.factor’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘tabulate’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘cov’ from package ‘stats’ in package ‘S4Vectors’
Creating a generic function for ‘cor’ from package ‘stats’ in package ‘S4Vectors’
Creating a generic function for ‘smoothEnds’ from package ‘stats’ in package ‘S4Vectors’
Creating a generic function for ‘runmed’ from package ‘stats’ in package ‘S4Vectors’
Creating a generic function for ‘nchar’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘substr’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘substring’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘chartr’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘tolower’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘toupper’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘sub’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘gsub’ from package ‘base’ in package ‘S4Vectors’
Creating a generic function for ‘nlevels’ from package ‘base’ in package ‘S4Vectors’
in method for ‘coerce’ with signature ‘"data.table","DFrame"’: no definition for class “data.table”
Creating a generic function for ‘complete.cases’ from package ‘stats’ in package ‘S4Vectors’
** help
*** installing help indices
** building package indices
Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
** installing vignettes
** testing if installed package can be loaded from temporary location
Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
Bioconductor version 3.12 (BiocManager 1.30.10), ?BiocManager::install for help
** testing if installed package keeps a record of temporary installation path
* DONE (S4Vectors)

The downloaded source packages are in
    ‘/tmp/Rtmp6fVfj4/downloaded_packages’
Old packages: 'Biobase', 'GenomicRanges', 'IRanges', 'multtest', 'Rhtslib',
  'XVector', 'zlibbioc', 'rlang'
> 

Just FYI, other than my username being root and home directory being / while invoking docker without the --user flag, file permissions seem to be handled correctly as the host user. Happy to give you remote access to my macOS machine if it'll help.
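A note on the two logs above: the failing run compiles with clang and the successful one with gcc, so R is resolving a different C compiler in each case, not just a different PATH. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is a user-level `~/.R/Makevars` on the macOS host that sets `CC`. R reads that file from the user's home directory, and the bulker script mounts the host home into the container. The sketch below checks for such an override; the function name `makevars_cc` is hypothetical.

```shell
# Print any C compiler override found in a user-level Makevars file.
# $1: home directory to inspect (defaults to $HOME).
# Assumption: the override, if present, lives in <home>/.R/Makevars.
makevars_cc() {
  makevars="${1:-$HOME}/.R/Makevars"
  if [ ! -f "$makevars" ]; then
    echo "no Makevars"
    return 0
  fi
  # Match lines such as "CC=clang" or "CC = clang", but not CXX or CFLAGS.
  grep -E '^[[:space:]]*CC[[:space:]]*=' "$makevars" || echo "no CC override"
}
```

Running `makevars_cc` on the host and again inside the bulker shell, together with `R CMD config CC` inside the container, would show whether the compiler setting comes from the image or from the mounted home directory.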

lwaldron commented 3 years ago

Also, I noticed that my little hack of getting rid of the --user flag doesn't help for my rstudio-server command from the waldronlab/bioconductor crate, which doesn't contain the user flag but nonetheless has the same clang problem:

waldronlab/metagenomics|~ % cat `which rstudio-server`
#!/bin/sh

docker run --rm --init \
  --volume=/Users/lwaldron/R/bioc-release:/usr/local/lib/R/host-site-library -e DISABLE_AUTH=true -p 8787:8787 -v /Users/lwaldron:/home/rstudio \
  --env "DISPLAY" \
  --volume "$HOME:$HOME" \
  --workdir="`pwd`" \
  waldronlab/bioconductor:release   "$@"%                   

nsheff commented 3 years ago

Hmm, is it just a case of needing to not map the user? Are you using no_user mode?

https://bulker.databio.org/en/latest/advanced_templates/

I don't quite understand the final thing about the rstudio-server one though, why that would matter...

lwaldron commented 3 years ago

I'm using the crate directly from https://github.com/databio/hub.bulker.io/blob/master/waldronlab/bioconductor.yaml, so I guess that's why the user mapping isn't present in my rstudio-server bulker script? Here's the full diff on those scripts:

waldronlab/metagenomics|~ % diff `which R` `which rstudio-server`
4,5c4
<   -it --volume=/Users/lwaldron/R/bioc-release:/usr/local/lib/R/host-site-library -e DISABLE_AUTH=true -p 8787:8787 -v /Users/lwaldron:/home/rstudio \
<   --network="host" \
---
>   --volume=/Users/lwaldron/R/bioc-release:/usr/local/lib/R/host-site-library -e DISABLE_AUTH=true -p 8787:8787 -v /Users/lwaldron:/home/rstudio \
8,11d6
<   --volume="/etc/group:/etc/group:ro" \
<   --volume="/Users/lwaldron/templates/mac_passwd:/etc/passwd:ro" \
<   --volume="/etc/shadow:/etc/shadow:ro"  \
<   --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
13c8
<   waldronlab/bioconductor:release R "$@"
---
>   waldronlab/bioconductor:release   "$@"
\ No newline at end of file

(oops, note, I had manually deleted the `--user` flag from the R script above)

nsheff commented 3 years ago

This seems like some kind of problem with clang in the waldronlab/bioconductor:release image

You could just add no_user to that image's settings in your bulker config. That solves the first problem just so you don't have to remove --user manually. But, then it will only work on macOS, right? And it doesn't seem to really solve the underlying issue...

Because I can't explain why behavior is different when you use rstudio-server. That's the same image... so I guess to me this seems like a problem with the image that is outside of bulker. I mean, can you install stuff in that image just using straight-up docker? That's the issue to solve here. I think the user thing gives some kind of a hint that it may have to do with permissions or file locations or something. Unfortunately, I have no experience with clang.
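The "straight-up docker" check could look something like the sketch below, which runs the same install with none of the bulker-generated flags unless they are passed in explicitly. The function name `install_in_plain_container` and the `DOCKER` override variable are hypothetical helpers for illustration; the image name and the R command are the ones used earlier in this thread.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command
# instead of running it.
DOCKER="${DOCKER:-docker}"
IMAGE="waldronlab/bioconductor:release"

# Run the S4Vectors install in a bare container.
# Any arguments are passed through to "docker run", so flags from the
# bulker script can be added back one at a time.
install_in_plain_container() {
  $DOCKER run --rm "$@" "$IMAGE" \
    R -e 'BiocManager::install("S4Vectors")'
}
```

Calling `install_in_plain_container` with no arguments tests the image alone. Calling it again with `--user="$(id -u):$(id -g)"`, and then with the `--volume "$HOME:$HOME"` mount added, would show which flag, if any, brings the clang error back.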