bnprks / BPCells

Scaling Single Cell Analysis to Millions of Cells
https://bnprks.github.io/BPCells

cannot find hdf5.h when installing into a conda env (windows). #126

Open littlemingone opened 2 months ago

littlemingone commented 2 months ago

I am a Windows user and tried to install BPCells in a conda R environment. R version 4.4.1, installed from conda-forge. Rtools44 is installed.

error log:

remotes::install_github("bnprks/BPCells/r")
Using github PAT from envvar GITHUB_PAT. Use `gitcreds::gitcreds_set()` and unset GITHUB_PAT in .Renviron (or elsewhere) if you want to use 
the more secure git credential store instead.
Downloading GitHub repo bnprks/BPCells@HEAD
bnprks-BPCells-d409a68/cpp/bpcells-cpp: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpygrNhm\\remotes43ec8fb7af5\\bnprks-BPCells-d409a68\\cpp\\bpcells-cpp'
bnprks-BPCells-d409a68/cpp/vendor: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpygrNhm\\remotes43ec8fb7af5\\bnprks-BPCells-d409a68\\cpp\\vendor'
bnprks-BPCells-d409a68/python/src/bpcells-cpp: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpygrNhm\\remotes43ec8fb7af5\\bnprks-BPCells-d409a68\\python\\src\\bpcells-cpp'
bnprks-BPCells-d409a68/python/src/vendor: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpygrNhm\\remotes43ec8fb7af5\\bnprks-BPCells-d409a68\\python\\src\\vendor'
tar.exe: Error exit delayed from previous errors.
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: cpp11     (0.4.7     -> 0.5.0    ) [CRAN]
5: RcppEigen (0.3.4.0.1 -> 0.3.4.0.2) [CRAN]

Enter one or more numbers, or an empty line to skip updates: 3        
-- R CMD build -------------------------------------------------------v  checking for file 'C:\Users\PC\AppData\Local\Temp\RtmpygrNhm\remotes43ec8fb7af5\bnprks-BPCells-d409a68\r/DESCRIPTION'
-  preparing 'BPCells': (2.6s)
v  checking DESCRIPTION meta-information ...
-  cleaning src
-  checking for LF line-endings in source and make files and shell scripts (403ms)
-  checking for empty or unneeded directories (472ms)
-  building 'BPCells_0.2.0.tar.gz'
   Warning: file 'BPCells/cleanup' did not have execute permissions: corrected
   Warning: file 'BPCells/configure' did not have execute permissions: corrected

* installing *source* package 'BPCells' ...
** using staged installation
Testing hdf5 by compiling example program...
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory 
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.

Retrying without -lsz flag...
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory 
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.

Unable to locate libhdf5. Please install manually or edit compiler flags.
ERROR: configuration failed for package 'BPCells'
* removing 'E:/Program/miniforge3/envs/R-4.4.1/lib/R/library/BPCells' 
Warning messages:
1: In utils::untar(tarfile, ...) :
  'tar.exe -xf "C:\Users\PC\AppData\Local\Temp\RtmpygrNhm\file43ec81a8f17bc.tar.gz" -C "C:/Users/PC/AppData/Local/Temp/RtmpygrNhm/remotes43ec8fb7af5"' returned error code 1
2: In i.p(...) :
  installation of package 'C:/Users/PC/AppData/Local/Temp/RtmpygrNhm/file43ec81aa0269b/BPCells_0.2.0.tar.gz' had non-zero exit status  

R package hdf5r was installed. The Read10X_h5() function from Seurat worked fine. So I have no idea why it cannot find hdf5.h.

I also added the Rtools directory that contains hdf5.h, E:\Program\rtools44\x86_64-w64-mingw32.static.posix\include, to the system PATH, but it didn't work.

The conda package hdf5 is installed as well, but BPCells still cannot find hdf5.h.

I also tried a non-conda R installed with the CRAN R installer, and BPCells installed successfully there.

By the way, the bnprks-BPCells-d409a68/cpp/bpcells-cpp: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpygrNhm\\remotes43ec8fb7af5\\bnprks-BPCells-d409a68\\cpp\\bpcells-cpp' warnings disappear when I use a terminal with Administrator privileges, so I don't think they are the key problem.

bnprks commented 2 months ago

Hi @littlemingone, thanks for your question. I won't have access to a windows computer for the next two weeks so I probably won't be able to make fixes myself to support installation within conda on windows until after then.

One other option available for windows is to install a pre-built binary package:

install.packages("BPCells", repos = c("https://bnprks.r-universe.dev", "https://cloud.r-project.org"))

I haven't tested the windows build recently, but there's a good chance it will work.

I was not even aware that conda offered a non-standard R package installation process, so the current configure.win is just designed to work with the standard Rtools setup. The fix might be as simple as incorporating some of the smarter logic from the Mac/Linux configure file, but I won't be able to do it myself for a bit.

In the meantime, I'd suggest either trying the r-universe option I described above or just installing outside of a conda environment, which you say worked successfully.

littlemingone commented 2 months ago

Thanks for the quick reply! The prebuilt version works, in the conda env. I can use it now! 😆

To solve the installation problem, I think the easiest way is to provide prebuilt packages on CRAN/Bioconductor and Anaconda.

99.9% of users will only use BPCells on Linux, Windows and macOS, so it shouldn't be a huge amount of work. Also, given the extreme memory requirements of single-cell analysis, I recommend BPCells to everyone analyzing single-cell data. A prebuilt package would help a lot.

Back to my problem itself. To get more detailed information, I retried the installation in DEBUG mode.

debug log of BPCells install in conda
 
>  Sys.setenv(BPCELLS_DEBUG_INSTALL="true")

> remotes::install_github("bnprks/BPCells/r")
Using github PAT from envvar GITHUB_PAT. Use `gitcreds::gitcreds_set()` and unset GITHUB_PAT in .Renviron (or elsewhere) if you want to use the more secure 
git credential store instead.
Downloading GitHub repo bnprks/BPCells@HEAD
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: cpp11     (0.4.7     -> 0.5.0    ) [CRAN]
5: RcppEigen (0.3.4.0.1 -> 0.3.4.0.2) [CRAN]

Enter one or more numbers, or an empty line to skip updates: 3
-- R CMD build ---------------------------------------------------------------------------------------------------------------------------------------------v  checking for file 'C:\Users\PC\AppData\Local\Temp\RtmpErVzgN\remotes3d63020a45877\bnprks-BPCells-d409a68\r/DESCRIPTION'
-  preparing 'BPCells':
v  checking DESCRIPTION meta-information ... 
-  cleaning src
-  checking for LF line-endings in source and make files and shell scripts (412ms)
-  checking for empty or unneeded directories (410ms)
-  building 'BPCells_0.2.0.tar.gz'
   Warning: file 'BPCells/cleanup' did not have execute permissions: corrected
   Warning: file 'BPCells/configure' did not have execute permissions: corrected

* installing *source* package 'BPCells' ...
** using staged installation
+ '[' -z true ']'
+ ERR=/dev/stdout
+ set -x
+ ENABLE_INSTALL_COUNTING=yes
+ '[' -n yes ']'
+ curl --silent https://plausible.benparks.net/flask-plausible/bpcells-configure
++ E:/Program/miniforge3/envs/R-4.4.1/lib/R/bin/R CMD config CC
+ CC=x86_64-w64-mingw32-gcc
++ E:/Program/miniforge3/envs/R-4.4.1/lib/R/bin/R CMD config CXX
+ CXX='x86_64-w64-mingw32-g++ -std=gnu++17'
++ E:/Program/miniforge3/envs/R-4.4.1/lib/R/bin/R CMD config CFLAGS
+ CFLAGS='-O2 -Wall -march=x86-64 -mtune=generic'
++ E:/Program/miniforge3/envs/R-4.4.1/lib/R/bin/R CMD config CXXFLAGS
+ CXXFLAGS='-O2 -Wall -march=x86-64 -mtune=generic'
++ E:/Program/miniforge3/envs/R-4.4.1/lib/R/bin/R CMD config LDFLAGS
+ LDFLAGS=
+ echo 'Testing hdf5 by compiling example program...'
Testing hdf5 by compiling example program...
+ HDF5_CFLAGS=
+ HDF5_LIBS='-lhdf5 -lz -lsz'
+ HDF5_OK=
+ x86_64-w64-mingw32-gcc tools/h5write.c -lhdf5 -lz -lsz -o tools/h5write
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.
+ '[' -z ']'
+ printf '\n\nRetrying without -lsz flag...\n'

Retrying without -lsz flag...
+ HDF5_LIBS='-lhdf5 -lz'
+ x86_64-w64-mingw32-gcc tools/h5write.c -lhdf5 -lz -o tools/h5write
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.
+ '[' -z ']'
+ printf '\n\nUnable to locate libhdf5. Please install manually or edit compiler flags.\n'

Unable to locate libhdf5. Please install manually or edit compiler flags.
+ exit 1
ERROR: configuration failed for package 'BPCells'
* removing 'E:/Program/miniforge3/envs/R-4.4.1/lib/R/library/BPCells'
Warning message:
In i.p(...) :
  installation of package 'C:/Users/PC/AppData/Local/Temp/RtmpErVzgN/file3d6303264
  

To confirm that the problem is not caused by some weird system variables or incorrect settings on my computer, I tried another Windows machine. In short, the results are the same: compilation works fine in a normally installed R, but hdf5.h cannot be found in a conda R.
By the way, the install script seems to ignore some Windows quirks. If R is installed in a path containing whitespace, such as D:\Program Files, the make step splits the path at the space and fails with a wrong path.

debug log of BPCells install in another machine
 
> Sys.setenv(BPCELLS_DEBUG_INSTALL="true")

> remotes::install_github("bnprks/BPCells/r")
Using github PAT from envvar GITHUB_PAT. Use `gitcreds::gitcreds_set()` and unset GITHUB_PAT in .Renviron (or elsewhere) if you want to use the more secure git credential store instead.
Downloading GitHub repo bnprks/BPCells@HEAD
Running `R CMD build`...
* checking for file 'C:\Users\littl\AppData\Local\Temp\Rtmp8QpPFL\remotes276c1ed970c6\bnprks-BPCells-b22be61\r/DESCRIPTION' ... OK
* preparing 'BPCells':
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building 'BPCells_0.2.0.tar.gz'
* installing *source* package 'BPCells' ...
** using staged installation
+ '[' -z true ']'
+ ERR=/dev/stdout
+ set -x
+ ENABLE_INSTALL_COUNTING=yes
+ '[' -n yes ']'
+ curl --silent https://plausible.benparks.net/flask-plausible/bpcells-configure
++ D:/miniforge3/envs/test/lib/R/bin/R CMD config CC
+ CC=x86_64-w64-mingw32-gcc
++ D:/miniforge3/envs/test/lib/R/bin/R CMD config CXX
+ CXX='x86_64-w64-mingw32-g++ -std=gnu++17'
++ D:/miniforge3/envs/test/lib/R/bin/R CMD config CFLAGS
+ CFLAGS='-O2 -Wall -march=x86-64 -mtune=generic'
++ D:/miniforge3/envs/test/lib/R/bin/R CMD config CXXFLAGS
+ CXXFLAGS='-O2 -Wall -march=x86-64 -mtune=generic'
++ D:/miniforge3/envs/test/lib/R/bin/R CMD config LDFLAGS
+ LDFLAGS=
+ echo 'Testing hdf5 by compiling example program...'
Testing hdf5 by compiling example program...
+ HDF5_CFLAGS=
+ HDF5_LIBS='-lhdf5 -lz -lsz'
+ HDF5_OK=
+ x86_64-w64-mingw32-gcc tools/h5write.c -lhdf5 -lz -lsz -o tools/h5write
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.
+ '[' -z ']'
+ printf '\n\nRetrying without -lsz flag...\n'

Retrying without -lsz flag...
+ HDF5_LIBS='-lhdf5 -lz'
+ x86_64-w64-mingw32-gcc tools/h5write.c -lhdf5 -lz -o tools/h5write
tools/h5write.c:19:10: fatal error: hdf5.h: No such file or directory
   19 | #include "hdf5.h"
      |          ^~~~~~~~
compilation terminated.
+ '[' -z ']'
+ printf '\n\nUnable to locate libhdf5. Please install manually or edit compiler flags.\n'

Unable to locate libhdf5. Please install manually or edit compiler flags.
+ exit 1
ERROR: configuration failed for package 'BPCells'
* removing 'D:/miniforge3/envs/test/Lib/R/library/BPCells'
Warning message:
In i.p(...) :
  installation of package 'C:/Users/littl/AppData/Local/Temp/Rtmp8QpPFL/file276c279211e/BPCells_0.2.0.tar.gz' had non-zero exit status
  

Personally, I suspect that the system is not picking up the files in Rtools correctly. My Rtools44 is installed outside the conda env because conda-forge doesn't provide an Rtools package. All I know about how R locates Rtools is that it uses an environment variable named RTOOLS44_HOME. I don't know how other R packages find the executables they need through RTOOLS44_HOME; maybe through some function from pkgbuild? The BPCells installation, however, seems to call make or sh.exe directly from the default system terminal, so anything outside $PATH cannot be found. For example, the installation fails at the sh.exe step unless I add rtools_path\usr\bin to my PATH. So I guess hdf5.h may be missing for the same reason, though it still throws the same error after I add the Rtools directory containing hdf5.h to $PATH, $LIB, and $INCLUDE.

I am a pure R user and know nothing about C++ or the detail compiler process, so these are all I can do.
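For anyone who wants to check the PATH question above on their own machine, here is a minimal base-R sketch. The helper name `check_build_tools` and the list of tools are my own assumptions for illustration; they are not part of BPCells, pkgbuild, or Rtools. It only reports which build tools R can resolve on the current PATH and does not fix anything.

```r
# Hypothetical helper (not part of BPCells or pkgbuild): report where R
# resolves each build tool on the current PATH. Sys.which() returns ""
# for a tool that cannot be found.
check_build_tools <- function(tools = c("sh", "bash", "make", "gcc", "ar")) {
  resolved <- Sys.which(tools)
  data.frame(
    tool = tools,
    path = unname(resolved),
    found = unname(nzchar(resolved)),
    stringsAsFactors = FALSE
  )
}

# Tools that show found = FALSE are not visible to a configure script
# launched from this R session.
print(check_build_tools())

# Rtools location as this R session sees it; empty if the variable is unset.
Sys.getenv("RTOOLS44_HOME")
```

If `sh` or `make` comes back as not found, adding the Rtools `usr\bin` folder to PATH (as described above) is the first thing to try.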

bnprks commented 2 months ago

Glad you've got a version you can use in conda. Thank you for the debug logs and for pointing out the problem with paths containing spaces. That should be a helpful starting point for me once I am back from vacation and can test with a Windows machine. I'll post any updates here with what I find.

bnprks commented 2 months ago

I've had a chance to do some testing on a windows machine, and I think I've identified a set of environment variables that make the installation work for me. This worked on my machine:

old_path <- Sys.getenv("PATH")
conda_path <- Sys.getenv("CONDA_PREFIX")
Sys.setenv(
    PATH=paste0("C:/rtools44/usr/bin", old_path),
    CPATH=file.path(conda_path, "Library/include"),
    LIBRARY_PATH=file.path(conda_path, "Library/lib")
)
remotes::install_github("bnprks/BPCells/r")
  1. I had to adjust the PATH so that a working bash would be found (my system was defaulting to running WSL2 for bash, so I changed it to use bash.exe from rtools44).
  2. I had to add the location to look for header files and compiled libraries within the conda environment (the CPATH and LIBRARY_PATH variables)

This setup also requires having hdf5 installed via conda. Using it for someone else might require adjusting the rtools44 path to match your version and location of rtools (the listed folder should contain bash.exe).

This has a good chance of also working on your setup, though I can't be sure there's not some critical difference between your setup and mine. This isn't quite as clean a solution as I'd like, but hopefully it allows compilation from within conda on windows.
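Before retrying the install, it may help to confirm that the folders named in CPATH and LIBRARY_PATH actually contain hdf5. The sketch below is only an illustration: `check_conda_hdf5` is a hypothetical helper, and it assumes the `Library/include` and `Library/lib` layout used in the snippet above.

```r
# Hypothetical pre-flight check (not part of BPCells): look for hdf5.h and
# the library folder inside a conda environment, assuming the
# Library/include and Library/lib layout from the workaround above.
check_conda_hdf5 <- function(prefix = Sys.getenv("CONDA_PREFIX")) {
  include_dir <- file.path(prefix, "Library", "include")
  lib_dir <- file.path(prefix, "Library", "lib")
  list(
    include_dir = include_dir,
    lib_dir = lib_dir,
    # An empty prefix means no conda environment is active in this session.
    has_header = nzchar(prefix) && file.exists(file.path(include_dir, "hdf5.h")),
    has_lib_dir = nzchar(prefix) && dir.exists(lib_dir)
  )
}

str(check_conda_hdf5())
```

If `has_header` is FALSE, the hdf5 conda package is probably missing from the active environment, or CONDA_PREFIX is not set in the R session.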

littlemingone commented 2 months ago

The CPATH and LIBRARY_PATH settings worked: hdf5 was found, but another error was thrown. This is a newly created conda env with only r-base 4.4.1 and the dependencies of BPCells.

R log
 
> old_path <- Sys.getenv("PATH")
  conda_path <- Sys.getenv("CONDA_PREFIX")
  Sys.setenv(
      PATH=paste0("E:/Program/rtools44/usr/bin;", old_path),
      CPATH=file.path(conda_path, "Library/include"),
      LIBRARY_PATH=file.path(conda_path, "Library/lib")
  )

> remotes::install_github("bnprks/BPCells/r")
Using github PAT from envvar GITHUB_PAT. Use `gitcreds::gitcreds_set()` and unset GITHUB_PAT in .Renviron (or elsewhere) if you want to use the more secure git credential store instead.
Downloading GitHub repo bnprks/BPCells@HEAD
bnprks-BPCells-b22be61/cpp/bpcells-cpp: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpQjUp6Q\\remotes55785ecf4222\\bnprks-BPCells-b22be61\\cpp\\bpcells-cpp'
bnprks-BPCells-b22be61/cpp/vendor: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpQjUp6Q\\remotes55785ecf4222\\bnprks-BPCells-b22be61\\cpp\\vendor'
bnprks-BPCells-b22be61/python/src/bpcells-cpp: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpQjUp6Q\\remotes55785ecf4222\\bnprks-BPCells-b22be61\\python\\src\\bpcells-cpp'
bnprks-BPCells-b22be61/python/src/vendor: Can't create '\\\\?\\C:\\Users\\PC\\AppData\\Local\\Temp\\RtmpQjUp6Q\\remotes55785ecf4222\\bnprks-BPCells-b22be61\\python\\src\\vendor'
tar.exe: Error exit delayed from previous errors.
Running `R CMD build`...
* checking for file 'C:\Users\PC\AppData\Local\Temp\RtmpQjUp6Q\remotes55785ecf4222\bnprks-BPCells-b22be61\r/DESCRIPTION' ... OK
* preparing 'BPCells':
* checking DESCRIPTION meta-information ... OK
* cleaning src
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building 'BPCells_0.2.0.tar.gz'
* installing *source* package 'BPCells' ...
** using staged installation
Testing hdf5 by compiling example program...
E:/Program/miniforge3/envs/test/Library/bin/../lib/gcc/x86_64-w64-mingw32/14.1.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -lsz: No such file or directory
collect2.exe: error: ld returned 1 exit status

Retrying without -lsz flag...
Found working hdf5
HDF5_CFLAGS=''
HDF5_LIBS='-lhdf5 -lz'

Testing C++17 filesystem feature support...
Testing availability of highway SIMD library...
Building highway SIMD library from sourcesrc/vendor/highway/manual-build/build_highway.sh: line 202: ar: command not found
** libs
using C++ compiler: 'x86_64-w64-mingw32-g++.exe (conda-forge gcc 14.1.0-1) 14.1.0'
x86_64-w64-mingw32-g++ -std=gnu++17  -I"E:/Program/miniforge3/envs/test/lib/R/include" -DNDEBUG  -I'E:/Program/miniforge3/envs/test/Lib/R/library/Rcpp/include' -I'E:/Program/miniforge3/envs/test/Lib/R/library/RcppEigen/include'   -I"E:/Program/miniforge3/envs/test/Library/include"  -Ibpcells-cpp -I../tools/highway/include -Ivendor -std=c++17 -DRCPP_EIGEN -DEIGEN_PERMANENTLY_DISABLE_STUPID_WARNINGS -Wno-ignored-attributes -Wno-unknown-pragmas    -O2 -Wall  -march=x86-64 -mtune=generic  -c bitpacking_io.cpp -o bitpacking_io.o
In file included from bitpacking_io.cpp:20:
bpcells-cpp/simd/bp128.h:12:10: fatal error: hwy/base.h: No such file or directory
   12 | #include <hwy/base.h>
      |          ^~~~~~~~~~~~
compilation terminated.
make: *** [E:/Program/miniforge3/envs/test/lib/R/etc/x64/Makeconf:296: bitpacking_io.o] Error 1
ERROR: compilation failed for package 'BPCells'
* removing 'E:/Program/miniforge3/envs/test/Lib/R/library/BPCells'
Warning messages:
1: In utils::untar(tarfile, ...) :
  'tar.exe -xf "C:\Users\PC\AppData\Local\Temp\RtmpQjUp6Q\file5578212e2ff7.tar.gz" -C "C:/Users/PC/AppData/Local/Temp/RtmpQjUp6Q/remotes55785ecf4222"' returned error code 1
2: In i.p(...) :
  installation of package 'C:/Users/PC/AppData/Local/Temp/RtmpQjUp6Q/file55786b624d5d/BPCells_0.2.0.tar.gz' had non-zero exit status

  

Also, I think you need a ; after your Rtools path, like paste0("C:/rtools44/usr/bin;", old_path).
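One way to avoid forgetting the separator is to let R supply it. This is just a sketch: `prepend_path` is a hypothetical helper name, and the paths in the example call are placeholders. `.Platform$path.sep` is ";" on Windows and ":" on Unix-like systems.

```r
# Hypothetical helper: put a directory at the front of a PATH-style string,
# using the platform's own separator by default.
prepend_path <- function(dir, path = Sys.getenv("PATH"),
                         sep = .Platform$path.sep) {
  if (!nzchar(path)) {
    return(dir)
  }
  paste(dir, path, sep = sep)
}

# Example with an explicit Windows-style separator and placeholder paths.
prepend_path("C:/rtools44/usr/bin", "C:/Windows/system32", sep = ";")
# → "C:/rtools44/usr/bin;C:/Windows/system32"
```

The result could then be passed to `Sys.setenv(PATH = ...)` in place of the `paste0()` call.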