Open pshashk opened 7 years ago
Thanks!
I've been thinking about RcppArmadillo, maybe creating some kind of ArmaFloat package. I may need Dirk's help to pull that off, and he's extremely busy, so no promises. But it's something I want to pursue.
I hadn't thought about interfacing with Python, but that's a great idea. Actually, deep learning's push into ever lower precisions was one of my primary motivations for creating the package. I'm not very familiar with how reticulate works, but it should be possible to add some kind of shim. I'll look into it.
I have a project where I also use R's integer matrices to store floats. The actual number crunching is done in C++ with Armadillo. I will try to adapt my code to use the float package, so this could become an example of interfacing float with C++ code.
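The storage trick mentioned above (keeping single-precision values in R's 32-bit integer matrices) can be sketched in plain C++ as a bit-for-bit copy. This is only an illustration, assuming the integers hold the raw IEEE 754 bit patterns of the floats; `pack_floats` and `unpack_floats` are hypothetical helper names, not functions from float, Rcpp, or Armadillo.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Assumption: float is a 32-bit IEEE 754 type.
static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits wide");
static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");

// Hypothetical helper: copy the bit pattern of each float into an int32 slot,
// the way a float could be stashed in an integer buffer.
std::vector<std::int32_t> pack_floats(const std::vector<float>& x) {
    std::vector<std::int32_t> bits(x.size());
    if (!x.empty()) {
        std::memcpy(bits.data(), x.data(), x.size() * sizeof(float));
    }
    return bits;
}

// Hypothetical helper: recover the floats from their stored bit patterns.
std::vector<float> unpack_floats(const std::vector<std::int32_t>& bits) {
    std::vector<float> x(bits.size());
    if (!bits.empty()) {
        std::memcpy(x.data(), bits.data(), bits.size() * sizeof(float));
    }
    return x;
}
```

Using `std::memcpy` rather than a pointer cast keeps the sketch free of aliasing problems in C++. The integers are never meant to be read as numbers; they are only a container that R already knows how to allocate and pass around.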
Cool! Does RcppArmadillo ship with the float part of Armadillo? My guess was they didn't because of the lack of single precision blas/lapack.
Everything works out of the box. RcppArmadillo ships everything from Armadillo, and Armadillo itself optionally uses external BLAS/LAPACK (it contains its own reference implementation, I guess). Here is a proof-of-concept branch of another package: https://github.com/dselivanov/reco/tree/float. It is more than 2x faster than double precision and uses half the RAM.
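The memory half of that claim follows directly from element width: a dense matrix of 4-byte floats needs half the bytes of the same matrix of 8-byte doubles. A minimal sketch of the arithmetic, where `dense_bytes` is a hypothetical helper and the sizes assume the usual 4-byte float and 8-byte double:

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed to hold a dense rows x cols matrix of T,
// ignoring any container or header overhead.
template <typename T>
constexpr std::size_t dense_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}

// Assumption: 4-byte float and 8-byte double.
static_assert(sizeof(float) == 4 && sizeof(double) == 8, "unexpected type sizes");
static_assert(dense_bytes<double>(1000, 1000) == 2 * dense_bytes<float>(1000, 1000),
              "double storage is twice float storage");
```

For a 10,000 x 1,000 matrix this comes to 40,000,000 bytes in single precision versus 80,000,000 in double. The reported speedup is a separate matter that depends on the BLAS in use and is not captured by this arithmetic.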
I think it's because you're using high performance BLAS/LAPACK, so the symbol resolution is taking place from your $(LAPACK_LIBS) and $(BLAS_LIBS) lines in Makevars. When I try to build it with an R version linked with R's reference BLAS (which doesn't include the single precision functions), the build fails with undefined symbol: sgesvx_.
Obviously the best thing to do is link with high performance BLAS implementations that have all of the symbols already. The CRAN acceptable solution is to link with float/libs/float.so if it built the reference blas/lapack instead (the LinkingTo field does not do this, sadly). I'll make this easier to do this weekend.
Hm. Maybe it needs a more sophisticated configure script (which I don't know how to write yet). Won't it work after removing $(LAPACK_LIBS) and $(BLAS_LIBS) from Makevars? In that case I believe Armadillo should use its reference implementation.
I think Armadillo's non-blas/lapack implementation is controlled by c/c++ preprocessor stuff, which would be painful to handle. But I'm exporting LAPACK (plus some other minor things) to a static library, so no autoconf necessary (thankfully). I have it in a local version that I'll push soon when I'm sure it's working, but basically the reco Makevars would look like:
SLAPACK_LIB = `${R_HOME}/bin/Rscript -e "float:::ldflags()"`
PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11
I'll hopefully finish it up tomorrow.
I've created float:::ldflags() which, when used as in the above example Makevars file, will set the caller package to link with float.so from the float package. Originally I was going to use a static library, but a shared library really makes more sense.
Basically, if float BLAS/LAPACK get compiled when building the float package (e.g. for binary CRAN distributions), then those symbols will be there and the linker can resolve the lookup for any package using, for example, Armadillo. Otherwise (in the case where high performance BLAS/LAPACK are used) those symbols will never get built in the first place and the linking may even be unnecessary. The advantages are that it's the same process regardless of what BLAS/LAPACK libraries are used, and there are some extra things in float.so, like R_NaNf and NA_FLOAT (analogues of R_NaN and NA_REAL).
I plan to write this up more carefully in the package vignette soon. I have an example package that uses this here. I've also tested it with reco and it appears to be working in my setup that doesn't have high performance BLAS/LAPACK.
Thank you very much for the detailed investigation. I will try it and report back.
@wrathematics I've finally tried the Makevars you proposed. It works great with a minimal adjustment: -L needs to be added to the beginning of SLAPACK_LIB:
SLAPACK_LIB = "-L"`${R_HOME}/bin/Rscript -e "float:::ldflags()"`
PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11
Tested on OS X and Ubuntu 16.04
I managed to get the compiler to build a new package dll on my Windows system but when it tries to load the new package it crashes with the system error
The program can't start because float.dll is missing from your computer. Try reinstalling the program to fix this problem.
Thoughts?
I thought this could be resolved using -Wl,-rpath but I guess rpath is not supported on Windows.
I had exactly the same problem on Win7.
Sequence: rsparse would not build, so I did a devtools install of float:
devtools::install_github("dselivanov/float"). It built, and the DLLs are present on the file system.
I could not build rsparse with a devtools GitHub install, devtools::install_github("dselivanov/rsparse"),
so I downloaded the zip and tried install_local:
install_local("C:/Users/reefej/R/win-library/3.4/rsparse", force=TRUE)
No good. It choked on many things.
SLAPACK_LIB = `${R_HOME}/bin/Rscript -e "float:::ldflags()"`
PKG_CXXFLAGS = -std=c++11 $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD -O3 -fopenmp -ffast-math -march=native -mavx
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11
It built but could not find float.dll.
library(rsparse)
model = WRMF$new(rank = 8)
After the Makevars updates, I suspect the problem is that the float install and build does not register the DLLs with the R environment, so you have to "show" R and Windows where to find them. Since the DLLs have the same name, I dropped the i386 and x64 versions into separate directories. I do not know the implications at run time, but Windows will load the first one it finds in the search path. A Windows tool can tell you which one gets loaded. The build knew the difference, though. I will check how to register the float DLL directories with the R environment.
I get a compiler error when I try to install rsparse, but it's unrelated to float. The Windows environment that I have (limited) access to is really weird, though, so it could easily be a configuration problem on my end. I don't think there's any way around using a static lib for Windows, and possibly Mac as well. I'm working on that as we speak.
The latest rsparse doesn't link with float, so I think you're trying an older version. You might try the latest version.
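I was able to build an older version (this one: https://github.com/dselivanov/rsparse/tree/f488decc4d3cf3d2735a19ada96ba4559250a8d4) on Linux by linking against a static library generated in the float package. I get an ld error on Windows that I don't understand, though. I've added the changes to a new branch (https://github.com/wrathematics/float/tree/static).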
I could use some windows expertise if you have a minute @snoweye.
@dselivanov @wrathematics Is the change (dropping the link with float) temporary or permanent? I need to decide whether to invest time in it, because the work is neither easy nor trivial, success is not guaranteed (the risk of failure is high), and it could take a very long time.
Well regardless of rsparse, I need to get something reliable that works on windows.
What I have now appears to work on mac (still running some tests), and Linux continues to be problem free. I just pushed an update for windows, but it's still broken. At first I was trying this and this (which you wrote), but couldn't get it to work. Now I basically just copied what I do on *nix and it appears to work, but it has some missing symbol problems. I really don't know how to do this on windows.
For rsparse I pulled a zip from the master branch.
I am not sure why it wants float.
@snoweye this is temporary. Once we figure out a reliable way to link to float and update float on CRAN, I will bring the dependency back.
https://github.com/dselivanov/rsparse/releases/tag/v0.3.3.1 is on CRAN, so it can serve as a showcase of how to link to float (essentially everything is done by float).
Cool!
I'd eventually like to crack dynamic linking on all platforms. This would avoid file size notes for downstream packages, but it would also simplify the use of high performance BLAS in some situations. It shouldn't change anything for a package author if I can ever get it right. It's on the list, anyway.
Very nice package. Thank you for fixing this significant drawback of the R language. In my opinion, these two features would make float matrices even more useful:
A wrapper for the RcppArmadillo Mat<float> class. This would make float matrices suitable for in-place manipulation from the Rcpp side (e.g., SGD matrix decomposition).
The ability to convert a float matrix to its NumPy equivalent in order to communicate with Python libraries via reticulate (e.g., to reduce memory requirements when preprocessing data for the TensorFlow or Keras deep learning frameworks).