wrathematics / float

Single precision (float) matrices for R.

Interfacing with other packages #4

Open pshashk opened 7 years ago

pshashk commented 7 years ago

Very nice package. Thank you for fixing this significant drawback of the R language. In my opinion, these two features would make float matrices even more useful:

  1. Wrapper for the RcppArmadillo Mat<float> class. This would make float matrices suitable for performing in-place manipulations from the Rcpp side (e.g., SGD matrix decomposition).

  2. Being able to convert a float matrix to its NumPy equivalent to communicate with Python libraries via reticulate (e.g., to reduce the memory requirements when preprocessing data for the TensorFlow or Keras deep learning frameworks).

wrathematics commented 7 years ago

Thanks!

I've been thinking about RcppArmadillo, maybe creating some kind of ArmaFloat package. I may need Dirk's help to pull that off, and he's extremely busy, so no promises. But it's something I want to pursue.

I hadn't thought about interfacing with python, but that's a great idea. Actually, deep learning's push into ever lower precisions was one of my primary motivations for creating the package. I'm not very familiar with how reticulate works, but it should be possible to add some kind of shim. I'll look into it.

dselivanov commented 6 years ago

I have a project where I also use R's integer matrices to store floats. The actual number crunching is done in C++ with Armadillo. I will try to adapt my code to use the float package, so this could become an example of interfacing float with C++ code.
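
A minimal sketch of the storage idea mentioned here, assuming the single-precision values live bit-for-bit inside 32-bit integer slots. The helper names (load_float, store_float, sgd_step) are hypothetical and are not part of float, Rcpp, or Armadillo; a real package would operate on the R vector's memory rather than a std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read the float whose bit pattern is stored in a 32-bit integer slot.
// memcpy is used instead of a pointer cast to avoid aliasing problems.
inline float load_float(const std::int32_t& slot) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "expects 32-bit float");
    float value;
    std::memcpy(&value, &slot, sizeof value);
    return value;
}

// Write a float's bit pattern into a 32-bit integer slot.
inline void store_float(std::int32_t& slot, float value) {
    std::memcpy(&slot, &value, sizeof value);
}

// One in-place gradient step, w <- w - lr * grad, on integer-backed storage.
inline void sgd_step(std::vector<std::int32_t>& weights,
                     const std::vector<float>& grad, float lr) {
    assert(weights.size() == grad.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        store_float(weights[i], load_float(weights[i]) - lr * grad[i]);
    }
}
```

The update never allocates a second buffer, which is the point of doing the manipulation in place on the C++ side.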

wrathematics commented 6 years ago

Cool! Does RcppArmadillo ship with the float part of Armadillo? My guess was they didn't because of the lack of single precision blas/lapack.

dselivanov commented 6 years ago

Everything works out of the box. RcppArmadillo ships everything from Armadillo, and Armadillo itself optionally uses external BLAS/LAPACK (I guess it contains its own reference implementation). Here is a proof-of-concept branch of another package: https://github.com/dselivanov/reco/tree/float. It is more than 2x faster than double precision and uses 2x less RAM.
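
The memory half of that claim follows directly from the element sizes. A small sketch, assuming the usual 4-byte float and 8-byte double and counting only the dense element storage (matrix_bytes is a hypothetical helper, not an Armadillo function):

```cpp
#include <cstddef>

// Holds on common platforms; the 2x figure depends on it.
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "sketch assumes IEEE 754 single and double precision");

// Bytes needed for the elements of a dense rows x cols matrix of type T.
template <typename T>
constexpr std::size_t matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}
```

The speed half depends on the BLAS in use and on memory bandwidth, so it is not something this sketch can show.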

wrathematics commented 6 years ago

I think it's because you're using high performance BLAS/LAPACK, so the symbol resolution is taking place from your $(LAPACK_LIBS) and $(BLAS_LIBS) lines in Makevars. When I try to build it with an R version linked with R's reference blas (which doesn't include the single precision functions), the build fails with undefined symbol: sgesvx_.

Obviously the best thing to do is link with high performance BLAS implementations that have all of the symbols already. The CRAN acceptable solution is to link with float/libs/float.so if it built the reference blas/lapack instead (the LinkingTo field does not do this, sadly). I'll make this easier to do this weekend.

dselivanov commented 6 years ago

Hm. Maybe it needs a more sophisticated configure script (which I don't know how to write yet). Won't it work after removing $(LAPACK_LIBS) and $(BLAS_LIBS) from Makevars? In that case I believe Armadillo should use its reference implementation.

wrathematics commented 6 years ago

I think Armadillo's non-blas/lapack implementation is controlled by c/c++ preprocessor stuff, which would be painful to handle. But I'm exporting LAPACK (plus some other minor things) to a static library, so no autoconf necessary (thankfully). I have it in a local version that I'll push soon when I'm sure it's working, but basically the reco Makevars would look like:

SLAPACK_LIB = `${R_HOME}/bin/Rscript -e "float:::ldflags()"`

PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11

I'll hopefully finish it up tomorrow.

wrathematics commented 6 years ago

I've created float:::ldflags() which when used as in the above example Makevars file will set the caller package to link with float.so from the float package. Originally I was going to use a static library, but a shared library really makes more sense.

Basically, if float BLAS/LAPACK get compiled when building the float package (e.g. for binary CRAN distributions), then those symbols will be there and the linker can resolve the lookup for any package using, for example, armadillo. Otherwise (in the case where high performance BLAS/LAPACK are used) those symbols will never get built in the first place and the linking may even be unnecessary. The advantages are that it's the same process regardless of what BLAS/LAPACK libraries are used, and there are some extra things in float.so, like R_NaNf and NA_FLOAT (analogues of R_NaN and NA_REAL).
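
To illustrate what an NA_REAL analogue for single precision can look like, here is a sketch that tags a quiet NaN with a payload. R stores 1954 in the low word of NA_REAL; reusing that number in a 32-bit float is an assumption made for illustration, and the names below (make_na_float, is_na_float) are hypothetical. This is not necessarily how float defines NA_FLOAT.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical payload, borrowed from the low word of R's NA_REAL.
constexpr std::uint32_t kNaPayload = 1954u;
constexpr std::uint32_t kQuietNaNBits = 0x7FC00000u;  // exponent all ones + quiet bit
constexpr std::uint32_t kPayloadMask = 0x003FFFFFu;   // mantissa bits below the quiet bit

// Copy out the raw bit pattern of a float.
inline std::uint32_t float_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Build a NaN that carries the NA payload.
inline float make_na_float() {
    const std::uint32_t bits = kQuietNaNBits | kNaPayload;
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// NA is a NaN with the expected payload; an ordinary NaN does not match.
inline bool is_na_float(float x) {
    return std::isnan(x) && (float_bits(x) & kPayloadMask) == kNaPayload;
}
```

Payloads are not guaranteed to survive arithmetic, so a check like this is only reliable on values that were stored, not computed.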

I plan to write this up more carefully in the package vignette soon. I have an example package that uses this here. I've also tested it with reco and it appears to be working in my setup that doesn't have high performance BLAS/LAPACK.

dselivanov commented 6 years ago

Thank you very much for the detailed investigation. I will try it and report back.

dselivanov commented 6 years ago

@wrathematics I've finally tried the Makevars you proposed. It works great with one minimal adjustment: -L needs to be added to the beginning of SLAPACK_LIB:

SLAPACK_LIB = "-L"`${R_HOME}/bin/Rscript -e "float:::ldflags()"`

PKG_CXXFLAGS = $(SHLIB_OPENMP_CFLAGS) -DARMA_64BIT_WORD
PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) $(SLAPACK_LIB)
CXX_STD = CXX11

Tested on OS X and Ubuntu 16.04

cdeterman commented 6 years ago

I managed to get the compiler to build a new package dll on my Windows system but when it tries to load the new package it crashes with the system error

The program can't start because float.dll is missing from your computer. Try reinstalling the program to fix this problem.

Thoughts?

cdeterman commented 6 years ago

I thought this could be resolved using -Wl,-rpath, but I guess rpath is not supported on Windows.

jamespr615 commented 6 years ago

I had exactly the same problem on Windows 7.

Sequence: rsparse would not build, so I did a devtools install of float:

devtools::install_github("dselivanov/float")

It built, and the .dll files are present on the file system.

I could not build rsparse with a devtools GitHub install:

devtools::install_github("dselivanov/rsparse")

So I downloaded the zip and tried install_local instead:

install_local("C:/Users/reefej/R/win-library/3.4/rsparse", force=TRUE)

No good. It choked on many things.

It built but could not find float.dll.

library(rsparse)
model = WRMF$new(rank = 8)

After the Makevars updates, I suspect the problem is that the float install and build does not register the .dlls with the R environment, so you have to 'show' R and Windows where to find them. Since the .dlls have the same name, I dropped the i386 and x64 versions into separate dirs. I do not know the implications at run time, but Windows will find the first one in the search path. A Windows tool can tell you which one gets loaded. The build knew the difference, though. I will check how to 'register' the float dll dirs with the R environment.

wrathematics commented 6 years ago

I get a compiler error when I try to install rsparse, but it's unrelated to float (but the windows environment that I have (limited) access to is really weird, so it could easily be a configuration problem on my end). I don't think there's any way around using a static lib for windows and possibly mac as well. I'm working on that as we speak.

wrathematics commented 6 years ago

The latest rsparse doesn't link with float, so I think you're trying an older version. You might try the latest version.

I was able to build an older version (this one) on linux by linking against a static library generated in the float package. I get an ld error on Windows that I don't understand though. I've added the changes to a new branch.

I could use some windows expertise if you have a minute @snoweye.

snoweye commented 6 years ago

@dselivanov @wrathematics Is the change (not linking with float) temporary or permanent? I need to decide whether to invest time in it, because the work is neither easy nor trivial, success is not guaranteed (high risk of failure), and it could take a very long time.

wrathematics commented 6 years ago

Well regardless of rsparse, I need to get something reliable that works on windows.

What I have now appears to work on mac (still running some tests), and Linux continues to be problem free. I just pushed an update for windows, but it's still broken. At first I was trying this and this (which you wrote), but couldn't get it to work. Now I basically just copied what I do on *nix and it appears to work, but it has some missing symbol problems. I really don't know how to do this on windows.

wrathematics commented 6 years ago

You could also try testing with this (needs this) or kazaam. But I can't get openmp or MPI to work in my busted windows vm :[

jamespr615 commented 6 years ago

For rsparse I pulled a zip from the master branch.

I am not sure why it wants float.


On Tuesday, July 17, 2018 5:22 PM, Drew Schmidt wrote:

> The latest rsparse doesn't link with float, so I think you're trying an older version. You might try the latest version.
>
> I was able to build an older version (this one: https://github.com/dselivanov/rsparse/tree/f488decc4d3cf3d2735a19ada96ba4559250a8d4) on linux by linking against a static library generated in the float package. I get an ld error on Windows that I don't understand though. I've added the changes to a new branch (https://github.com/wrathematics/float/tree/static).
>
> I could use some windows expertise if you have a minute @snoweye.

dselivanov commented 6 years ago

@snoweye this is temporary. Once we figure out a reliable way to link to float and update float on CRAN, I will bring the dependency back.

snoweye commented 6 years ago

@wrathematics @dselivanov Please check my changes to float, kazaam, and rsparse. They install on my native Windows. Thanks.

dselivanov commented 5 years ago

https://github.com/dselivanov/rsparse/releases/tag/v0.3.3.1 is on CRAN, so it can serve as a showcase of how to link to float (essentially everything is done by float).

wrathematics commented 5 years ago

Cool!

I'd eventually like to crack dynamic linking on all platforms. This would avoid file size notes for downstream packages, but it would also simplify the use of high performance BLAS in some situations. It shouldn't change anything for a package author if I can ever get it right. It's on the list, anyway.