amiraa127 / Dense_HODLR

HODLR Package
Other
10 stars 6 forks source link

Porting Dense_HODLR to R #2

Open stnava opened 9 years ago

stnava commented 9 years ago

Continuation from https://github.com/amiraa127/Dense_HODLR/issues/1

@amiraa127, @sivaramambikasaran - i am beginning this port and would like your help in defining a basic and "R familiar" API for just one key feature of Dense_HODLR. For example, a fast solution along the lines of:

https://github.com/amiraa127/Dense_HODLR/blob/e3362a8c897920fdd114ec5ff5770d3108ba28f3/src/helperFunctions.cpp

So specifically, implement an R interface to: solverSoln = denseHODLR.recLU_Solve(RHS);

Am also interested in various aspects of: https://github.com/amiraa127/Dense_HODLR/blob/e3362a8c897920fdd114ec5ff5770d3108ba28f3/include/lowRank.hpp

I am open to other suggestions, of course. Ideally, I'd like a satisfying example that shows the speed-up and or unique value of HODLR matrix methods relative to extant R features.

amiraa127 commented 9 years ago

In order to implement an R interface to solverSoln = denseHODLR.recLU_Solve(RHS), you first need to have a wrapper that converts an R matrix into the Eigen::MatrixXd data structure. Once you have that implemented, working with the package is fairly straight forward.

stnava commented 9 years ago

Ok - I have this interface going, in a beta form. What would be the key features of Dense_HODLR on which to focus? I.e. features that have the widest reach / interest for the largest number of users?

amiraa127 commented 9 years ago

Nice! Right now, the most useful functions is recLU_Solve which solves the linear system. You can also use the variety of the low-rank approximation schemes independently. I'm currently working on implementing determinant and matrix-vector product functionality which will be added soon.

Amir

amiraa127 commented 9 years ago

We are also working on least squares and least norm but those might take a bit longer to add.

stnava commented 9 years ago

least squares / least norm would be fantastic ... but i understand about the time involved. i will be patient.

for the extant methods:

1) what are the restrictions on recLU_Solve? is there a good / communicable example that will highlight the power of the method? i'd like to do the data preparation and documentation in R, then call recLU_Solve at the critical step .... i can wrap additional function as needed.

2) what other solvers are of value? iterative_solve?

3) for the low-rank approximation scheme, what might be the best function on which to base a benchmark against a native R implementation? again, i'd like to do as much data prep in R but am happy to wrap functions as needed.

again - thanks for your help!

brian

On Sun, Nov 23, 2014 at 2:44 PM, AmirHossein Aminfar < notifications@github.com> wrote:

We are also working on least squares and least norm but those might take a bit longer to add.

— Reply to this email directly or view it on GitHub https://github.com/amiraa127/Dense_HODLR/issues/2#issuecomment-64131352.

amiraa127 commented 9 years ago

In terms of the power of this method, there are several examples that we have benchmarked. For example you can look at http://arxiv.org/abs/1403.5337. It all comes down to how well you can approximate your matrix as an HODLR matrix. If you are able to represent your problem in terms of an HODLR matrix accurately, this method becomes extremely powerful. One other important factor you need to consider is the low-rank approximation scheme you are using. Partial pivoting ACA is the default low-rank approximation method and is the fastest but is not very accurate in some cases. Other methods such as SVD are available but are not recommended due to high cost.

One other capability of this package is to use an HODLR solver as a preconditioner to an iterative solver. Currently, I have this implemented for a simple fixed point iterative method (see http://arxiv.org/abs/1403.5337). I should mention that I will soon expand this capability to more sophisticated methods like GMRES. stay tuned :)

How are you constructing the HODLR matrix? Do you have the full dense matrix? Or is your data generated by a kernel function?

stnava commented 9 years ago

thanks - i havent yet gotten to read http://arxiv.org/pdf/1403.5337v2.pdf , but will do so soon. mostly i've looked at : "Fast Direct Methods for Gaussian Processes and the Analysis of NASA Kepler Mission Data"

at this point, the full dense matrix is the most interesting case, b/c it is general. whether hodlr structure suits my data is an open question. i suspect it will but it might need some massaging ( and i will need to understand hodlr matrices a little more fully in order to figure this out and write good documentation ).

so, how about this ... i will work on some documentation pdf and build some wrappers for in progress hodlr functions which i will hope to fill in with "real" hodlr functionality later and with motivating reproducible benchmark computations. what i hope we can access is:

and then put some statistical methods around these that show their use in data analysis .... this is a longer term view for which the core work is essential.

anyway, this leaves me plenty to do for now - keep me posted.

brian

On Sun, Nov 23, 2014 at 11:06 PM, AmirHossein Aminfar notifications@github.com wrote:

In terms of the power of this method, there are several examples that we have benchmarked. For example you can look at http://arxiv.org/abs/1403.5337. It all comes down to how well you can approximate your matrix as an HODLR matrix. If you are able to represent your problem in terms of an HODLR matrix accurately, this method becomes extremely powerful. One other important factor you need to consider is the low-rank approximation scheme you are using. Partial pivoting ACA is the default low-rank approximation method and is the fastest but is not very accurate in some cases. Other methods such as SVD are available but are not recommended due to high cost.

One other capability of this package is to use an HODLR solver as a preconditioner to an iterative solver. Currently, I have this implemented for a simple single point iterative method (see http://arxiv.org/abs/1403.5337). I should mention that I will soon expand this capability to more sophisticated methods like GMRES. stay tuned :)

How are you constructing the HODLR matrix? Do you have the full dense matrix? Or is your data generated by a kernel function?

— Reply to this email directly or view it on GitHub.

amiraa127 commented 9 years ago

Just wanted to let you know that I've added the fast determinant estimation as per "Fast Direct Methods for Gaussian Processes and the Analysis of NASA Kepler Mission Data". Also there is an additional solve method called recSM_Solve which uses the Sherman Morrison formula.

Enjoy :)

Amir

sivaramambikasaran commented 9 years ago

Thanks Amir. Could you add this (http://arxiv.org/pdf/1403.6015.pdf) as a reference for the determinant?

sivaramambikasaran commented 9 years ago

@stnava Brian, could you let me know what sort of covariance matrices you are dealing with? Are they the usual ones from Gaussian process? (In which case, these should have the HODLR property)

amiraa127 commented 9 years ago

Sure, Can you send me the BibteX entry for this paper and your p-HSS paper?

Thanks Regards Amir

On Nov 24, 2014, at 1:06 PM, Sivaram Ambikasaran notifications@github.com wrote:

Thanks Amir. Could you add this (http://arxiv.org/pdf/1403.6015.pdf) as a reference for the determinant?

— Reply to this email directly or view it on GitHub.

sivaramambikasaran commented 9 years ago

@amiraa127 For the solver for points on higher dimension, how do you sort the points? Is the sorting routine available to the user? I used KDTree, which if you need you can take it from here (https://github.com/sivaramambikasaran/HODLR/blob/master/examples/KDTree.cpp)

sivaramambikasaran commented 9 years ago

Here they are:

Determinat: @article{ambikasaran2014fastdet, title={Fast Direct Methods for {G}aussian Processes and the Analysis of {NASA} {K}epler Mission Data}, author={Ambikasaran, Sivaram and Foreman-Mackey, Daniel and Greengard, Leslie F. and Hogg, David W. and O'Neil, Michael}, journal={arXiv preprint arXiv:1403.6015}, year={2014} }

pHSS: @article{ambikasaran2013fast, title={An $\mathcal{O}({N} \log {N})$ Fast Direct Solver for Partial Hierarchically Semi-Separable Matrices}, author={Ambikasaran, Sivaram and Darve, Eric F.}, journal={Journal of Scientific Computing}, volume={57}, number={3}, pages={477--501}, year={2013}, publisher={Springer} }

Symmetric factorization: @article{ambikasaran2014fastsym, title={Fast symmetric factorization of hierarchical matrices with applications}, author={Ambikasaran, Sivaram and O'Neil, Michael}, journal={arXiv preprint arXiv:1405.0223}, year={2014} }

amiraa127 commented 9 years ago

done

stnava commented 9 years ago

minor compile issue in Dense_HODLR:

CMake Error at src/CMakeLists.txt:19 (add_library):

Cannot find source file:

perturbI.cpp

brian

On Mon, Nov 24, 2014 at 4:22 PM, AmirHossein Aminfar < notifications@github.com> wrote:

done

— Reply to this email directly or view it on GitHub https://github.com/amiraa127/Dense_HODLR/issues/2#issuecomment-64266788.

amiraa127 commented 9 years ago

Fixed. I forgot to add perturbI.cpp into the repo. The issue is now fixed. Let me know of any other issues.

Amir

stnava commented 9 years ago

thanks! got it. now lots of doc/wrapping work - will be slow to get to this due to upcoming holidays etc.

@sivaramambikasaran - linear and squared exp are most common so i am hopeful. issue is that the dimension of the points can be relatively high - e.g. 10^1 - 10^3 .... i'd like to be able to use a bit of prior knowledge to guide the decomposition (if possible) but that will be later.

brian

On Mon, Nov 24, 2014 at 4:38 PM, AmirHossein Aminfar notifications@github.com wrote:

Fixed. I forgot to add perturbI.cpp into the repo. The issue is now fixed. Let me know of any other issues.

Amir

— Reply to this email directly or view it on GitHub.

stnava commented 9 years ago

COMP: perturbI.cpp:43:1: warning: control reaches end of non-void function [-Wreturn-type] }

amiraa127 commented 9 years ago

This will not have any impact on the solve. I will fixed this later on.

Amir

amiraa127 commented 9 years ago

Fixed