Closed stevengj closed 10 years ago
Setting maxiter=10000
in eigs
fixes the problem, so I guess it is just converging slowly. But it would be good to have a clearer error message, at least.
(It would also probably converge faster if there were a way to exploit the fact that my matrix is symmetric positive definite. Related to #4476.)
We should certainly give a better message. That is easy to fix.
I don't think there is a way to exploit spd property in arpack - only the fact that it is symmetric.
If you are doing inverse iteration then you can use Cholesky factorization for SPD matrices.
A basic problem here is that eigs
is over-typed. We should do duck-typing: support passing any object (not just an AbstractArray
) that supports the necessary operations (size
, \
, *
?). That way we could pass factorization objects or other abstract linear operations.
It is tempting to think of AbstractMatrix
as the objects that support \
and *
, cf. Jutho's post at the mailing list. At least as long as it is not clearly defined what AbstractMatrix
is.
By the way. Not directly related to this issue but as a consequence of the example of this issue: eigs seems pretty slow. With the laplacian example I get
@time eigs(Ai,nev=1,which="SM")
elapsed time: 457.095567119 seconds (18300352424 bytes allocated)
and with a pure Julia implementation
@time Arnoldi.eigs(Ai, 1, 2000, eps())
elapsed time: 10.058114537 seconds (6387928 bytes allocated)
1-element Array{Float64,1}:
5.76436
I wonder what is happening. Can you post your implementation? What happens in matlab or octave? @vtjnash fixed some Fortran calling issues which could potentially explain this.
@stevengj we should be able to exploit the spd property for free due to the use of . The polyalgorithm may need some tweaking.
The memory usage is pretty insane too... 17GiB!
The implementation is here https://gist.github.com/andreasnoackjensen/6963157. I only did it to learn how the Lanczos method works so it is very simple. I have defined A_mul_B!
for sparse matrices to be allocation free like gemv
which explains the big difference in allocated memory, but not the whole time difference. I tweaked the arnoldi code to use my A_mul_B!
instead of *
and now I get
julia> @time eigs(Ai,nev=1,which="SM")[1]
elapsed time: 382.070997098 seconds (85137664 bytes allocated)
1-element Array{Float64,1}:
5.76436
so much of the reallocation is avoided but the timing is still not good. I think I have made Ai
to be identical in MATLAB and there I get
>> tic;eigs(Ai,1,'sm');toc
Elapsed time is 3.371547 seconds.
so the difference is huge.
579a005c995df57cd92f0dc90e6bda2b25158de6 gives us a better error message.
@JeffBezanson I have a suspicion that type inference is not working as well as it ought to here, which might be slowing down eigs
, by taking a lot of time in the sparse matvec.
Can you give me an example call that you think runs too slowly?
@stevengj's example above is one. There eigs
takes 457 seconds and a Julia implementation of the same calculation takes 5.76 second.
To narrow it down a bit, should I focus on the product of Ai
and a random dense vector?
Ok, there aren't any type inference problems in sparse matvec. I tried hoisting the accesses to A.colptr
etc. but that didn't do very much. (I'm looking at linalg/sparse.jl:14)
We do need some codegen improvements around 1-d arrays, but I don't expect any more than a factor of 2 from that (probably not even that much).
It would be good to check the number of iterations that eigs is taking to converge versus the version in Matlab or in Julia by @andreasnoackjensen. If the increase in time is proportional to that, then the problem is probably not codegen, it is some screwup of the Arnoldi algorithm.
I wrote up this example in matlab, and it seems that matlab's eigs finishes in a handful of iterations (<10). There doesn't seem to be a way to get the number of iterations performed by eigs in matlab.
% constructs a sparse discrete lapacian (or –laplacian, actually) inside a cylinder
% Typically, call genmat with n = 400
function Ai = genmat(n)
% Define mesh for the cylinder
% Code adapted from lecture notes
Lx = 1;
Ly = 1;
Nx = n;
Ny = n;
x = linspace(-Lx,Lx,Nx+2);
x = x(2:end-1)'; % a column vector
y = linspace(-Ly,Ly,Ny+2);
y = y([2:end-1]'); % a row vector
r = sqrt(bsxfun(@plus, x.^2, y.^2)); % use broadcasting (.+) to make Nx x Ny matrix of radii
i = find(r < 1); % and get indices of points inside the cylinder
A = Laplacian(Nx,Ny,2*Lx,2*Ly);
Ai = A(i,i); % to make a submatrix for the Laplacian inside the cylinder
end
% make the discrete -Laplacian in 2d, with Dirichlet boundaries
function x = Laplacian(Nx, Ny, Lx, Ly)
dx = Lx / (Nx+1);
dy = Ly / (Ny+1);
Dx = sdiff1(Nx) / dx;
Dy = sdiff1(Ny) / dy;
Ax = Dx' * Dx;
Ay = Dy' * Dy;
x = kron(speye(Ny), Ax) + kron(Ay, speye(Nx));
end
% construct the (M+1)xM matrix D, not including the 1/dx factor
function x = diff1(M)
x = [ [1.0 zeros(1,M-1)]; diag(ones(M-1,1),1) - eye(M) ];
end
% sparse version (the lazy way)
function x = sdiff1(M)
x = sparse(diff1(M));
end
@andreasnoackjensen Your code does not always seem to work with @stevengj 's problem, but when it does work, it seems to come back in 4 iterations for Nx=Ny=100 and 11 iterations for Nx=Ny=400. So, clearly something's up with our ARPACK implementation.
can you tell matlab to run just one iteration at a time?
In matlab, you can pass a function (that takes a vector and returns A*vector) instead of a matrix. Then just include a print statement in your function.
While @alanedelman and I were exploring this, we took a look at the Golub-Kahan-Lanczos algorithm for the svd, which is about as simple as @andreasnoackjensen 's Arnoldi implementation. It may be best for us to have our own implementations of eigs
and svds
- even if we do the implicitly restarting versions later.
This is @alanedelman 's implementation of svds
:
(m,n)=size(A)
αs=βs=zeros(0)
v=randn(n);v/=norm(v)
u=A*v
for k=1:100
α=norm(u);αs=[αs; α];u/=α
v=A'*u-α*v
β=norm(v);βs=[βs; β];v/=β
u=A*v-β*u
end
println(round(svdvals(Bidiagonal(αs,βs[1:end-1],true))[1:5]',3))
Some notes on the above:
This is the basic, no bells and whistles Golub-Kahan-Lanczos. We certainly should remove the magic numbers 100 and 5, and incorporate implicit restarting and perhaps shift and invert strategies .
The algorithm, is mathematically the same as Householder reduction when the starting vector is eye(n,1) or something, but is suitable for matrices that are not dense.
I think the key point here is that ARPACK has served well for many years, but if we have a Julia implementation, we can probably look at these methods less as a block box not to be touched, and more like an organic algorithm to be improved upon as we go.
The same would apply for Lanczos tridiagonalization of course.
Perhaps we should start an Arnoldi.jl package, which can move into base when it becomes ready. Seems like we have enough momentum.
Maybe should be called Lanczos.jl :-) or ArnoldiLanczos.jl
On Mon, Oct 28, 2013 at 7:48 AM, Viral B. Shah notifications@github.comwrote:
Perhaps we should start an Arnoldi.jl package, which can move into base when it becomes ready. Seems like we have enough momentum.
— Reply to this email directly or view it on GitHubhttps://github.com/JuliaLang/julia/issues/4474#issuecomment-27205093 .
I would like to contribute to such a package.
+1
I just created this and you guys should all have commit access to it.
I think IterativeSolvers.jl
would be a much more descriptive and inclusive term. Arnoldi and Lanczos are but two common examples of a much broader family.
Unless anyone has better references, I would recommend van der Vorst's Iterative Krylov Methods for Large Linear Systems, and the last couple of chapters in Trefethen and Bau.
That does seem like a much better generic name.
Done. https://github.com/JuliaLang/IterativeSolvers.jl
We could start off with @andreasnoackjensen 's gist above, @alanedelman 's code for GKL, and other iterative solvers that people started putting together. Perhaps future discussion can be had over in the issues for IterativeSolvers.
Well, the Templates books (Linear Systems and Eigenproblems) are also decent references. Although they don't include Sleijpen's BiCGSTAB(L) algorithm, which is one of the better ones for large sparse nonsymmetric systems (don't waste your time with any of the other BiCG variants, this subsumes them).
Apparently MATLAB's eigs
uses shift-and-invert if 'sm'
is specified while we also have to have sigma!=0
to use it. Thus the comparison is not entirely fair (for us). With 35690fd and shift-and-invert I get
julia> @time (d, v, nconv, niter, nmult, resid) = eigs(Ai,nev=1,which="LM",sigma=0.1)
elapsed time: 4.506958406 seconds (192699384 bytes allocated)
<snip output of eigs>
julia> d
1-element Array{Float64,1}:
5.76436
julia> niter
1
julia> nmult
20
which is much better. Without the patch the matrix is always converted to a dense matrix when calling lufact
(in shift-and-invert mode), which is expensive in this example. Without shift-and-invert, I get
@time (d, v, nconv, niter, nmult, resid) = eigs(Ai,nev=1,which="SM",sigma=0)
elapsed time: 330.53247192 seconds (17535572788 bytes allocated)
<snip output of eigs>
julia> d
1-element Array{Float64,1}:
5.76436
julia> niter
863
julia> nmult
8640
It seems like we should use shift-and-invert for sigma != 0
and inverse iteration (i.e. shift-and-invert without the shift) for which="sm"
.
@acroy Thanks for tracking this down. I kept thinking that we have some bug in our ARPACK interface, but this makes sense now.
Can someone submit a PR? Unfortunately I am quite taken up for a few weeks and don't want to do this in haste.
Do we imagine merging the PR for this for 0.3?
Yes. There is already an open PR, and should be possible to merge for 0.3
Closed by #6053, with continuing work in IterativeSolvers.jl.
The code below, which constructs a sparse discrete lapacian (or –laplacian, actually) inside a cylinder and evaluates the smallest eigenvalue using
eigs
, dies with anARPACKException
for me:Changing to
Nx=Ny=100
works fine. However,Nx=Ny=400
is "only" a 100000x100000 (positive-definite real-symmetric) sparse matrix from a 2d grid (hence the sparse-direct solver should be efficient), and similar code works fine in Matlab.What does this error mean? cc: @ViralBShah
[ViralBShah: The title was ARPACKException needs better error message, which is fixed.]