JuliaLinearAlgebra / IterativeSolvers.jl

Iterative algorithms for solving linear systems, eigensystems, and singular value problems
MIT License
408 stars 106 forks source link

Stack overflow in cg #11

Closed lruthotto closed 10 years ago

lruthotto commented 10 years ago

Just tried out the test script for CG. It returns a stack overflow.

ERROR: stack overflow

 in KrylovSubspace at /Users/larsruthotto/Dropbox/Julia/Modules/IterativeSolvers/src/krylov.jl:25 (repeats 80000 times)
while loading /Users/larsruthotto/Dropbox/Julia/Modules/IterativeSolvers/test/cg.jl, in expression starting on line 8

I guess this is associated to the KrylovSubspace type. Since we do not understand types well enough to see benefits, me and @ehaber99 created a new package called SimpleIterativeSolvers (https://github.com/lruthotto/SimpleIterativeSolvers.jl) in which we so far managed to avoid creating our own types.

ViralBShah commented 10 years ago

Simple way to reproduce:

julia> cg(eye(10), ones(10))
ERROR: stack overflow
 in KrylovSubspace at /Users/viral/repos/IterativeSolvers.jl/src/krylov.jl:25 (repeats 80000 times)
Keno commented 10 years ago

Hi @lruthotto and @ehaber99,

sorry to hear you're having trouble. As you know this package is just getting started and there's obviously some pains associated with that. In particular we're missing travis integration and it doesn't seem like we've been doing a good job keeping the tests up to data. Sorry about that! Nevertheless I would encourage you to help with the development of this package rather than creating your own that provides equivalent functionality. The more smart people we can get on the problem the better! I would be more than happy to clear up any confusion you may have and I'm sure @jiahao would be too.

ViralBShah commented 10 years ago

I just opened an issue for travis at the same time. I would love to make this package a bit more stable so that we can get all contributions in one place.

Keno commented 10 years ago

Reopening this issue as there's still problems with the constructor definition. @jiahao could you take a look at what was intended?

jiahao commented 10 years ago

Yeah I noticed the problem but hadn't pushed the change yet. There's definitely something wrong with the constructor.

ViralBShah commented 10 years ago

I think it should be:

KrylovSubspace{T}(A, n::Int, order::Int, v::Vector{Vector{T}}=Vector{T}[])=
    KrylovSubspace{T}(A, n, order, v)
ViralBShah commented 10 years ago

@jiahao How do I run the tests?

Virals-MacBook-Pro 10:41:36 {master} ~/repos/IterativeSolvers.jl/src$ ~/julia/julia ../test/test.jl
ERROR: ev_power not defined
 in anonymous at no file:16
 in include at boot.jl:238
 in include_from_node1 at loading.jl:114
 in process_options at client.jl:311
 in _start at client.jl:399
while loading /Users/viral/repos/IterativeSolvers.jl/test/test.jl, in expression starting on line 6
Keno commented 10 years ago

You'll want this diff: https://gist.github.com/loladiro/7905395

jiahao commented 10 years ago

Man you guys are suddenly so interested in this package. I wonder why ;)

jiahao commented 10 years ago

I think I've now peeled off all my local commits that I want to push upstream ATM. Oddly enough, they seem to address a few of these new issues.

ViralBShah commented 10 years ago

Thanks!

ViralBShah commented 10 years ago

@jiahao Could you explain the motivation behind the KrylovSubspace type, which was actually part of the original issue. Perhaps we could capture this in the README as well.

lruthotto commented 10 years ago

Wow I didn't expect such a big resonance. To say the most important thing first: Of course, we don't want to have two packages with identical functionality in the end. We really enjoy playing with Julia most importantly because of the active community and we will be happy to help where we can (maybe this thread was a rough but good start)!!!

I agree with @loladiro : The more people help the better the package will be. However, I think that this is an argument for code that is readable and easy to understand for people that are not too familiar with Julia. Therefore, I think it would be helpful to be very conservative with using types unless it is shown that they increase performance. To test this, was a big motivation to create our small project and it seems like the competition already helped.

I will try out the new fixes tomorrow and submit a few test cases that can be run to check the status of the package.

JasonPries commented 10 years ago

I would have to agree with @lruthotto and @ehaber99 that the greek letters should be avoided. It seems likely only to cause trouble somewhere down the road.

jiahao commented 10 years ago

Maybe it was just unfortunate timing all round, but I have been tinkering with the contributed cg and gmres routines since they were submitted last week. It was necessary to provide a unified naming scheme and interface where possible across solvers. Unfortunately I broke something while refactoring the code and hadn't gotten round to fixing it up time since I'm on the road.

I just finished up some sketch of the documentation in a commit yesterday where I explained my rationale for introducing KrylovSubspace and the like. KrylovSubspaces expose a common logical structure to a large class of iterative solvers and allow for code reuse of common features like abstracting out matrix-vector products and generically useable orthogonalization routines. If properly implemented (and it's quite possible that it isn't right now), this shouldn't compromise on speed at all. At the same time I can understand why it is confusing, since almost all the references out there make absolutely no reference to the underlying mathematical structure and it's all hidden behind a soup of symbols.

As for the use of Greek letters in the notation, there is absolutely no performance hit or syntactic issue with using them in Julia, and I would argue that if it helps in making the comparison with published code listings, then it's better to have them. Of course, there is absolutely no consensus on notation across references, and there could be fruitful discussion on what common notation can be adopted across different solvers.

jiahao commented 10 years ago

Also, I've been rewriting solvers to return ConvegenceHistory types to encapsulate information about the iteration history in the solver routines in a common format. It's bad practice to print output in a computational routine, but at the same time it's often useful to have such information to understand if the computation went astray. It's far preferable to send this information back to the calling routine where user functions can decide what to do with it. If you want to print it, that's fine, but it's not so great to force these solvers to always print stuff. We can continue the discussion in #6 over there if the list of returned information is insufficient.

JasonPries commented 10 years ago

I take your point, but their use raises barriers to entry and discourages potential contributors. For example, "that code has greek letters, I'm not going to touch it because (I'm likely to mess something up, it's a big hassle, et cetera)." I would argue it is better to have a wider contributor base than a marginal increase in comprehension.

jiahao commented 10 years ago

I would argue it is better to have a wider contributor base than a marginal increase in comprehension.

I really, really don't understand why we can't have both. How hard it is to accept that Julia allows code to be written in a way that makes it easy to compare with published code listings? It's really not merely a "marginal" benefit when the notation makes it that much easier to spot typos.

StefanKarpinski commented 10 years ago

I take your point, but their use raises barriers to entry and discourages potential contributors. For example, "that code has greek letters, I'm not going to touch it because (I'm likely to mess something up, it's a big hassle, et cetera)." I would argue it is better to have a wider contributor base than a marginal increase in comprehension.

I have to say that I don't think that most people who look at code like this will be dissuaded from tinkering with it because of the greek letters in the code – it's all the hard math that's intimidating, and honestly, it kind of should be a bit. But greek letters should be least intimidating to someone with the math background to mess with this, so I'm not sure there's really a problem with using them.

I was a bit hesitant about using Unicode for things like this, but I've really warmed to it. The most common thing to do is to compare the code to the papers it came from and using identical names makes that so easy.

lruthotto commented 10 years ago

Thanks @jiahao for the documentation and explanation about the types. As far as the output type is concerned I am with you that typing makes sense, although different solvers will in general have different outputs (like CLGS where intermediate iterates may be useful to store).

I understand the KrylovSubspace type a bit better now, but I must admit I am still not convinced. First, as already mentioned it adds complexity to the code (and hides the most interesting/important parts such as matrix vector products and orthogonalization) and thus is a barrier - at least for myself - to touch it and fix a bug. Second, it seems to be hard work implementing this type properly at least much harder than it usually is to implement a solver that does the job. Third, I don't see big benefits for the code structure. Finally as far as I understood it does not increase performance (I'll have to test this though).

Concerning the Greek letters, this discussion seems to become quite religious and there might be better usages of our time (for instance creating fast iterative solvers). Since I don't know how to make them on my keyboard and I don't want to worry about file formats, I don't see me using them excessively.

ghost commented 10 years ago

I was away yesterday and missed some of the fun. Let me add a two comments

  1. Types - there is a risk with any new and beautiful language (like Julia) to fall in love with the language and to do things not because they are needed but because they "look good". I believe that the Krylov type is such a case. There is no reason that I can see in the Krylov type. Iterative methods are simple and easy to program. The fact that using the types we managed to get cg to not work (the base is 7 lines of code?) just shows how wrong is using types here. Code simplicity and making people understand it is much more important than falling in love with the language.
  2. Greek letters - I really do not care. I do not know how to reproduce them on my mac but if this makes anyone feel good then go ahead and change alpha into \alpha.
StefanKarpinski commented 10 years ago

The Mac has great support for this. There's usually a "special characters" menu item under Edit that will pop up a dialog and also show you an application-specific shortcut to get the same (I wish it were universal but haven't figured that out yet). You can also switch input keyboards pretty easily, including to a Greek one.

Keno commented 10 years ago

There is an important point in here, which is that designing the right abstractions in incredibly hard - indeed I'd like to believe that designing abstractions is the single most difficult task in programming. The point is though that Julia, unlike many other technical computing languages allows you to define these abstractions without much hassle and without any performance penalty. Introducing types is really never meant for performance improvement as you can easily just do without. The crux lies in the abstraction and the way you think about the algorithm. If the Krylov subspace provides a useful abstraction over the differences between solvers and allows the different solvers to be written in a more uniform style, I'm all for it. That being said, it's important not to take this too far. Defining a type just to encapsulate say the return type of a single function is unnecessary overkill.

Thank you for the nice words regarding the Julia community. I think we're very lucky to be in an environment like this and I think we can get great things done this way.

Regarding the greek letter debate, I don't really want to get into it. I don't see it as much of a problem because pretty much all modern editors have appropriate editor modes.

jiahao commented 10 years ago

The fact that using the types we managed to get cg to not work (the base is 7 lines of code?) just shows how wrong is using types here.

Yes, it was my fault that I mistyped something that caused cg to break. But to claim that using types is wrong because of that is a logical fallacy. A typo can happen anywhere; I'll simply have to be more careful going forward now that people actually seem genuinely interested in using the package.

adds complexity to the code (and hides the most interesting/important parts such as matrix vector products and orthogonalization) and thus is a barrier - at least for myself - to touch it and fix a bug.

Conjugate gradients is by far the simplest algorithm in this class of Krylov methods. That an implementation can be put together without the need for special bells and whistles is testament to the power of the method. I genuinely think the abstraction has value going forward for more elaborate algorithms. Encapsulating matrix-vector products and orthogonalization enables reuse: every algorithm in this class is going have to do the same things like check if A is a matrix or a vector, and dispatch a function call or matvec product appropriately. Likewise for orthogonalization.

Second, it seems to be hard work implementing this type properly at least much harder than it usually is to implement a solver that does the job.

As a computational scientist, I understand the desire to simply get things done. However, library authors simply have to be held to higher standards; the code should work correctly in a way that is consistent, reusable and extensible. The problem is that we don't have a standard notation, interface and implementation for iterative algorithms the same way that dense matrix routines do. We can discuss how all of that should be done, and I'll be more than happy to have those discussions. At the same time, continuing to type more algorithms out from textbooks and papers is not sustainable in the long run without due attention to implementation details.

ViralBShah commented 10 years ago

When I first saw the KrylovSubspace type, my first reaction was also that we do not really need it, based on my past experiences. However, as I thought a little bit more, I felt that we at least owe it to ourselves to try and find better abstractions, and this clearly was worth a try. We went through this same process with the direct solvers in Base, where we had them matlab style at first, and then realized that having factorization types would greatly help with code simplicity and performance.

Looking at the code in krylov.jl, it seems that there is a fair amount of reusability possible across different iterative methods. Even though I am not yet fully convinced, I am willing to go a bit further before saying it is overkill. I would like to see how preconditioning fits into all of this as well, and make it easy for users to provide their own preconditioners with little fuss.

Clearly, there is value in having a type to capture all the various outputs, as returning a large tuple is just not fun to use.

Nothing is set in stone, and code will keep evolving. We should be able to revisit this once we have a few more methods implemented.

Any piece of code may stop working if there are typos or bugs. IMO, that is no reason to use or not use types. We now have travis integration where tests are run on every commit and PR, which means we are in a much better place than we were two days back.

ViralBShah commented 10 years ago

Oh, on greek letters, I have stopped discussing. I personally do not like to deal with editor modes and keystrokes, but if someone is using them, I just copy paste stuff. After all, the code certainly is easier to read, just a bit of a pain to edit.