JuliaLang / julia

The Julia Programming Language

RFC: Naming of functions in an Lapack module #1248

Closed dmbates closed 11 years ago

dmbates commented 12 years ago

@ViralBShah has suggested that all the code that ends up calling

ccall(dlsym(_jl_liblapack, ...) ...)

which includes much of the current base/linalg_lapack.jl and base/linalg_factorizations.jl be moved to an Lapack module. See the discussion of pull request #1243.

After doing so the _jl_lapack prefixes on the names of functions could be removed. I can start work on this but I wanted to request comments on some function names first:

  1. Remove all _jl_lapack prefixes
  2. Perhaps leave all symbol names unexported. This way these functions will only be called by a user sufficiently knowledgeable to reference an unexported symbol name.
  3. Add ! to the names of those functions that modify an input array.

I would also suggest continuing the convention of passing StridedMatrix or VectorOrMatrix templated types and not passing arguments like lda, m, n, etc. that can be derived from the other arguments. (The underlying Lapack routines just get a pointer to the contents of the vector or matrix and must have dimensions, leading dimensions, etc. passed separately.)
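
For illustration, a wrapper following that convention might look like the sketch below (dpotrf is a real Lapack routine; the signature and error handling here are illustrative assumptions, not the actual implementation):

   # Sketch: the caller passes only uplo and the matrix; n, lda, and the
   # info array are derived or allocated inside the wrapper.
   function potrf!(uplo::String, A::StridedMatrix{Float64})
       n    = int32(size(A, 1))      # order of the matrix
       lda  = int32(stride(A, 2))    # leading dimension, from the layout of A
       info = Array(Int32, 1)        # written by the Fortran routine
       ccall(dlsym(_jl_liblapack, :dpotrf_), Void,
             (Ptr{Uint8}, Ptr{Int32}, Ptr{Float64}, Ptr{Int32}, Ptr{Int32}),
             uplo, &n, A, &lda, info)
       if info[1] != 0 error("dpotrf returned error code $(info[1])") end
       A                             # modified in place, hence the !
   end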

I don't know how much cost there is in adding many small functions to the namespace of a module. If this cost is negligible then I would suggest creating a more-or-less complete set of interfaces to the Lapack subroutines. For example, in addition to the QR decompositions allow for the LQ, RQ and QL decompositions.

Finally, @timholy has created a Tridiagonal type in pull request #1243. We may want to consider other specialized types of matrix structures reflecting the specialized Lapack types, such as Triangular, Symmetric, GeneralBanded, Hessenberg, Bidiagonal, Schur, etc.

timholy commented 12 years ago

I'm fine with the idea of a module in principle. To me, matrix algebra is part of the core of Julia, and I would not want to have to type

julia> load("lapack.jl")

julia> import LapackMod.*

at the beginning of every Julia session (unless having Lapack available at startup imposes some significant cost on Julia's potential for mainstream programming). @ViralBShah , care to comment further on the reasons for a separate module?

Re the naming, those plans seem good. I think we could do most of that independently of the idea for a separate module, since Base is already a module and those functions are not exported.

While working on the last set of changes, I came to the conclusion that the low-level Lapack wrappers should be moved from factorizations.jl to linalg_lapack.jl. There are already wrappers for some Lapack factorization routines in linalg_lapack.jl; this would just make that file the go-to place for all Lapack wrappers.

ViralBShah commented 12 years ago

That is right. I am suggesting that we put the LAPACK wrappers in a module to start with.

Linear Algebra could be in a module too - but I agree that it would be really annoying to load a module to do lu(), and it should always be loaded by default. If such a module is going to be loaded by default, then it might as well just be in Base.

-viral

dmbates commented 12 years ago

@timholy The idea is that the user would not typically access functions in the Lapack or BLAS modules. The linear algebra functions would stay essentially as they are - it is just that behind the scenes they would call, e.g.

Lapack.getrf!(...)

instead of the current _jl_lapack_getrf.
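
Something like this one-line sketch (lud and the return value of getrf! are assumptions for illustration):

   # The user-visible function stays in Base; only its internals change to
   # call the module-qualified, unexported wrapper.
   lud(A::StridedMatrix{Float64}) = Lapack.getrf!(copy(A))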

I agree about the low-level Lapack wrappers in factorizations.jl being moved to linalg_lapack.jl. That was part of the plan, although I might not have expressed that well.

StefanKarpinski commented 12 years ago

I wholeheartedly support this plan. The non-surface BLAS and LAPACK functions should all be squirreled away in modules for sure.

dmbates commented 12 years ago

During the implementation I encountered order-dependency problems and ended up modifying the functions that interface with the Lapack tridiagonal subroutines. I wanted to assure @timholy that it was not just on a whim that I changed his interfaces.

I have added a file Lapack.jl defining a module of the Lapack interface functions. Tim had Lapack interface functions defined in linalg_specialized.jl that had methods for Tridiagonal objects. If these are to be moved to the Lapack.jl file, then in sysimg.jl that file must follow the one containing the declaration of the Tridiagonal type, or contain the declaration itself. That becomes awkward. I imposed the rule that direct calls to the Lapack interface functions use only scalars, vectors and matrices as arguments. When a specialized matrix type is defined, methods for eig, solve, \, lud, qrd, etc. access fields and pass them to the Lapack interface routines.

I also changed the declaration of info in Tim's code. IIRC, to allow the info value to be written by the Lapack subroutine it must be an Array(Int32, 1), not an Int32. If it is an Int32 and passed as &info the Lapack subroutine can change the value but the change will not be visible in the calling function.

StefanKarpinski commented 12 years ago

These sound like good things to me. Can we see these changes somewhere?

timholy commented 12 years ago

If it is an Int32 and passed as &info the Lapack subroutine can change the value but the change will not be visible in the calling function.

Didn't know that, thanks for explaining.

As for the other changes you describe, I agree that it's probably best to separate the Tridiagonal type from the most basic Lapack wrapper. Assuming you're fixing both the core functions and the higher-level wrappers, I'm all in favor. Thanks for the cleanup!

dmbates commented 12 years ago

@StefanKarpinski Changes will be committed to git@github.com:dmbates/julia in a LapackModule branch once I get them building again. As you know, it gets tricky creating modules in files that are included in sysimg.jl.

StefanKarpinski commented 12 years ago

Oh, yes I do. Good reason to move more things out of sysimg.jl in my opinion. Maybe we need to have a third bootstrapping stage. Viral's been struggling with the same sort of things for his refactoring of RNG stuff.

dmbates commented 12 years ago

@StefanKarpinski The LapackModule branch is now on my fork of the julia repository. Still much cleaning up to do.

dmbates commented 12 years ago

A further cleanup, though perhaps not today, will be to replace all the two-state flag arguments to the Lapack.* functions (like diag, uplo, ...) that currently take LapackChar values with Bool arguments, so we don't need to check for legitimate values.
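
A small sketch of that convention (helper names are hypothetical):

   # The wrappers would take Bool flags and convert them to the character
   # codes Lapack expects, leaving no LapackChar values to validate.
   uplochar(upper::Bool) = upper ? "U" : "L"
   diagchar(unit::Bool)  = unit  ? "U" : "N"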

Longer term: Should wrappers for BLAS be in a separate module or should these modules be combined?

StefanKarpinski commented 12 years ago

Is there a clear logical distinction between what would go into the BLAS module versus the LAPACK module? E.g. are the calling conventions significantly different? I know there's a clear historical distinction, but that doesn't necessarily mean there's also a clear logical distinction. OpenBLAS, for example, includes LAPACK functionality from what I can tell.

dmbates commented 12 years ago

There is no clear distinction between BLAS and Lapack subroutines. The naming conventions are the same but the names themselves don't overlap.

I don't think there would be a problem in putting both sets of interfaces in one module.

StefanKarpinski commented 12 years ago

Sounds good. Should probably be called LAPACK then and include BLAS routines as a lower level.

dmbates commented 12 years ago

I have found that Jeff's allowing a pointer argument in ccall to be constructed as &foo is very helpful in calling Fortran subroutines. I have gone even further and condensed sequences like

   ldb = stride(B,2)
   ccall(dlsym(...), Void,
         (Ptr{Int32},), &ldb)

to

   ccall(dlsym(...), Void,
         (Ptr{Int32},), &stride(B,2))

and that does seem to work. (Clever fellow, that @JeffBezanson!)

I just wanted to check that I should be able to expect it to continue to work as desired.

StefanKarpinski commented 12 years ago

It will continue to work. If anything, it might become more first-class, allowing bound values to be written via ccall. Of course, that won't pass them out of the function scope, but it could be used locally. Do any of the LAPACK calls actually write to the integers passed in that way? I thought it was because Fortran arguments had to be vectors/pointers or something like that. If the LAPACK calls do actually return a value via such arguments, we might want to consider translating that idiom into multiple return values, which is much more Julian.
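
A sketch of that translation (the wrapper name is hypothetical and the ccall itself is elided):

   # Instead of callers inspecting a shared info array, the wrapper would
   # return the error code alongside the result as multiple values.
   function getrf_sketch!(A::Matrix{Float64})
       info = Array(Int32, 1)
       info[1] = 0           # stand-in for the ccall that writes info
       return A, info[1]     # caller: A, info = getrf_sketch!(A)
   end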

dmbates commented 12 years ago

Yes, the Lapack calls do write to an integer passed by reference. Most Lapack calls have an info argument to return an error code (not sure why they didn't make all the subroutines into integer functions, but they didn't). This was the issue that I mentioned to @timholy earlier - he was using sequences like

   info = zero(Int32)
   ccall(dlsym(...), Void, (..., Ptr{Int32}), ..., &info)
   if info != 0 ...

and at present I don't think that works as intended. We still need to use

   info = Array(Int32, 1)
   ccall(dlsym(...), Void, (..., Ptr{Int32}), ..., info)
   if info[1] != 0 ...

StefanKarpinski commented 12 years ago

Oh yeah, that won't work currently, but it probably ought to be made to work since it looks like it should work to anyone with a C programming background. That's definitely a bit of a trap. cc: @JeffBezanson

ViralBShah commented 12 years ago

I think it is ok to have BLAS and LAPACK functions in the same module, which could be called Lapack. The only small issue is plugging in different BLAS libraries, but that is more of a build and install issue.

ViralBShah commented 12 years ago

@dmbates I have still not got the module stuff working, even for rng. I have given up on sysimg for now and am just trying to make it work standalone, still without much luck.

https://github.com/JuliaLang/julia/pull/1264

dmbates commented 12 years ago

@ViralBShah I do have the module stuff working with sysimg for Lapack in the dmbates/julia repository's LapackModule branch. It still needs cleanup. I have switched to the trailing 'd' on names for creating factorizations, so chold creates a type that specializes the abstract Cholesky{T} type (which in turn specializes the abstract Factorization{T} type), while chol provides the Matlab-compatible (and also R-compatible) function returning the upper triangular matrix, as discussed in an issue or commit that I can't find right now.
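
In code, the hierarchy would be roughly this sketch (the R field is an assumption, and potrf! stands for the kind of wrapper sketched earlier in this thread):

   abstract Factorization{T}
   abstract Cholesky{T} <: Factorization{T}
   type CholeskyDense{T} <: Cholesky{T}
       R::Matrix{T}                  # upper triangular factor
   end
   # chold returns the factorization object; chol just the matrix.
   chold(A::Matrix{Float64}) = CholeskyDense{Float64}(triu(Lapack.potrf!("U", copy(A))))
   chol(A::Matrix{Float64})  = chold(A).R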

timholy commented 12 years ago

Wow, thanks for tackling this part, too!

The discussion you're referring to is for commit 826c42f4eca1

dmbates commented 11 years ago

@timholy Could you take a look at the definitions of gttrf! and gttrs! in the Lapack module (julia/base/Lapack.jl in the LapackModule branch of my fork of the julia sources)? It seems to me from the documentation in the Fortran sources that one of the Vectors being passed as work is not a work array. The contents of du2 must be preserved between the factorization (gttrf!) and the solving of systems (gttrs!).

To do so requires adding a field to the LUTridiagonal type, but I didn't want to start messing about with types you had defined without checking with you first. This may also be the cause of the problem discussed in https://groups.google.com/forum/?fromgroups=#!topic/julia-dev/j0r0LnQdCjQ

The Fortran sources are available in julia/deps/openblas-v0.2.3/lapack-3.4.1/SRC

dmbates commented 11 years ago

@ViralBShah I plan to use the Bunch-Kaufman factorization for solving Hermitian systems (the Lapack routines are *sytrf, *sytrs and *sysv). At present there is a heuristic for determining whether or not to try for a Cholesky decomposition and, even then, it can fail, which causes a reversion to the LU decomposition. I think it is easiest just to check for a Hermitian matrix and use the Bunch-Kaufman. Do you know of a reason why this wouldn't be a good idea?
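
The proposed logic, roughly (sysv! and gesv! as wrappers for the dsysv and dgesv routines are assumptions here):

   # Check for symmetry up front and go straight to Bunch-Kaufman;
   # otherwise fall back to the general LU solver.
   function squaresolve(A::Matrix{Float64}, B::Matrix{Float64})
       if issym(A)
           return Lapack.sysv!("U", copy(A), copy(B))  # Bunch-Kaufman
       end
       Lapack.gesv!(copy(A), copy(B))                  # general LU
   end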

timholy commented 11 years ago

Correct, du2 needs to be preserved. In the version in the main Julia repository, I am pretty sure this is being done correctly: see gttrf and gttrs in linalg_lapack.jl, and their usage of dutmp. However, I realize I should explain a bit more about the thinking behind the design of Tridiagonal, so you can see how this works. It may also be sufficiently confusing that it merits cleanup.

In defining the Tridiagonal type, I created the dutmp array initially to serve as a work array during my own hand-written inversions (see solve in linalg_specialized.jl). These routines still have a place, because they can operate on strides different from 1. My intent in creating the work array as part of the type was to minimize the need to allocate each time you want to solve a tridiagonal equation. For the cases I'm using this for right now, I'm literally solving millions of tridiagonal equations (along each 1-d "pencil" of an m-by-n-by-p array, where m, n, and p are of order 1000) with the same tridiagonal matrix. Even if the size of each tridiagonal equation is such that solving the equation takes longer than memory allocation (and for small problems this is not guaranteed), eventually you'd hammer the garbage collector, for which I've seen very clear consequences on performance. So the thought was that since you know the size of the work array you'll need in advance, it's best to store it in the type itself so you can reuse it.

When I got to writing the Lapack wrappers, I noticed that the LU decomposition required du2, the 2nd upper diagonal, because the LU decomposition of a tridiagonal matrix is not tridiagonal. However, I had this dutmp lying around; it was larger (by one element) than needed, but that's not a big issue since Fortran doesn't know the sizes of the arrays you pass to it. With the proposal to create an LUTridiagonal type, it seemed clear that it would be safe (at least for the moment) to re-purpose dutmp to store du2, because there was no method of solve defined for the LUTridiagonal type (and if you did create one it would be different anyway). So I made LUTridiagonal simply store a Tridiagonal matrix, but one for which dutmp is not a work array but a meaningful storage array. It is this array that is passed between gttrf and gttrs, and that is why the code in the main repository passes the various tests I added to test/lapack.jl.

However, from the standpoint of meaningful names, this shortcut may be confusing, and "sometimes it's a trash array and sometimes it's meaningful" also doesn't have the nicest sound to it. So there is an argument to write LUTridiagonal as a completely independent type for which du2 is a separate field, rather than taking the shortcut of making it a wrapper around a Tridiagonal.
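
Such an independent type might look like this sketch (field names follow the dgttrf documentation):

   # du2 and ipiv are outputs of gttrf! that gttrs! needs later, so they
   # become real fields instead of hiding du2 in a dutmp scratch array.
   type LUTridiagonal{T}
       dl::Vector{T}        # multipliers (subdiagonal of the factorization)
       d::Vector{T}         # diagonal of U
       du::Vector{T}        # first superdiagonal of U
       du2::Vector{T}       # second superdiagonal of U (length n-2)
       ipiv::Vector{Int32}  # pivot indices
   end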

This may also be the cause of the problem discussed in https://groups.google.com/forum/?fromgroups=#!topic/julia-dev/j0r0LnQdCjQ

No, that was a problem with the test, not the routine: my CPU gives a numerically exact answer in a situation where at least one other CPU apparently does not. Instead of writing the test as a==b I should have written abs(a-b) < Eps for some suitable Eps. I had already done that for all the other tests, but I somehow missed that one. Dumb mistake. But already fixed.

dmbates commented 11 years ago

Pull request #1281 contains changes for an Lapack module. This includes the naming changes for generating a factorization, so that chold generates an instance of the CholeskyDense type whereas chol returns the upper triangular Cholesky factor as in Matlab and R.

For solving Hermitian systems this code uses the Bunch-Kaufman factorization, which applies to any Hermitian system. The Cholesky factorization only applies to positive-definite Hermitian matrices.

I added a SymTridiagonal type in linalg_specialized.jl. @timholy I kept the name ldltd for creating the LDLTTridiagonal type, but I think the names of the generator function and the type should be changed to chold and CholeskyTriDiag to cut down on the number of generics in Base. The LDL' factorization is a type of Cholesky factorization. Also, do you think it is important to have the methods for solve in Base? I can appreciate that your particular application may benefit from pre-allocation of scratch storage and from the ability to solve systems in which the columns of the right-hand side are not contiguous, but this results in a Tridiagonal type that contains more than the three diagonals and also introduces a solve generic that may confuse recovering R users, in that it solves one type of linear system but not any others. Would it be reasonable to move these out of Base and into a package for the sorts of problems you are addressing?

Could an issquare generic be added to linalg_dense.jl? An early entry in an obfuscating Julia contest could be the one-liner

issquare(A::Matrix) = ==(size(A)...)
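
For contrast, the unobfuscated equivalent would be

issquare(A::Matrix) = size(A, 1) == size(A, 2)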

ViralBShah commented 11 years ago

@dmbates Can we close this one?

ViralBShah commented 11 years ago

Before we close this one - a few issues:

  1. There is some confusion in file names, since we now have lapack.jl and linalg_lapack.jl. Can we merge all functionality of linalg_*.jl files into linalg.jl?
  2. linalg_blas.jl could be renamed to blas.jl, in line with lapack.jl. The module name could just be Blas instead of BLAS.
  3. On lapack.jl:98, perhaps we can just use Base._jl_liblapack rather than Base.Base._jl_liblapack. Also, we should try to remove as many of the _jl_ prefixes as possible.
  4. For gebal and friends, the macros use strings rather than symbols - they use "dgebal_" rather than :dgebal_. Is there a reason for this, or should we change it to be consistent with the other macros in lapack.jl?

dmbates commented 11 years ago

@ViralBShah

  1. Would it be better to keep the code for the module(s) in one file (lapackblas.jl) or two files (blas.jl and lapack.jl), with the linear algebra functions visible in Base in another file? (I think it is okay to keep the two modules, Blas and Lapack, for the moment, but the issue may be worth revisiting in the future.)
  2. I see you have done that. Thanks.
  3. Yes, there should be just one Blas.
  4. The macros should use symbols. The strings are a holdover from earlier code.

dmbates commented 11 years ago

@ViralBShah In response to your first question: once the distribution of code to files is resolved, this issue should be closed.

ViralBShah commented 11 years ago

We should have one file for lapack.jl, one for blas.jl, and one for all the linear algebra. Then, we should be able to close the issue.

-viral

timholy commented 11 years ago

Can this be closed?