JuliaInv / MUMPSjInv.jl

MUMPS interface for Julia
17 stars 13 forks source link

Using Homebrew MUMPS #2

Open dpo opened 10 years ago

dpo commented 10 years ago

I tried building MUMPS.jl against the shared MUMPS library built via Homebrew. An advantage is that we can call brew to discover the location of the library, so there's no need to package the MUMPS source code. Here's a sample Makefile:

prefix = $(shell brew --prefix mumps)
libdir = $(prefix)/lib
incdir = $(prefix)/include

SRC = mumps_cmplx_p.f90  mumps_p.f90 call_mumps_cmplx_p.f90 call_mumps_p.f90
LIB = -L$(libdir) -ldmumps -lzmumps -lmumps_common -lpord

%.o: %.f90
    gfortran -I$(incdir) -fPIC -fcray-pointer -c $<

all: ${SRC:.f90=.o}
    libtool -dynamic -undefined dynamic_lookup ${SRC:.f90=.o} -o ../lib/MUMPS $(LIB)

clean:
    rm ../lib/MUMPS

This Makefile assumes that the parallel MUMPS library was built. If the sequential library was built (with brew install mumps --without-mpi), incdir becomes $(prefix)/libexec/include and we must add -lmpiseq to LIB.

This compiles but it seems something needs to be updated in the interface:

$ julia ../tests/testTwoSystem.jl
Factorize complex matrix
WARNING: convert{T}(p::Type{Ptr{T}},a::Array) is deprecated, use convert(p,pointer(a)) instead.

(sorry I'm too new to Julia to know what that message means). I'm also getting a bunch of messages of the form:

 ** On entry to ZGEMM  parameter number  4 had an illegal value
tkelman commented 10 years ago

Parameter number 4 to ZGEMM should be N, so this is probably the same issue as https://github.com/JuliaOpt/Ipopt.jl/issues/1#issuecomment-37556837 - Mumps typically uses an LP64 blas, but Julia uses ILP64 OpenBlas by default. It looks like Mumps is being set up in this package to link to a shared-library blas, so zgemm_ is shadowed.

lruthotto commented 10 years ago

Thanks @dpo for pointing into this very interesting direction. I agree that installing MUMPS using homebrew would be a great improvement! Not only more elegant but also giving access to MUMPS' parallel version.

The warning you mentioned seems to be independent from the MUMPS installation. At least I got the same warning, when running the tests on my machine using the current version of Julia. Unfortunately, I do not see either where it comes from.

Did the comment @tkelman made help you? I cannot try it on my machine yet as I have trouble installing MUMPS using homebrew.

dpo commented 10 years ago

I haven't had a chance to play with this. You might want to post the error message in the Julia user's forum to find out what it means. Please open an issue on homebrew/science if you have trouble installing Mumps.

lruthotto commented 10 years ago

Hi Dominique,

sorry that it took me that long, but I now managed to install mumps using brew (seems like missing Xcode command line tools have caused my problems). I tried to use your makefile, but it failed with many errors. Here is the start:

gfortran -I/usr/local/opt/mumps/include -fPIC -fcray-pointer -c mumps_cmplx_p.f90
mumps_cmplx_p.f90:24.23:

TYPE (ZMUMPSbrew_STRUC),intent(inout):: mumps_par
                       1
Error: Derived type 'zmumpsbrew_struc' at (1) is being used before it is defined
mumps_cmplx_p.f90:28.13:

   mumps_par%SYM = sym  ! =0 unsymmetric, =1 symm. pos def, =2 general symm.
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:29.13:

   mumps_par%PAR = 1
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:31.13:

   mumps_par%JOB = -1  ! initialize

I'm leaving out some more errors of that type... here is how it ends:


Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
mumps_cmplx_p.f90:144.13:

   mumps_par%JOB = 2  !  factorization
             1
Error: Symbol 'mumps_par' at (1) has no IMPLICIT type
Fatal Error: Error count reached limit of 25.
make: *** [mumps_cmplx_p.o] Error 1

@dpo : Do you have a clue what's going on? Is there anything I need to modify in the Fortran code?

Cheers, Lars

dpo commented 10 years ago

Hi Lars. Sorry for the slow reply. I just pulled the git repo and it builds just fine against my homebrew Mumps with this Makefile (which is similar to the one I posted above). Does it not build for you?

The tests don't run for me though because it seems MPI isn't initialized. I quickly hacked mumps_p.f90 (here) but there's an error related to the LAPACK function DSYR:

 ** On entry to DSYR   parameter number  7 had an illegal value

You can probably tell more quickly than I what the problem is.

lruthotto commented 10 years ago

Hey Dominique.

I did some extensive testing and asked around, but could not come to a solution. First of all, your makefile now works for me and I could test mumps with and without MPI as well as with openblas and with my Mac's blas.

Unfortunately, I could not reproduce your error message. Instead I am getting Segmentation faults!

This one when I link against brew's MUMPS without MPI support (for both BLAS versions):

include("testMUMPS.jl")
test MUMPS
Test for real SPD matrix: one rhs

signal (11): Segmentation fault: 11
??? at ???:110110743
Segmentation fault: 11

And this one when I link against brew's MUMPS with MPI (for both BLAS versions):

include("testDivGrad.jl")
Test for real SPD matrix: one rhs

signal (11): Segmentation fault: 11
??? at ???:110110743
[dhcp-128-189-89-68:81342] *** Process received signal ***
[dhcp-128-189-89-68:81342] Signal: Segmentation fault: 11 (11)
[dhcp-128-189-89-68:81342] Signal code:  (0)
[dhcp-128-189-89-68:81342] Failing at address: 0x0
[dhcp-128-189-89-68:81342] [ 0] 0   libsystem_c.dylib                   0x00007fff8723b90a _sigtramp + 26
[dhcp-128-189-89-68:81342] [ 1] 0   libdmumps.dylib                     0x000000011ad5a9b7 __dmumps_load_MOD_dmumps_467 + 82
[dhcp-128-189-89-68:81342] [ 2] 0   libopenblas.dylib                   0x000000010580e9ce dsyr_ + 302
[dhcp-128-189-89-68:81342] [ 3] 0   libdmumps.dylib                     0x000000011ad13cb0 dmumps_230_ + 142
[dhcp-128-189-89-68:81342] *** End of error message ***

signal (11): Segmentation fault: 11
??? at ???:110110743
Segmentation fault: 11

It also points to the DSYR function. People here have the feeling it might be some issue with linking BLAS dynamically (Julia may link to a different BLAS as well). But this is just a guess.

At this point we have no idea how to fix it. Sorry!

dpo commented 10 years ago

Hi Lars. I forked your repository yesterday and worked on it for a while. I found a way to refer to the library in a generic way, provided it's in the user's LD_LIBRARY_PATH. But I'm getting the same segmentation fault as you. Mumps and Julia use different BLAS libraries, but I'm not sure why that would be a problem. I'll contact some Julia folks to see if they have any idea.

tkelman commented 10 years ago

Is Homebrew OpenBLAS built with 32 bit or 64 bit integers?

dpo commented 10 years ago

Mine is build with 64bit integers. But how does that matter? Mumps uses its own BLAS.

tkelman commented 10 years ago

Right, and what does it expect as the integer size there? Unless it's linking to a static-library BLAS, the integer sizes will all have to be the same. Shared libraries will shadow each other. We have this problem all the time with Ipopt.

dpo commented 10 years ago

Aha. The BLAS used by MUMPS is the default brew OpenBLAS, so I presume it uses 32bit integers. In my fork, I carefully changed all integer data to 32bit because MUMPS uses default FORTRAN INTEGERs, which are 32bit. Are you saying MUMPS should link the Julia's OpenBLAS?

tkelman commented 10 years ago

Either Mumps will need to link a static-library Blas that uses 32-bit integers, or you'll need to configure Mumps to use 64-bit integers in all Blas calls (this might be doable with -fdefault-integer=8 or some such, but I've never tried it)

Or, you can tell Julia and Blas and Mumps to use 32-bit integers everywhere (USE_BLAS64=0 I think it is), but the default binaries are not built this way AFAIK.

dpo commented 10 years ago

I see. In the second scenario, what if a user built Julia with a 32bit OpenBLAS?

tkelman commented 10 years ago

Sorry yeah edited in a 3rd option - you'll need Julia and Mumps to be consistent in the integer size they expect from Blas, unless you link statically.

dpo commented 10 years ago

Ok, will give that a try. Unless there's a plan to include MUMPS in Julia at some point, I think we'll have to go with the static linking. I'm still not sure I totally understand the issue because MUMPS's BLAS lib is libopenblas while Julia's lib is libopenblas64. How can there be confusion?

tkelman commented 10 years ago

Oh do you have openblas building under a different library name when it uses 64 bit integers? That's probably a good idea. I think the confusion and cross-talk is from having multiple libraries dynamically loaded that have the same function names, though I'm not an expert in how dynamic linking works I just have seen the consequences and ways that I've been able to fix it.

We did static linking (to netlib blas, but a better static blas should also work) for the purposes of Mumps within the Ipopt.jl package, see my original link from April.

I think the eventual best thing to do here is prefix all the function names for Julia's 64-bit-integer openblas to avoid this problem. That's the plan in the cross-referenced Julia issue 4923, it just hasn't been done yet.

dpo commented 10 years ago

That's tricky. Thanks a lot for the tip! That sounds like the likely culprit.

dpo commented 10 years ago

Well I just rebuilt MUMPS against a static 32bit OpenBLAS but I'm still getting the same segfault. Now MUMPS depends on Scalapack, which also links to its own BLAS (dylib for now). It's not realistic to require that everybody be linked statically.

tkelman commented 10 years ago

Perhaps not. Until someone figures out the proper way to use objcopy as part of Julia's build system and writes the requisite macros around all blas ccalls to only add the prefixes when necessary, the package will only be usable when Julia is configured for 32-bit-integer blas.

dpo commented 10 years ago

Confirm: The MUMPS interface works with Julia32. Thanks @tkelman!

dpo commented 10 years ago

There's still a segfault when I try to solve two systems in a row. Somehow memory isn't cleared up properly and/or MUMPS isn't reinitialized properly.

tkelman commented 10 years ago

@dpo if it works the first time but not the second, I suspect that's a bug in memory handling as you said, and probably has more to do with the package binding code here in MUMPS.jl than anything about the Julia build configuration.

dpo commented 10 years ago

Well I was not able to make this work, so I rewrote my own MUMPS interface (in C this time!) See https://github.com/dpo/MUMPS.jl. Deallocating / reallocating works fine (at least for me). It's real-only at the moment, but adding complex arithmetic is simple. The C interface also seems far simpler and easier to maintain.

lruthotto commented 10 years ago

thanks! this looks really interesting. looking forward to testing it soon!

dpo commented 10 years ago

ps: the new interface also runs in parallel!

lruthotto commented 10 years ago

hi dominique. I like your interface a lot better than our fortran interface (especially the access to all control parameters). I also agree it seems simpler to maintain. However, did I understand it correctly, that is still is limited to Julia's 32 bit version? For our applications that would not be feasible. Are there new ideas how to combine Julia 64 bit and MUMPS?

dpo commented 10 years ago

Hi Lars. The 32 bit thing is only because my BLAS is 32 bit. If you can manage to build Julia against a 64 bit BLAS and not have a 32 bit BLAS shadow the functions of the first, you should be good to go. However, MUMPS uses default Fortran INTEGERs, which are 32 bit unless you compile with -fdefault-integer-8. In the latter case, the MUMPS/Julia interface will require minor tweaks. I'll put that on the list.

tkelman commented 10 years ago

If you can manage to build Julia against a 64 bit BLAS and not have a 32 bit BLAS shadow the functions of the first

@lruthotto, @dpo, this is being tracked at https://github.com/JuliaLang/julia/issues/4923. If you could speak up there that this is causing problems in packages like this one, and more importantly if you have any suggestions that could work on Linux and Mac (I already have a solution that should work for Windows), please let us know.