sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.4k stars 474 forks source link

cvxopt should not add -lcblas and -latlas on Darwin #12519

Closed jdemeyer closed 12 years ago

jdemeyer commented 12 years ago

The following is Sage-specific code in cvxopt's setup.py:

if os.popen('sage_fortran --version').read()[:3]=='G95':
    libraries = ['m','lapack','gsl','gslcblas','blas','f95']
    GCC_LIB_DIR=SAGE_LIB
    if os.path.exists(GCC_LIB_DIR + "/gcc-lib"):
       GCC_LIB_DIR += "/gcc-lib/"
       GCC_LIB_DIR += os.listdir(GCC_LIB_DIR)[0] + "/"
       GCC_LIB_DIR += os.listdir(GCC_LIB_DIR)[0] + "/"
    libdirs = [ATLAS_LIB_DIR,GCC_LIB_DIR]
elif os.environ['UNAME'] == 'CYGWIN':
    libraries = ['lapack','gsl','gslcblas','blas', 'gfortran']
else:
    libraries = ['m','lapack','gsl','blas','gslcblas','cblas','gfortran','atlas']

At the end, cblas and atlas are added, even though they aren't installed on Darwin. This causes a build failure for cvxopt on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True).

In this ticket, I also fixed the "spkg-check" script, which will now actually fail if there is a failure.

spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/cvxopt-1.1.3.p1.spkg

Component: packages: standard

Author: Jeroen Demeyer

Reviewer: Dmitrii Pasechnik

Merged: sage-5.0.beta8

Issue created by migration from https://trac.sagemath.org/ticket/12519

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -16,3 +16,7 @@

At the end, cblas is added, even though this isn't installed on Darwin. This gives problem on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True). + +In this ticket, I also fixed the "spkg-check" script, which will now actually fail if there is a failure. + +spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/cvxopt-1.1.3.p1.spkg

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -15,7 +15,7 @@
     libraries = ['m','lapack','gsl','blas','gslcblas','cblas','gfortran','atlas']

-At the end, cblas is added, even though this isn't installed on Darwin. This gives problem on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True). +At the end, cblas and atlas are added, even though they aren't installed on Darwin. This gives problem on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True).

In this ticket, I also fixed the "spkg-check" script, which will now actually fail if there is a failure.

jdemeyer commented 12 years ago

Description changed:

--- 
+++ 
@@ -15,7 +15,7 @@
     libraries = ['m','lapack','gsl','blas','gslcblas','cblas','gfortran','atlas']

-At the end, cblas and atlas are added, even though they aren't installed on Darwin. This gives problem on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True). +At the end, cblas and atlas are added, even though they aren't installed on Darwin. This causes a build failure for cvxopt on OS X 10.4 with #12369 (without #12369, the first "if" evaluates to True).

In this ticket, I also fixed the "spkg-check" script, which will now actually fail if there is a failure.

jhpalmieri commented 12 years ago
comment:5

In spkg-install, these lines should be removed:

# Solaris-specific and other patches
cp -p patches/cvxopt.h src/src/C/

# patch to use always use GSL's random
cp -p patches/__init__.py src/src/python/

# setting Sage-specific libraries locations, etc.
cp -p patches/setup.py src/src/

since they are taken care of by patch.

dimpase commented 12 years ago
comment:6

Replying to @jdemeyer:

Am I correct in saying that this was caused by Darwin being determined by the presence of G95, which is no longer the right way to do this (or perhaps it never was --- mea culpa then...) ?

Regarding the failures of tests on OSX 10.7 (see #12011): it would be great if the exact Sage version (gcc/clang?) /Xcode version/hardware state (32 vs 64 bits) on these are reported. As it looks at the moment, that there is a bug in Apple-supplied lapack/blas, and this might force us to build Atlas on OSX 10.7 and use it.

jhpalmieri commented 12 years ago
comment:7

Replying to @dimpase:

Regarding the failures of tests on OSX 10.7 (see #12011): it would be great if the exact Sage version (gcc/clang?) /Xcode version/hardware state (32 vs 64 bits) on these are reported.

In my experience, the tests fail on every OS X 10.7 machine I've tried, with Xcode 4.1, 4.2, 4.3, with both gcc and clang.

As it looks at the moment, that there is a bug in Apple-supplied lapack/blas, and this might force us to build Atlas on OSX 10.7 and use it.

That would be unfortunate. I also wonder why we don't see doctest failures elsewhere in Sage if there are bugs in these libraries.

jdemeyer commented 12 years ago

Attachment: cvxopt-1.1.3.p1.diff.gz

Diff for the cvxopt spkg, for review only

jdemeyer commented 12 years ago
comment:8

Fixed, needs review.

jdemeyer commented 12 years ago
comment:9

Replying to @jhpalmieri:

That would be unfortunate. I also wonder why we don't see doctest failures elsewhere in Sage if there are bugs in these libraries.

Because the cvxopt test suite itself is more complete than Sage? Maybe it just fails on some corner case which isn't in the doctests. I agree it would be a good idea to add a doctest to catch the OS X 10.7 failures.

dimpase commented 12 years ago
comment:10

Replying to @jhpalmieri:

As it looks at the moment, that there is a bug in Apple-supplied lapack/blas, and this might force us to build Atlas on OSX 10.7 and use it.

That would be unfortunate. I also wonder why we don't see doctest failures elsewhere in Sage if there are bugs in these libraries.

I asked upstream for examples demonstrating lapack/blas bug(s) on OSX 10.7. They are looking into it.

dimpase commented 12 years ago
comment:11

Replying to @jhpalmieri:

In my experience, the tests fail on every OS X 10.7 machine I've tried, with Xcode 4.1, 4.2, 4.3, with both gcc and clang.

here is a beautiful OSX 10.7-specific bug (got it on sqrt5 with Sage 5.0.beta6): (the syntax const-M, for M a matrix, is supposed to replace each M_ij with const-M_ij)

from cvxopt import matrix
b=matrix([ 1.00e+00, 1.63e-01, 2.83e-01, 9.47e-01, 2.32e-01, 4.85e-01,
9.57e-01, 7.44e-01])
c=1.-b
print "b :\n", b
print "1-b:\n", c

b :
[ 1.00e+00]
[ 1.63e-01]
[ 2.83e-01]
[ 9.47e-01]
[ 2.32e-01]
[ 4.85e-01]
[ 9.57e-01]
[ 7.44e-01]

1-b:
[-1.00e+00]
[-1.63e-01]
[-2.83e-01]
[-9.47e-01]
[-2.32e-01]
[-4.85e-01]
[-9.57e-01]
[-7.44e-01]

Interestingly, a shorter vector b works OK...

Do you have any idea what goes wrong here? (this is actually the bug responsible for failure of "acent" test in here)

jdemeyer commented 12 years ago
comment:12

Replying to @dimpase:

Do you have any idea what goes wrong here?

Probably Apple shipping a broken BLAS. See #12011.

dimpase commented 12 years ago
comment:13

Replying to @jdemeyer:

Replying to @dimpase:

Do you have any idea what goes wrong here?

Probably Apple shipping a broken BLAS. See #12011.

that's what one of CVXOPT people says (as I reported :)), but it would be good to know for sure. I've reported this example upstream (to CVXOPT), too.

I forgot to mention that this is a plain Python example. To run it in Sage proper, one needs to do first

sage: preparser(on=False)
dimpase commented 12 years ago
comment:14

Replying to @dimpase:

Probably Apple shipping a broken BLAS. See #12011.

confirmed! see the following sage-devel thread

jdemeyer commented 12 years ago
comment:15

This needs to be coordinated with #12011.

kcrisman commented 12 years ago

Attachment: test_blas.c.gz

For reproducing the failure on Lion

kcrisman commented 12 years ago

Attachment: test_cblas.c.gz

For reproducing the failure on Lion

kcrisman commented 12 years ago

For reproducing the failure on Lion

dimpase commented 12 years ago
comment:16

Attachment: Makefile.gz

Replying to @dimpase:

Replying to @dimpase:

Probably Apple shipping a broken BLAS. See #12011.

confirmed! see the following sage-devel thread

In fact, it turns out to be a CVXOPT bug. Please see this message and this discussion)

So we do not need Atlas on OSX 10.7, after all (or at least not yet).

dimpase commented 12 years ago
comment:17

Replying to @dimpase:

Replying to @dimpase:

Replying to @dimpase:

Probably Apple shipping a broken BLAS. See #12011.

confirmed! see the following sage-devel thread

In fact, it turns out to be a CVXOPT bug. Please see this message and this discussion)

So we do not need Atlas on OSX 10.7, after all (or at least not yet).

Hopefully CVXOPT people can fix this quickly: the affected BLAS calls are used quite a bit in their code, so I'd prefer to wait for them to have it fixed rather than going at it myself (it looks straightforward, but quite a bit of code...)

dimpase commented 12 years ago
comment:19

Please revert to using Apple's blas, not GSL's ones.

jdemeyer commented 12 years ago
comment:20

Replying to @dimpase:

Please revert to using Apple's blas, not GSL's ones.

Please clarify...

dimpase commented 12 years ago
comment:21

Replying to @jdemeyer:

Replying to @dimpase:

Please revert to using Apple's blas, not GSL's ones.

Please clarify...

Can we link cvxopt against Apple's cblas (aka "-framework Accelerate") on MacOSX? I suppose inclusion of gslcblas in the list of libaries means we use this very slow beast...

jdemeyer commented 12 years ago
comment:22

Replying to @dimpase:

I suppose inclusion of gslcblas in the list of libaries means we use this very slow beast...

Are you sure? If we're not using Apple's BLAS, how do you explain that cvxopt fails its self-tests only on OS X 10.7?

Anyway: let me also clarify that I'm simply restoring the old behaviour, such that cvxopt builds with GCC 4.6.2. So if there really is something to be done, I prefer to get this ticket reviewed as-is and continue on a different ticket.

dimpase commented 12 years ago
comment:23

Replying to @jdemeyer:

Replying to @dimpase:

I suppose inclusion of gslcblas in the list of libaries means we use this very slow beast...

Are you sure? If we're not using Apple's BLAS, how do you explain that cvxopt fails its self-tests only on OS X 10.7?

-lblas links against Apple's "classical" BLAS (which you call with Fortran conventions), and use names ending with '_'. And that's how CVXOPT makes these fateful calls to daxpy().

Actually, it occurs to me now that it's GSL extension of CVXOPT that needs its GSLCBLAS, nothing else. Sorry for noise. Please disregard my request on this.

Anyway: let me also clarify that I'm simply restoring the old behaviour, such that cvxopt builds with GCC 4.6.2. So if there really is something to be done, I prefer to get this ticket reviewed as-is and continue on a different ticket.

OK, I'll look into it ASAP.

dimpase commented 12 years ago
comment:24

Replying to @jdemeyer:

Replying to @dimpase:

I suppose inclusion of gslcblas in the list of libaries means we use this very slow beast...

Are you sure? If we're not using Apple's BLAS, how do you explain that cvxopt fails its self-tests only on OS X 10.7?

Anyway: let me also clarify that I'm simply restoring the old behaviour, such that cvxopt builds with GCC 4.6.2. So if there really is something to be done, I prefer to get this ticket reviewed as-is and continue on a different ticket.

looks OK on OSX 10.6.8, and on 10.5.8 PPC (installs with SAGE_CHECK=yes). I presume you have tested on OSX 10.4. Positive review. I'd like to coordinate the upgrade to (now available) cvxopt 1.1.4 with them fixing the OSX 10.7-specific bug.

jdemeyer commented 12 years ago

Reviewer: Dmitrii Pasechnik

jdemeyer commented 12 years ago

Merged: sage-5.0.beta8