Closed bgorissen closed 3 years ago
Thanks for the PR! Looks great.
There's a failing unit test because we can't actually guarantee that xtx is positive definite (only that it's positive semidefinite) and the 'pos' hint doesn't work as a result. That's not a big deal though, I can roll that back to symmetric.
Numba is a bit more of a problem. There's a lot of code out there that's hard to get running because of numba and the knock-on effect numba has on other dependencies. I don't want it as a required dependency for this project. That said, I'm fine with it as an optional dependency. So what I'm gonna suggest is that it gets rolled into an optional dependency (e.g. pass use_numba=True and it'll use the numba routines).
I'll go ahead and make those changes and then merge it into the dev branch so it can be tested on the cluster.
Aight, I went ahead and made those changes and merged them into dev. The implementation is that you can set wkf.set_run_parameters(use_numba=True) and it will replace updateB and updateS with the numba JIT routines at runtime.
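For reference, the optional-dependency dispatch described above can be sketched roughly like this (a minimal sketch; the function names and flag below are illustrative, not the project's actual internals):

```python
import numpy as np

# numba is an optional dependency: import it if available, otherwise
# fall back to the plain-Python implementation.
try:
    from numba import njit
    _HAVE_NUMBA = True
except ImportError:
    _HAVE_NUMBA = False


def _update_s_python(x):
    # plain-Python reference implementation (illustrative stand-in
    # for a routine like updateS: a simple loop over an array)
    total = 0.0
    for v in x:
        total += v * v
    return total


if _HAVE_NUMBA:
    # JIT-compile the same function when numba is importable
    _update_s_numba = njit(_update_s_python)


def update_s(x, use_numba=False):
    # dispatch to the compiled routine only when requested
    # AND when numba is actually installed
    if use_numba and _HAVE_NUMBA:
        return _update_s_numba(x)
    return _update_s_python(x)
```

With this pattern, environments without numba still work; they just take the slower path.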
I will find some time next week to write some tests for this functionality so I can package this up as a release.
There's a failing unit test because we can't actually guarantee that xtx is positive definite (only that it's positive semidefinite) and the 'pos' hint doesn't work as a result. That's not a big deal though, I can roll that back to symmetric.
I'm not sure why the test fails. The test shows that a LinAlgWarning was issued, and there is code in base_regression.py to treat this warning as an error (resulting in beta_hat=0 being selected in base_regression.predict_error_reduction). Locally this test passes.
Numba is a bit more of a problem. There's a lot of code out there that's hard to get running because of numba and the knock-on effect numba has on other dependencies.
What specific code are you referring to? Unless numba is imported and enabled for a specific function, it does not interfere with other code. Numba gives an order-of-magnitude improvement, so perhaps you can display a warning about reduced performance when use_numba=False.
The scipy solve test failure is specific to newer versions of scipy (I didn't get it locally with 1.6.x but it failed when I updated to 1.7.1). It's not unexpected, as the test case is positive semidefinite, and the 'pos' hint requires positive definite.
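The distinction is easy to demonstrate: X.T @ X is always positive semidefinite, but it is singular (and therefore not positive definite) whenever X is rank-deficient, e.g. with fewer samples than features, which is exactly when the Cholesky-based 'pos' path can break down:

```python
import numpy as np

# With fewer samples than features, X.T @ X is rank-deficient:
# positive semidefinite but NOT positive definite.
X = np.array([[1.0, 2.0, 3.0]])   # 1 sample, 3 features
xtx = X.T @ X                     # 3x3 matrix of rank 1

eigvals = np.linalg.eigvalsh(xtx)
assert np.all(eigvals >= -1e-12)  # PSD: no negative eigenvalues
assert np.min(eigvals) < 1e-12    # singular: a (numerically) zero eigenvalue
```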
The performance boost from numba is really impressive btw, I have the cluster regression tests going and they're way faster now. It's just that the dependency is hard to maintain (same as the math kernel library, which is why that's optional as well). I just don't want to be answering issues about cryptic runtime errors that occur when numba and numpy aren't exactly matching versions for the next three years.
The scipy solve test failure is specific to newer versions of scipy
That's not the point. The logs of the failed test show:
inferelator/tests/test_base_regression.py::TestBaseRegression::test_predict_error_reduction /home/runner/work/inferelator/inferelator/inferelator/regression/base_regression.py:259: LinAlgWarning: Ill-conditioned matrix (rcond=8.88178e-17): result may not be accurate. beta_hat = scipy.linalg.solve(xtx, xty, assume_a='pos')
Due to base_regression.py:15 this should be treated as an error:
warnings.filterwarnings(action='error', category=scipy.linalg.LinAlgWarning)
So this code (around base_regression.py:259) should run the except block:
try:
    xt = x_leaveout.T
    xtx = np.dot(xt, x_leaveout)
    xty = np.dot(xt, y)
    beta_hat = scipy.linalg.solve(xtx, xty, assume_a='pos')
except (np.linalg.LinAlgError, scipy.linalg.LinAlgWarning):
    beta_hat = np.zeros(len(leave_out), dtype=np.dtype(float))
On my local machine, when I run python -m nose, a LinAlgWarning is triggered, and indeed the except block is run. I'm at a loss why this doesn't work here or on your local machine. I've tried Scipy 1.4.1 and 1.7.1.
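The intended fallback behavior can be reproduced standalone with an exactly singular matrix (a sketch of the same pattern, not the project's code; with an exactly singular xtx, the Cholesky-based 'pos' solve cannot succeed, so the except branch runs):

```python
import warnings
import numpy as np
import scipy.linalg

# Escalate ill-conditioned-solve warnings to errors, as base_regression.py does.
warnings.filterwarnings(action='error', category=scipy.linalg.LinAlgWarning)

# X.T @ X for a rank-deficient X: positive semidefinite but exactly singular.
xtx = np.array([[1.0, 1.0], [1.0, 1.0]])
xty = np.array([1.0, 1.0])

try:
    # assume_a='pos' uses a Cholesky factorization, which fails on a
    # singular matrix (and near-singular cases raise LinAlgWarning,
    # escalated to an error by the filter above)
    beta_hat = scipy.linalg.solve(xtx, xty, assume_a='pos')
except (np.linalg.LinAlgError, scipy.linalg.LinAlgWarning):
    # fall back to zeros when the solve is unreliable
    beta_hat = np.zeros(2)
```

Near-singular (rather than exactly singular) inputs sit right at the rcond threshold, so whether the warning fires can differ across scipy versions and BLAS builds, which may be the discrepancy seen here.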
Sorry, I'm not sure exactly what the problem is. Scipy and numpy are backed by LAPACK and BLAS, so it's possible that it's a difference in those libraries.
The regression test package passed for amusr (with the numba JIT versions of the updateB and updateS routines) and runs substantially faster. It's failing for BBSR (the differences aren't huge, but they are there), so I'm going to roll those changes back.
I'll try to get tests written this week so that it can get merged into the stable version and released with a version number.
Thanks again!
Sorry, I'm not sure exactly what the problem is.
test_base_regression.py:test_predict_error_reduction gives the wrong output:
x: array([-116., -116., -116.]) y: array([-133.33, -133.33, -133.33])
where x is the output from base_regression.predict_error_reduction and y is the (hardcoded) expected output. The reason for this erroneous output is that beta_hat = scipy.linalg.solve(xtx, xty, assume_a='pos') gives beta_hat = [0.05 0.08333333] instead of failing (in which case beta_hat = [0 0] is used).
This is not an issue about BLAS implementations or the numerically vague boundary between singular and ill-conditioned matrices. solve throws a LinAlgWarning, so it should be possible to handle that warning and return beta_hat = [0 0]. In fact, the code to do that is there, and works locally.
After some digging I found that the coverage module interferes with the warnings module. The same test passes with nose and there is nothing wrong with the code itself.
Oh, that's interesting; I wouldn't have expected there to be a problem with the warnings module. I'm running coverage with the test package locally (& it's part of the CI workflow), so that would explain it.
Good detective work.
The improvements to base regression (base_regression.py and bayes_stats.py) are:
For amusr, almost all computation time is spent in the functions updateS and updateB, which mostly contain simple loops and basic arithmetic operations. Compiling this code to machine code reduces the runtime by a factor of 10 (on my workstation, without multiprocessing). For this I used numba, which only requires a few lines of code. Because numba cannot convert a matrix to column-major order, which was a step in updateB, this is now done in the line that calls updateB (np.asarray(sparse_matrix, order="F")). It is not possible within updateB to check whether the matrix that is passed is indeed in column-major order (numba does not support S.flags['F_CONTIGUOUS']), so this is something that requires care when updateB is used in a different context.
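A minimal sketch of the pattern described above (illustrative names, not the actual updateB/updateS code): the loop is JIT-compiled, and the column-major conversion and flag check happen at the call site, because the compiled function cannot inspect array flags itself.

```python
import numpy as np

# numba is optional here; fall back to an identity decorator without it
try:
    from numba import njit
except ImportError:
    def njit(func):
        return func


@njit
def column_sums(s):
    # simple column-wise loop; traversal is fast for F-ordered
    # (column-major) input, since each column is contiguous
    n_rows, n_cols = s.shape
    out = np.zeros(n_cols)
    for j in range(n_cols):
        for i in range(n_rows):
            out[j] += s[i, j]
    return out


S = np.arange(6.0).reshape(2, 3)   # C-ordered by default
S_f = np.asarray(S, order="F")     # convert at the call site...
assert S_f.flags['F_CONTIGUOUS']   # ...and check here, outside the JIT code
result = column_sums(S_f)
```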