GeoStat-Framework / pentapy

A Python toolbox for pentadiagonal linear systems
https://pentapy.readthedocs.io/
MIT License

[EXTENDED] PR 26 Parallelized multiple right-hand side support, fully typed `tools` #27

Open MothNik opened 3 months ago

MothNik commented 3 months ago

⚠️ This pull request can fully replace #26 which can basically be closed ⚠️

Update June 10, 2024

I temporarily converted this to a draft because the decisions for #28 heavily affect this PR and might require another change in the Cython-level API.

Changes

This pull request is an improved version of #26, so please refer to the basic changes in there. On top of these changes, this pull request

Since all of this is a breaking change and the Cython interface is not backwards-compatible, I suggest a major version jump from 1 to 2 (maybe as an $\alpha$-version of it?). Again, I tried to update the changelog, but you might still need to look over it.

Tests

The installation and parallelisation were tested on Windows 11 and Ubuntu (WSL). ✅ Tests now cover serial and parallel solves, but they have to run with

pytest --cov=pentapy .\tests --cov-report html -x 

now to prevent interfering with the multithreading. ✅

⚠️ Note that -x stops the run at the first failure, which is mandatory for this delicate topic where we deal with pointers ⚠️

Depending on the OS, the doctests of tests/util_funcs might fail because the array output includes the dtype on Windows while it does not on Linux. ❌

Given that the C implementation is already very fast in serial, the parallelization does not cause a quantum leap. With 8 threads on a relatively old laptop (there it is, the grain of salt 🧂), I observe a threefold speedup for huge systems (1,000 x 1,000 with 10,000 right-hand sides):

[image: speedup benchmark plot]

On massively parallel systems, the speedup should be larger. Importantly, the parallelization support does not negatively affect the serial solves when workers=1:

[image: ParallelizedSingleRun]
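To make the idea of distributing right-hand sides over workers concrete, here is a minimal sketch. It is not the Cython implementation of this PR: it assumes SciPy's `scipy.linalg.solve_banded` as a stand-in solver and Python threads as workers, and the helper name `solve_many_rhs` is hypothetical. It only illustrates the column-splitting logic and the `workers=1` serial path, and says nothing about performance.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from scipy.linalg import solve_banded


def solve_many_rhs(ab, rhs, workers=1):
    """Solve a pentadiagonal system for many right-hand sides (hypothetical helper).

    ab  : (5, n) banded storage as expected by solve_banded with (l, u) = (2, 2)
    rhs : (n, k) array holding one right-hand side per column
    """
    if workers == 1:
        # Serial path: a single call handles all columns at once.
        return solve_banded((2, 2), ab, rhs)

    # Split the columns into at most `workers` non-empty chunks.
    chunks = [c for c in np.array_split(rhs, workers, axis=1) if c.shape[1] > 0]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker factorizes and solves its own chunk independently.
        parts = list(pool.map(lambda c: solve_banded((2, 2), ab, c), chunks))
    return np.hstack(parts)


# Small, diagonally dominant pentadiagonal test matrix in banded storage.
n, k = 50, 12
ab = np.zeros((5, n))
ab[0, 2:] = 1.0    # second superdiagonal
ab[1, 1:] = -2.0   # first superdiagonal
ab[2, :] = 10.0    # main diagonal
ab[3, :-1] = -2.0  # first subdiagonal
ab[4, :-2] = 1.0   # second subdiagonal

rhs = np.random.default_rng(0).standard_normal((n, k))
x_serial = solve_many_rhs(ab, rhs, workers=1)
x_parallel = solve_many_rhs(ab, rhs, workers=4)
```

Note that in this sketch every chunk repeats the factorization, which is one reason why separating factorization from solving is attractive.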

MothNik commented 3 months ago

The last commit currently does not change the internal logic at all. However, it already lays the foundation for future validations, e.g., by means of the condition number of the matrix. Right now, pentapy neither assesses nor allows the user to assess the quality of a solve, and a result that is not all np.nan does not imply that the solve was meaningful. If the user wants to have the results validated, one could introduce validate to the Python high-level interface of solve. The C-level interface had to change in a breaking way anyway, so I tried to keep it flexible for all foreseeable future scenarios. Right now, the validation logic does not trigger anything, so this buys flexibility in the future at no cost today.

The idea was taken from LAPACK, whose routines return an info value together with the solution. It is 0 to indicate success while all other values encode certain warnings or even errors.
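For readers unfamiliar with the LAPACK convention, the following sketch shows it through SciPy's low-level wrapper `scipy.linalg.lapack.dgesv` (a dense solver, used here only because it is easy to call; it is not what pentapy uses). In LAPACK, a negative info flags an illegal argument and a positive info flags a routine-specific failure, which for `dgesv` means an exactly singular factor.

```python
import numpy as np
from scipy.linalg.lapack import dgesv

b = np.array([[1.0], [2.0]])

# Regular matrix: the solve succeeds and info is 0.
regular = np.array([[2.0, 1.0], [1.0, 3.0]])
_, _, x_ok, info_ok = dgesv(regular, b)

# Exactly singular matrix: no exception is raised by the low-level wrapper,
# the failure is only reported through a positive info value.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
_, _, x_bad, info_bad = dgesv(singular, b)

print("regular :", info_ok)
print("singular:", info_bad)
```

The caller decides what to do with a nonzero info, which is the kind of flexibility a future validate option could build on.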

MothNik commented 3 months ago

To be able to cope with #28 and #29 in the future without compromising anything on the solvers side, I rearranged the Cython core (not the interface) as follows:

This is probably a bit too abstract (I barely know how to phrase it), so please let me provide the following analogy.

Analogy Example

In principle, this is identical to how the banded Cholesky decomposition in SciPy works. For solving a system, there is scipy.linalg.solveh_banded. However, there is also scipy.linalg.cholesky_banded, which performs a banded Cholesky factorization, and scipy.linalg.cho_solve_banded, which takes the banded Cholesky factorization to solve one or more systems.

As of now, this PR basically implements the pentapy analogy of scipy.linalg.solveh_banded (factorize and solve in one go), but on the Cython side I arranged everything so that it is simple to also add the analogies of scipy.linalg.cholesky_banded (factorize only) and scipy.linalg.cho_solve_banded (solve a system using a given factorization). But this would probably go beyond the scope of this already big PR 🤯
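The SciPy side of this analogy can be sketched as follows. The example uses a small symmetric positive definite pentadiagonal matrix in SciPy's upper banded storage (the matrix values are arbitrary illustration data) and checks that the one-go solve and the factorize-then-solve route agree.

```python
import numpy as np
from scipy.linalg import cho_solve_banded, cholesky_banded, solveh_banded

# Symmetric, diagonally dominant pentadiagonal matrix in upper banded form:
# row 0 = second superdiagonal, row 1 = first superdiagonal, row 2 = diagonal.
n = 8
ab = np.zeros((3, n))
ab[0, 2:] = 1.0
ab[1, 1:] = -2.0
ab[2, :] = 10.0

rhs = np.random.default_rng(0).standard_normal((n, 4))

# Route 1: factorize and solve in one go.
x_one_go = solveh_banded(ab, rhs)

# Route 2: factorize once, then reuse the factorization for the solve.
chol = cholesky_banded(ab)                      # factorize only
x_reuse = cho_solve_banded((chol, False), rhs)  # solve with given factorization

# Dense reference matrix, used only to check the residual.
dense = (
    np.diag(ab[2])
    + np.diag(ab[1, 1:], 1) + np.diag(ab[1, 1:], -1)
    + np.diag(ab[0, 2:], 2) + np.diag(ab[0, 2:], -2)
)
residual = np.abs(dense @ x_reuse - rhs).max()
```

The second route pays off when the same matrix is solved against right-hand sides that arrive at different times, since the factorization is computed only once.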

MothNik commented 3 months ago

@MuellerSeb Sorry for all these iterations. It took me some time to get the right strategy for Cython, the GIL, memory allocation, the parallelization, and a testing framework (the tests all still pass ✅). But now I feel that this PR is ready for review and I would highly appreciate your feedback 😄 I really wanted to keep everything flexible for future adjustments. It might be true that pentapy is "only" there for solving pentadiagonal systems, so the number of features will never be as mind-blowing as SciPy's, but playing with the factorizations (like with the determinant) is a common step in mathematical problems and I really wanted to pave the way for that 🛣️