Closed by senchromatic 3 years ago
The performance benefit of using NumPy (especially for the large matrices we'll have once we are using real data) is that its array arithmetic runs in compiled C rather than in the Python interpreter: broadcastable operations implemented as ufuncs (such as scalar-matrix arithmetic) release the Python Global Interpreter Lock while they run, and matrix-matrix products dispatch to a BLAS backend, which is typically multithreaded. So a degree of parallelism comes for free with NumPy. Pure Python implementations like the two above do not release the GIL and, even if they did, they would still be doing every matrix operation as interpreted loops over scalars.
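To make the contrast concrete, here is a minimal sketch (a hypothetical example, not the implementations discussed above) of a pure-Python matrix product next to the NumPy equivalent:

```python
import numpy as np

def matmul_pure(a, b):
    # Pure-Python triple loop: holds the GIL the whole time and pays
    # interpreter overhead on every scalar multiply-add.
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for t in range(k):
                s += a[i][t] * b[t][j]
            out[i][j] = s
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]

# NumPy dispatches the same product to compiled code (often a
# multithreaded BLAS) and releases the GIL while it runs.
c_np = np.array(a) @ np.array(b)

assert np.allclose(c_np, matmul_pure(a, b))
```

On toy inputs like this the two are indistinguishable; the gap only shows up at the matrix sizes real data would produce.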
The performance boost will probably help when working with real data resulting in large matrices, especially if we can take advantage of parallelization by running on a VM with many cores.
Suggest closing. @senchromatic
Fine with closing as well. There's a chance it comes in handy but I expect us to be using datasets that are too large for it to be useful.
Thanks for taking a look. While we don't yet have a reason to prematurely optimize for performance, more importantly I'd rather avoid external dependencies like these unless they're very mature and efficient (like C++ Armadillo or R H2O). Since we're still using numpy and not re-inventing the wheel at that level, I think we should be okay with the existing unit tests.
Sounds good to me. We can always expand the unit tests to cover edge cases as we think of them, but that can be tracked in its own issue with PRs assigned against it.
Source #1: https://pypi.org/project/pyfinite/
Source #2: https://github.com/glassnotes/PyniteFields