Closed by senchromatic 3 years ago
The performance benefit of using NumPy (especially for the large matrices we'll have once we are using real data) is that its array arithmetic runs in compiled C rather than in the Python interpreter: broadcastable operations implemented as ufuncs (such as scalar-matrix arithmetic) release the Python Global Interpreter Lock while they run, and matrix-matrix products dispatch to a BLAS backend, which is typically multithreaded. So a degree of parallelism comes for free with NumPy. Pure Python implementations like the two above do not release the GIL and, even if they did, they would still be doing every matrix operation as interpreted loops over scalars.
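To make the contrast concrete, here is a minimal sketch (a hypothetical example, not the implementations discussed above) of a pure-Python matrix product next to the NumPy equivalent:

```python
import numpy as np

def matmul_pure(a, b):
    # Pure-Python triple loop: holds the GIL the whole time and pays
    # interpreter overhead on every scalar multiply-add.
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            s = 0.0
            for t in range(k):
                s += a[i][t] * b[t][j]
            out[i][j] = s
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]

# NumPy dispatches the same product to compiled code (often a
# multithreaded BLAS) and releases the GIL while it runs.
c_np = np.array(a) @ np.array(b)

assert np.allclose(c_np, matmul_pure(a, b))
```

On toy inputs like this the two are indistinguishable; the gap only shows up at the matrix sizes real data would produce.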
The performance boost will probably help when working with real data resulting in large matrices, especially if we can take advantage of parallelization by running on a VM with many cores.
Suggest closing. @senchromatic
Fine with closing as well. There's a chance it comes in handy but I expect us to be using datasets that are too large for it to be useful.
Thanks for taking a look. While we don't yet have a reason to prematurely optimize for performance, more importantly I'd rather avoid external dependencies like these unless they're very mature and efficient (like C++ Armadillo or R H2O). Since we're still using numpy and not re-inventing the wheel at that level, I think we should be okay with the existing unit tests.
Sounds good to me. We can always expand the unit tests to cover edge cases as we think of them, but that can be tracked in its own issue with PRs assigned against it.
Source #1: https://pypi.org/project/pyfinite/
Source #2: https://github.com/glassnotes/PyniteFields