MKLab-ITI / pygrank

Recommendation algorithms for large graphs
Apache License 2.0
29 stars 4 forks

Possible sparse_dot_mkl integration? #18

Closed deklanw closed 1 month ago

deklanw commented 1 year ago

I saw that you have your own sparse matrix library called matvec which parallelizes sparse-dense multiplication. There is an existing Python library called sparse_dot that does the same but works with scipy csr/csc matrices: https://github.com/flatironinstitute/sparse_dot.

I benchmarked the two with a matrix of size

<6357204x6357204 sparse matrix of type '<class 'numpy.float32'>'
    with 3614017927 stored elements in Compressed Sparse Column format>

On the 32-core/64-thread server CPU I'm testing on, the times for 10 matrix-vector multiplications on the right and on the left are:

| library | right-vec | left-vec |
|---|---|---|
| matvec | 25.15 | 19.47 |
| sparse_dot csc | 40.17 | 14.91 |
| sparse_dot csr | 10.38 | 28.53 |

The times look competitive. I'm not sure if matvec has some other advantages I'm not considering here, but the fact that sparse_dot works with the existing scipy types would be a huge benefit (for my use case, at least). sparse_dot does require installing the MKL library, and for giant matrices like the one above it requires the environment variable MKL_INTERFACE_LAYER=ILP64.
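For reference, the measurement procedure behind numbers like these can be sketched as below. This is a hypothetical stand-in, not the benchmark actually run above: it uses a small random scipy matrix instead of the 6.3M-node graph, and plain scipy multiplication instead of matvec or sparse_dot, just to show how right- and left-multiplication are timed per format.

```python
import time
import numpy as np
import scipy.sparse as sp

# Small random stand-in for the large graph matrix benchmarked above.
n = 10_000
A_csr = sp.random(n, n, density=1e-3, format="csr", dtype=np.float32)
A_csc = A_csr.tocsc()
x = np.ones(n, dtype=np.float32)

for name, A in [("csr", A_csr), ("csc", A_csc)]:
    t0 = time.perf_counter()
    for _ in range(10):
        _ = A @ x          # right multiplication: A x
    t1 = time.perf_counter()
    for _ in range(10):
        _ = x @ A          # left multiplication: x^T A
    t2 = time.perf_counter()
    print(f"{name}  right-vec {t1 - t0:.4f}s  left-vec {t2 - t1:.4f}s")
```

Swapping in `dot_product_mkl` from sparse_dot (or matvec's own multiplication) at the marked lines yields the comparison reported above.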

maniospas commented 1 year ago

Thanks a lot for bringing this library to our attention.

matvec was created because I failed to find this functionality, probably because I was searching for the wrong keywords. That said, if you send a wrapped scipy matrix to pygrank after enabling the matvec backend, the library should work just fine. (If an error occurs, please report the stack trace and the offending code; backends are interoperable for forward computations, so once you input data to the library they should yield the same outputs barring numerical precision differences.) Note, though, that matvec uses twice the memory because it internally copies the scipy data once.

The timing differences you report for simple multiplication stem mainly from matvec storing matrices in COO format, which leads to a different parallelization strategy. I think it could be even faster in these specific cases if it were made to handle csr or csc matrices, but since it's a side project I don't know if I will get to work on this. At any rate, the format with the fastest multiplication strategy differs depending on graph characteristics (especially the average and standard deviation of node degrees).
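To illustrate the formats being discussed: COO stores explicit (row, col, value) triplets, while CSR/CSC compress the row or column dimension, which is what enables per-row (or per-column) parallel strategies. A minimal scipy sketch (illustrative only, not matvec's internals) showing that the formats are interchangeable for multiplication:

```python
import numpy as np
import scipy.sparse as sp

# The same random sparse matrix in the three formats discussed above.
A = sp.random(1000, 1000, density=0.01, format="coo", dtype=np.float32)
x = np.ones(1000, dtype=np.float32)

y_coo = A @ x           # COO: iterates (row, col, value) triplets
y_csr = A.tocsr() @ x   # CSR: row-compressed, natural per-row parallelism
y_csc = A.tocsc() @ x   # CSC: column-compressed, better for left-multiplication

# All formats compute the same product up to floating-point rounding;
# only the traversal (and hence parallelization) strategy differs.
assert np.allclose(y_coo, y_csr) and np.allclose(y_coo, y_csc)
```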

matvec also has some memory-reuse and parallelized element-by-element matrix operation optimizations, so in theory it yields sub-linear running times with respect to the number of nodes too (instead of only the edges).

I am looking into creating a sparse_dot_mkl backend. This is trivial development-wise (it needs 5 lines of code), but I have little experience with setting up MKL, and everybody seems to use it out-of-the-box with conda, whereas pygrank's development and CI pipelines use pip. I will update this issue if there is progress on that front (releases are not published without passing both local and CI tests).

deklanw commented 1 year ago

Glad it's helpful.

What about checking for the presence of MKL when the numpy backend is used and using sparse_dot_mkl if it's present, otherwise warning that installing it could come with significant speed-ups?

maniospas commented 1 year ago

This creates a bunch of issues that are a nightmare to address from a library developer's standpoint:

- There is already a performance warning out-of-the-box.
- The default library implementation would not run the same way on each machine; this creates maintainability issues for dependent software projects.
- Given that numpy is the default backend, it would create an explicit (instead of optional) dependency on an infrequently-updated project.
- As I have found, integrating MKL is not easy in all environments, so I don't think this library should be responsible for promoting its use by default.
- `sparse_dot_mkl` and `mkl` may install successfully as packages yet the code can still fail to run due to bad linking.

After thinking about it, the inverse could be done: have the sparse_dot_mkl backend use numpy with a warning if linking fails. For the time being, I will move in this direction.
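The fallback pattern described here can be sketched as follows. This is a hypothetical illustration, not pygrank's actual backend code: attempt to use sparse_dot's `dot_product_mkl`, and degrade to plain scipy/numpy with a warning when the package is missing or MKL fails to link.

```python
import warnings
import numpy as np
import scipy.sparse as sp

try:
    # Importing can fail with ImportError (package absent) or OSError
    # (package installed but the MKL shared library fails to link).
    from sparse_dot_mkl import dot_product_mkl

    def matvec(A, x):
        return dot_product_mkl(A, x)
except (ImportError, OSError):
    warnings.warn("sparse_dot_mkl unavailable; falling back to scipy/numpy")

    def matvec(A, x):
        return A @ x

# Either implementation is used transparently by downstream code.
A = sp.random(1000, 1000, density=0.01, format="csr", dtype=np.float32)
x = np.ones(1000, dtype=np.float32)
y = matvec(A, x)
```

The benefit of this direction over MKL auto-detection in the numpy backend is that the dependency stays opt-in: only users who explicitly select the sparse_dot_mkl backend ever touch MKL.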

maniospas commented 1 year ago

Released version 0.2.10 with a sparse_dot_mkl backend that falls back to numpy when MKL is unavailable. It passes local tests but, if possible, can you also confirm @deklanw that it works on your end?

maniospas commented 1 month ago

Tested on Intel's Python distribution. The "sparse_dot_mkl" backend works correctly but is similar in speed to the "numpy" backend when numpy and scipy are compiled against MKL.