lessthanoptimal / ejml

A fast and easy to use linear algebra library written in Java for dense, sparse, real, and complex matrices.
https://ejml.org
565 stars 117 forks source link

Support for IMatrix* and Graph Theory Operations #68

Open breandan opened 4 years ago

breandan commented 4 years ago

I see that ejml has support for logical matrices. Would you be open to including integer matrices, or is this currently out of scope? These have a number of applications in graph theory and discrete math.

lessthanoptimal commented 4 years ago

The support is fairly minimal, mostly as a way to mask elements in a matrix. However is there much overlap between linear algebra and the techniques you mentioned?

breandan commented 4 years ago

There are a number of interesting connections between linear algebra and graph theory, e.g. you can take eigenvalues of graph laplacians or invert the adjacency matrix. For a short survey of some applications, see Johnson or maybe Spielman if you're looking for something more comprehensive.

edit: You could probably use FMatrix* or DMatrix*, but I'm not sure if certain operations might be numerically less stable than having a native integer matrix type. Otherwise I guess it shouldn't be a major issue.

breandan commented 4 years ago

You might also check out the GraphBLAS library and the Graph Algorithms in the Language of Linear Algebra book. There is a lot of interest in sparse matrix representations for graphs, it would be great to have a native linear algebra library with support for boolean and integer matrices on the JVM.

lessthanoptimal commented 4 years ago

Do you @szarnyasg ? He's also interested in linear algebra and graphs. I'm not opposed to adding an integer matrix type, but all the algorithms to make it useful. Right now I just don't have a personal need.

szarnyasg commented 4 years ago

Hi,

it would be great to have a native linear algebra library with support for boolean and integer matrices on the JVM.

I agree but its a big undertaking. Adding an integer matrix type sounds fairly simple but then you also need to support add various for semirings and, if you want to apply the tool to solve big graph problems, it needs to run in parallel as well.

Tim Davis gave a great talk yesterday at the SIAM Minisymposium on Linear Algebraic Tools for Graph Computation, going into detail about how he parallelizes matrix multiplication algorithms. It is fairly obvious that these techniques (many of them because you different ones to cover the various setups regarding graph size, degree distribution, available threads and memory, etc.) take years to implement.

Gabor

breandan commented 4 years ago

Are you aware of any JVM bindings for GraphBLAS? Maybe it would be possible to benefit from Tim’s work without reimplementing the wheel here. Is it worth starting a new project? Seems like there are already a bunch of graph libraries which could potentially benefit from sparse matrix representations.

szarnyasg commented 4 years ago

@breandan yes, there's someone working on this, see: https://github.com/fabianmurariu/graphblas-java-native So far, it only supports Linux but there's no fundamental limitation to extend it for other OSs.

breandan commented 4 years ago

In case you do end up adding this feature, I've found that using FMatrix* or DMatrix* is numerically unstable under matrix powering. There are also number of performance benefits for using native ints and booleans, of which I'm sure you're aware. @szarnyasg What kind of semiring support are you referring to? It looks like EJML already supports Complex_F64 and I assume a similar structure could be used for other algebras.

szarnyasg commented 4 years ago

@breandan min-plus (where min is the addition operator, and plus is the multiplication operator for shortest paths, max-plus for maximal matching. There are many other practical ones (min-max), see the last page of http://www.mit.edu/~kepner/GraphBLAS/GraphBLAS-Math-release.pdf.

szarnyasg commented 4 years ago

@breandan PR #75 was merged and there is a new issue #82 for using semirings. What should we do with this issue?

lessthanoptimal commented 4 years ago

Does adding an IMatrix type solve any problems now that there are semirings?

breandan commented 4 years ago

I still think it would be productive for correctness and efficiency to have proper boolean/int matrices, but agree with @FlorentinD it would be better to wait until Valhalla/JEP 218 lands to implement these cases. Until then, users should be able to handle the conversion between double and int/boolean as needed. If that makes sense, feel free to close this issue. Thanks!

lessthanoptimal commented 4 years ago

JEP 218 may or may not be usable right away. Basically EJML is used on a lot of different versions of Java and needs to maintain Android and Kotlin compatibility. As a result the generated byte code must be 1.8 compatible. Thanks to some hacks Java 14 syntax can be used but none of the new API's since 8 can be used. If JEP 218 is just syntax sugar like switch expressions then it could be used right away.

FlorentinD commented 4 years ago

I see the main problem in having operators which support combination of different typed matrices, e.g. mxn(BMatrix, DMatrix). The same applies for my newly introduced DBinaryOperator, where we would then need f.i. DBBinaryOperator (Double, Boolean). But I guess this is all do-able via code generation. Project Valhalla would just make it easier to implement as we could use generics then.

I won't be able to look into this while I am working on my thesis. Maybe I could pick this up in February next year.

lessthanoptimal commented 4 years ago

If we just said screw it to efficiency and got something that worked using the most generic code possible would it be easy then to get some functionality? Having something that works even if it's slow is better than nothing.

FlorentinD commented 4 years ago

Regarding functionality I think using doubles is sufficient for now. The main benefit I see is efficiency.

What do you think @breandan @lessthanoptimal ?

lessthanoptimal commented 4 years ago

I was also thinking of using lambdas or anything else very generic. I've not looked close enough into these problems to know if that would help. Somewhat related, I've added more support for lambdas in dense operations.