typelevel / spire

Powerful new number types and numeric abstractions for Scala.
http://typelevel.org/spire/
MIT License

Support for linear algebra in Spire (matrices, vectors) #675

Open denisrosset opened 6 years ago

denisrosset commented 6 years ago

This issue collects PRs or "new feature" issues concerning the support of linear algebra in Spire. At this time, it is not clear whether this should be done in Spire itself or in a companion library. Points to consider: have first support of complex vector spaces in spire-core; decide which level of type-safety is proposed ...

#166 Eigenvalue computation PR (2013)

#171 Gaussian elimination PR (2013)

#628 Proposal by @zainab-ali to support vectors (2017)

Wishes

#220 DFT and FFT

#375 Vector cross product

lJoublanc commented 6 years ago

Over xmas I wrote a working implementation of VectorSpace using netlib BLAS (via netlib-java, as per #184). It only supports L1 routines (essentially binary ops between 1-D vectors and/or scalars; no matrix algebra). I want to share some of my thoughts here. The code is available here:

https://github.com/lJoublanc/spire-matrix

It's only a POC and is not used in production - I uploaded it so I can reference code in some of the discussion points below.

A lot of the design decisions I suggest below are based on those made in BLAS. It is well documented and you should read the technical standards for more detail.

Algebra

ASTs

BLAS has a large number of routines, many of which serve the same purpose. For example, you can use both L1 and L3 routines to do matrix addition, but one may be more efficient than the other. I think a major feature of a linear algebra package should be the ability to transform Scala expressions into the most appropriate combinations of BLAS calls. Not many libraries do this; for example, I recall looking at the Breeze code and finding that it only uses a couple of BLAS routines.

To give an example : in my POC I use a typical FP approach - structural recursion - to pattern match expressions and convert these into BLAS routines. For example:

  /** Matrix addition */
  override def plus(x: Matrix, y: Matrix): Matrix = (x, y) match {
    case (SCAL(α, x), y) => AXPY(α, x, y)          // αX + Y compiles to a single axpy
    case (x, SCAL(α, y)) => AXPY(α, y, x)          // X + αY compiles to the same axpy
    case _               => AXPY(scalar.one, x, y) // plain X + Y, expressed as 1·X + Y
  }

In this trivial implementation we use the AST to optimize the expression αX + Y into a single linear operator axpy rather than two operations (scalar multiplication followed by vector addition). It would also compile X + αY to the same operation, something a naive translator wouldn't do. I believe that having a good interpreter that can perform a full set of such optimizations at compile time should be a major goal of a linear algebra BLAS implementation. This in turn depends on having a rich type system that gives the interpreter specific ASTs to work with.
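For context, a minimal sketch of what such AST nodes could look like (hypothetical names that mirror the BLAS routines they compile to, not the actual spire-matrix types):

sealed trait Matrix
case class SCAL(alpha: Double, x: Matrix) extends Matrix            // αX
case class AXPY(alpha: Double, x: Matrix, y: Matrix) extends Matrix // αX + Y
case class Dense(values: Array[Double], rows: Int, cols: Int) extends Matrix // leaf node holding data

The pattern match in plus then rewrites trees of these nodes before any BLAS routine is actually invoked.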

In addition to the AST, we can use the BLAS signatures to give us hints as to which abstractions are useful as types. The cryptic BLAS procedure names actually encode the types of the inputs/outputs; for example, in DGEMM the D stands for double precision, GE for a general (dense) matrix, and MM for matrix-matrix multiply.

Representing such rich types would have added advantages beyond BLAS routine selection (e.g. there are specialised upper-triangular operations). For example, the matrix inverse does not exist for non-square matrices, something that could be checked at compile time based on the type. Another example would be the transpose of a triangular matrix. There are all manner of properties that can be checked at compile time or used for optimization; someone with a stronger mathematical background than me would be able to elaborate further!
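As a hedged sketch of that kind of compile-time check (assumed types, not existing Spire API), squareness could be carried as a phantom type so that inverse is simply not available for non-square matrices:

sealed trait Shape
trait Square extends Shape
trait General extends Shape

// A matrix that carries its shape as a phantom type (illustrative only).
case class Mat[S <: Shape](rows: Int, cols: Int, values: Array[Double])

// inverse only accepts square matrices; passing a Mat[General] is a compile error.
def inverse(m: Mat[Square]): Mat[Square] = ???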

A word about the implementation (structural recursion): it allocates intermediate objects and should be avoided in a production implementation. I used it for its ease of implementation in this POC. But I'm looking at a talk by Dave Gurnell on Church encoding, and also the multi-stage-programming approach that Michael Pilquist took in fs2, as alternatives.

Category of Vector Spaces

Since the current spire typeclasses don't include matrix multiplication, this needs to be added.

I also noted that matrix multiplication can be seen as a 'vector space of vector spaces'. This means you should be able to 'lift' a Vector[Vector[Double]] into a Matrix[Double] and vice versa.

What is really interesting about this is that you can use a BLAS L1 (i.e. 1-D vector) op and an L3 (i.e. 2-D matrix) op on the same data type, simply by specifying the column length (or 'stride' in BLAS terms). That is to say, both take the same contiguous array as input. So this suggests that, with the right data type, the above-mentioned lift operation should be O(1), requiring no copies, provided the underlying data type is a contiguous C-like array.
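To make the 'same buffer, two views' point concrete, here is a small illustration (plain Scala; column-major layout assumed):

// One contiguous buffer of 6 doubles ...
val data = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

// ... viewed as a 6-vector: (1, 2, 3, 4, 5, 6)
// ... or as a 2x3 column-major matrix with stride (leading dimension) 2:
//   | 1  3  5 |
//   | 2  4  6 |
def at(i: Int, j: Int, stride: Int): Double = data(j * stride + i)
assert(at(1, 2, 2) == 6.0) // row 1, column 2 of the matrix view; no copy was made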

BLAS aside, this would also allow the definition of a 'default' typeclass for matrix multiplication using this 'category enrichment'. e.g.

def mmult[T : Field](a: Matrix[T], b: Matrix[T]) = {
  // `unlift` and `vecmult` are the hypothetical lift/inner-product operations discussed above
  val as: Vector[Vector[T]] = a.unlift            // rows of a
  val bs: Vector[Vector[T]] = b.transpose.unlift  // columns of b
  for {
    row <- as
    col <- bs
  } yield row vecmult col                         // entry (i, j) = row_i . col_j
}

It would be great if 2-D matrices and 'vectors of vectors' could be transparently interchanged, and if the actual type hierarchy would covary, like Matrix[T] extends Vector[Vector[T]]. This is especially important for me, to be able to process matrix data which is streaming in one dimension, e.g. Stream[IO, Stream[Pure, Double]].

I was not able to implement this. In fact my implementation does not properly distinguish between vectors and matrices; the distinction matters to BLAS because the data order (column-major vs row-major) is important.

Effects

Currently there is no canonical tensor data type in Spire. For my POC I implemented one, but this in fact led me to see the advantages of keeping the effect 'free', rather than creating a specific tensor type as suggested in #628. I can think of three interesting use cases, in addition to the canonical Array:

scala.Array

The canonical heap-allocated representation in Scala, with a nice collection-library API. In addition, this is the type that netlib-java (#184) chose as its tensor type. However, note that this requires heap allocation, and hence a memcopy, when calling BLAS.

java.nio.ByteBuffer

I think a better alternative would be java.nio.ByteBuffer and its subclasses (DoubleBuffer etc.). This would allow memcopy-free operation. Note that being able to use this on top of BLAS would require upstream changes to netlib-java, as it currently only supports Array as the tensor type; changing this would be a huge job. However, it would open the door to large matrices that don't fit in memory, via mmapped files using java.nio.MappedByteBuffer. If using Array, we need to copy everything onto the heap and then back out of the JVM again. While the runtime may be able to optimise this, it still creates a lot of unnecessary memory use and GC pressure in this particular use case.
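A sketch of the mmapped-file use case (the file name and endianness are assumptions, and a Buffer-accepting BLAS backend does not exist in netlib-java today):

import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}
import java.nio.{ByteOrder, DoubleBuffer}

// Memory-map a file of packed doubles; a Buffer-aware BLAS backend could then
// operate on it directly, without copying the data onto the JVM heap.
val channel = FileChannel.open(Paths.get("matrix.bin"), StandardOpenOption.READ)
val mapped  = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size)
val values: DoubleBuffer = mapped.order(ByteOrder.LITTLE_ENDIAN).asDoubleBuffer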

cats.data.ReaderT

One of the unfortunate things about BLAS is that most of its procedures mutate data in place; specifically, one of the inputs tends to be used as an output. E.g. the operation αX + Y mentioned above writes the result into Y. In my implementation I worked around this by implementing a concrete data type with lazy accessors in BlasOps.scala, and copying the output variable to a separate instance, i.e. accessing Matrix.values triggers compilation and makes defensive copies behind the scenes; this is transparent to the user, but introduces side effects. Using ReaderT[Array] or a similar transformer would allow keeping the algebra pure, and specifying the output variable only when evaluating the ReaderT.
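A minimal sketch of that idea (the daxpy call is left as a comment; the surrounding names are assumptions, not spire-matrix API):

import cats.Id
import cats.data.ReaderT

// The output buffer is supplied only when the computation is run,
// so the algebra itself stays pure.
type WithOutput[A] = ReaderT[Id, Array[Double], A]

def axpy(alpha: Double, x: Array[Double]): WithOutput[Array[Double]] =
  ReaderT { out =>
    // blas.daxpy(x.length, alpha, x, 1, out, 1) // netlib-java style call, mutates `out`
    out
  }

// Only here do we commit to an output buffer:
val result = axpy(2.0, Array(1.0, 2.0, 3.0)).run(new Array[Double](3))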

fs2.Stream

One of the initial goals of my implementation was support for streaming data. This fitted in nicely with Spire's use of typeclasses. fs2 is a streaming library with a 'chunked' structure, which would allow vector operations on stream chunks. The problem with this, and other streaming libraries, is that while they are monads, they don't implement other typeclasses like Foldable or Traverse. A practical way of looking at this is that fs2.Stream operations never leave the effect, so e.g. (s : fs2.Stream[F, A]).fold(???) : Stream[F, A], whereas (s : scala.collection.immutable.Stream[A]).fold(???) : A (note the return type). Implementing a standard Vector data type would need a new typeclass that returns effectful values only (or be parametrised on F[_]).
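To illustrate the 'never leaves the effect' point (fs2 3.x assumed):

import fs2.{Pure, Stream => FStream}

// fs2's fold stays inside the Stream ...
val fs: FStream[Pure, Double] = FStream(1.0, 2.0, 3.0)
val stillAStream: FStream[Pure, Double] = fs.fold(0.0)(_ + _)
val asValue: Double = stillAStream.toList.head // 6.0, only available once the stream is run

// ... whereas a collection fold returns a plain value directly.
val cs = scala.collection.immutable.Stream(1.0, 2.0, 3.0)
val direct: Double = cs.fold(0.0)(_ + _) // 6.0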

Data Types

There are reasons against a concrete data type in the Effects section above (or at least reasons to parametrize the values over a functor), but there are also reasons in its favour.

As mentioned above, BLAS's procedures take a number of argument types: 1-D, 2-D, dense, sparse, square, triangular etc. matrices. These are good both because they allow checking the AST at compile time and because they allow selecting optimized BLAS routines. We should try to include as many of these as possible in a linear algebra implementation.

Issue #628 discusses creating a vector type, parametrized by size. In the POC I chose to do this using ValueOf, to avoid the shapeless dependency, but also because it allows using tensors of unknown length. These occur in the examples I gave above (reading from a file, or streaming data). In that case, the size type parameter can be passed as Nothing or Int, essentially matching the dimensions automatically and 'turning off' compile-time size checks. One key point here is that, for a linear algebra implementation only, I think you don't need to be able to count the dimensions; it is enough to check equality. E.g. vectors being added must have the same size, and multiplying two matrices m x n and n x m is allowed, but in neither case do we need to know e.g. whether m >= n. Conversely, in practice we want to be able to manipulate matrices, i.e. operations like take, drop, reduce etc. Such operations do require being able to count (maybe with the exception of take[N]).
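A minimal sketch of the ValueOf approach (Scala 2.13 literal types; Vec and its operations are illustrative names, not the POC's actual types):

// The length is carried as a literal type parameter and recovered with ValueOf.
case class Vec[N <: Int, A](values: Vector[A])

def zeros[N <: Int](implicit n: ValueOf[N]): Vec[N, Double] =
  Vec(Vector.fill(n.value)(0.0))

// Addition only type-checks when both vectors share the same length N;
// we never need to count, only to require the two Ns to be equal.
def add[N <: Int](x: Vec[N, Double], y: Vec[N, Double]): Vec[N, Double] =
  Vec(x.values.zip(y.values).map { case (a, b) => a + b })

val ok = add(zeros[3], zeros[3]) // compiles
// add(zeros[3], zeros[4])       // rejected at compile time: the lengths differ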

I also have an alternate implementation that uses tagged types; the idea behind this was to avoid allocating new objects while still tracking vector sizes. The problem with this, however, was that a user could 'escape' from the ADT by calling the tensor methods. For example, you could do (a : Array[Double] with Cols[3]).take(2) and you would get back an Array[Double] rather than an Array[Double] with Cols[2], inadvertently losing the type safety.

Miscellaneous

denisrosset commented 6 years ago

Wow, that's really neat! I must say that we are not there yet in Spire.

For example, our inner product spaces / normed vector spaces are implicitly real, and the latter should be implicitly closed under square roots.

So before moving towards linear algebra, we need to clean those up. We are keeping modules and vector spaces as they enable nice syntax (:* and *:), but the more structured types are going to be legacy until we are happy with the design.

Given that you explored the problem space so well (I'm guilty of a happily-mutating-non-typesafe library myself https://github.com/denisrosset/scalin/ ), are there bits and pieces that should be reused across linear algebra libraries?

Spire is by design boring and uncontroversial: that would probably mean tensor/matrix/vector types (typeclasses?) that would enable data interchange between libraries.

Having a Show implementation is cool as well. As of now, we do not have a spire-cats module (Show is not in cats-kernel), but I'd gladly merge a PR that would introduce such a module.

Concerning testing doubles: the Order thus defined will not obey the Order laws, so until we have an alleycats equivalent, I'm reluctant to add it as it is.

denisrosset commented 6 years ago

Note: please send a PR to add your matrix library to the FRIENDS.md list; you can mention it is a prototype.

lJoublanc commented 6 years ago

Having a Show implementation is cool as well. As of now, we do not have a spire-cats module (Show is not in cats-kernel), but I'd gladly merge a PR that would introduce such a module.

Another thing along these lines I didn't explore was having a typeclass for richer output, e.g. for interop with notebooks, producing HTML (MathML?) or LaTeX instead of plain text. This I will do for sure at some point, to get nice matrix output in a notebook :smile:. If you're considering adding a typeclass for display, you may want to think of something like that (or just have multiple Show instances).

Concerning testing doubles: the Order thus defined will not obey the Order laws, so until we have an alleycats equivalent, I'm reluctant to add it as it is.

That should be straightforward. The whole idea behind this is that the IEEE floating point spec is ordered: it allows casts into integrals, which are ordered (not quite sure how this works, but it does! see the ULP post). I'm more concerned that there is actually a numerical error after which equality fails: the tests pass because they only ever do up to 2 consecutive operations, but with any more than this (i.e. real code) the error accumulates and equality fails. I'm pretty sure this can be analysed, and therefore equality can be made 'aware' of it, but it requires some work.
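For reference, a sketch of the 'casts into integrals' trick (valid for finite doubles of the same sign; the names are illustrative, not Spire API):

// The bit patterns of finite, same-sign IEEE-754 doubles are ordered the same
// way as the values themselves, so a tolerance can be expressed in ULPs.
def ulpDistance(a: Double, b: Double): Long = {
  val x = java.lang.Double.doubleToLongBits(a)
  val y = java.lang.Double.doubleToLongBits(b)
  math.abs(x - y)
}

def approxEq(a: Double, b: Double, maxUlps: Long = 4): Boolean =
  ulpDistance(a, b) <= maxUlps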

denisrosset commented 6 years ago

One of the order laws is transitivity: if we have x=y and y=z, then x=z.

This breaks down if we compare numbers up to a certain precision: we could have abs(x-y)<eps and abs(y-z)<eps, while abs(x-z)>eps.
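A concrete instance (eps picked arbitrarily for illustration):

val eps = 1e-9
val (x, y, z) = (0.0, 0.6e-9, 1.2e-9)
math.abs(x - y) < eps // true
math.abs(y - z) < eps // true
math.abs(x - z) < eps // false: approximate equality is not transitive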