uncomplicate / neanderthal

Fast Clojure Matrix Library
http://neanderthal.uncomplicate.org
Eclipse Public License 1.0
1.06k stars 56 forks

How to work with block matrices? #51

Closed kxygk closed 5 years ago

kxygk commented 5 years ago

I'm currently trying to brush up on my linear algebra a bit on the side by going through a textbook. I had a pretty good go at implementing everything in straight ELisp, but as the algorithms get more complicated I've been having more and more problems as I hit all sorts of weak spots in my naive matrix data structures.

So I'm trying to switch over to Neanderthal to make my life a bit easier and so I can focus more on the core algorithms. One core issue I've faced in my ELisp implementation is that a lot of algorithms rely on recursive block matrix operations - but I'm not seeing any facilities for working with block matrices in Neanderthal (probably due to my not understanding BLAS/Clojure sufficiently!)

I'm not really looking to do too much here... just ways to break matrices into blocks and stitch matrices together from blocks. Maybe this is already possible with the existing API? (the first part I can probably cobble together with submatrix) I just can't piece it all together into a coherent workflow at the moment. Any pointers would be appreciated :)

kxygk commented 5 years ago

I've looked into this more and thought about it more... More specifically, is there something like MATLAB's horzcat and vertcat? (or core.matrix's join-columns/rows) I think with those you don't really need block matrices per se

blueberry commented 5 years ago

@geokon-gh Sorry for the late answer. Neanderthal supports fluokitten, so you're looking at fluokitten.core/op which "concatenates" two matrices (but only when it's physically possible). Of course, it may be better performance-wise to explicitly create the resulting matrix and then copy! into its submatrices if you do such things in a loop.
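As an illustration, here is a hedged sketch of that preallocate-and-copy! approach for horizontal concatenation. The helper name horzcat and the exact requires are my assumptions, not part of Neanderthal's API; it assumes a and b have the same number of rows.

```clojure
(require '[uncomplicate.neanderthal.core :refer [mrows ncols copy! submatrix]]
         '[uncomplicate.neanderthal.native :refer [dge]])

(defn horzcat
  "Concatenate a and b side by side into a freshly allocated matrix."
  [a b]
  (let [m (mrows a)
        result (dge m (+ (ncols a) (ncols b)))]
    ;; copy a into the left block and b into the right block
    (copy! a (submatrix result 0 0 m (ncols a)))
    (copy! b (submatrix result 0 (ncols a) m (ncols b)))
    result))
```

A vertcat would be the same shape, splitting the submatrix views along rows instead of columns.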

kxygk commented 5 years ago

Thanks so much @blueberry, and I really appreciate the work you've put into this and the accompanying documentation and tutorials. I'm not really familiar with category theory, so it didn't occur to me it'd be related (and even after reading the API I can't see the immediate connection :) )

My current solution is here: https://geokon-gh.github.io/linearsystems-part2/ (it's very much WIP) I ended up using core.matrix, unfortunately. The code ended up pretty clear and concise, but not tail recursive and probably not performant at all. I like your suggestion of using copy! and I will take another look at the problem and see if I can re-imagine the solution. At first glance I think it should be very doable, especially since I can recursively call submatrix and not incur any extra copying.

I get why, given memory layout issues, quickly appending both rows and columns can be impossible. So I guess that's why the BLAS interface provides no functionality for that, though it does end up making matrix manipulation more awkward.

If I end up rewriting things with neanderthal I will drop a note here

Thanks again for the pointers. It's been very useful

kxygk commented 5 years ago

Hey @blueberry I'm having a lot of issues translating high-level concepts to neanderthal - what's the appropriate forum to ask questions? (maybe Issues is the right place) I also feel like it's in large part an issue of working with BLAS (and in effect not related to your library), and I was wondering if you had any resources to recommend for learning to work with BLAS. I'm comfortable with the math, but the leap from math to BLAS code is slowing me down.

Simple things like building an elementary reflector with BLAS are leaving me confused as to the proper way to do it. I tried to translate this little part to neanderthal.

And just making a function that does (I - 2uuᵀ/uᵀu)x isn't very straightforward. For instance, I tried to write the identity matrix minus some random matrix and I ended up with this:

(axpy! -1 (dge (dgd 3 (repeat 3 1))) (dge 3 3 [1 2 3 4 5 6 7 8 9]))

But I'm not sure if this is kosher. There is no identity matrix constructor and I need to wrap the diagonal matrix in a general dense matrix - so I'm left feeling I'm doing things wrong :)

blueberry commented 5 years ago

It's not. If you just indiscriminately create objects and then transfer them into other objects, you're going to waste lots of resources, and leak memory if you go to the GPU.

In that particular code, you can do

(def I (entry! (dgd 3) 1))
(def a (dge 3 3 (range 9)))
(axpy! -1 (dia I) (dia a))

a now contains A - I. Or, instead of dgd, you could do:

(def ones (entry! (dv 3) 1))
(axpy! -1 ones (dia a))

kxygk commented 5 years ago

Sorry for the slow reply @blueberry . Thank you for your guidance here :)

I'm a little unclear on when it's appropriate or not to transfer between objects. As I was trying to get the difference of the two matrices, I was looking to do something more like

(axpy! -1 (dge I) a)

Which works, but I'm transferring the diagonal identity matrix into a full matrix. And it's only diagonal for the convenience of the constructor. Would you do it another way?

And are you not using any convenience wrapper to make things like identity matrices? Or is this just not really possible because of the details that come up in a case by case basis?

blueberry commented 5 years ago

just keep the diagonal vector of ones and do:

(axpy! -1 ones (dia a))

kxygk commented 5 years ago

Maybe I'm misunderstanding. This is giving me just the diagonal elements, while I'm trying to construct the full matrix I - A

blueberry commented 5 years ago

Since axpy! is destructive, mutating the diagonal vector will also have an effect on the encompassing matrix. See the elements of a after this operation.
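For example, a minimal self-contained sketch of that in-place update (the requires are my assumption about the usual namespaces):

```clojure
(require '[uncomplicate.neanderthal.core :refer [axpy! dia entry!]]
         '[uncomplicate.neanderthal.native :refer [dge dv]])

(let [a (dge 3 3 (range 9))
      ones (entry! (dv 3) 1)]
  ;; dia returns a live view of a's diagonal, so mutating it
  ;; through axpy! changes the matrix itself
  (axpy! -1 ones (dia a))
  a) ; a now holds the original matrix with 1 subtracted from each diagonal entry
```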

kxygk commented 5 years ago

After that insight I think I'm seeing it better now. The functions are either presenting new views on underlying data or mutating the data. I managed to rewrite a small section using neanderthal now and it's been challenging/interesting to try to massage the problem into the MKL framework. I have a working solution now - though I may be doing something a bit ugly here:

(defn outer-product
  "Returns the outer product of a vector"
  [input-vector]
  (let [dimension (dim input-vector)]
    (mm (view-ge input-vector dimension 1) (view-ge input-vector 1 dimension))))

(defn elementary-reflector
  "Build a matrix that will reflect vectors across the hyperplane orthogonal to REFLECTION-AXIS"
  [reflection-axis]
  (let [dimension (dim reflection-axis)
        identity-matrix (entry! (dgd dimension) 1)
        outer-product-matrix (outer-product reflection-axis)]
    (axpy! 
     (dia identity-matrix)
     (dia (scal! (/ -2 (dot reflection-axis reflection-axis))
                 outer-product-matrix)))
    outer-product-matrix))

1 - The view-ge in the outer-product looks a bit suspect to me. The docs are a bit vague, as I don't know what "Attach a GE matrix to the raw data of a" really means. For instance, both (view-ge (dv (range 3)) 1 3) and (view-ge (dv (range 3)) 3 1) return a matrix with layout: column, but the underlying data is the same, so it seems to me the layout should be different (though in this 1D case it makes no difference, I guess)

2 - I had to write the axpy! a bit backwards (scaling the second term) b/c I couldn't put the diagonal identity matrix as the second mutating matrix argument. I couldn't find another way to do this - it might be suboptimal.
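For what it's worth, one idea I had for sidestepping both points is a rank-1 update. If I'm reading the API right, rk! does a GER-style update a := alpha * x * y' + a, so the reflector could be built in place without scaling the outer product separately. A sketch, under my assumptions about rk!'s signature:

```clojure
(require '[uncomplicate.neanderthal.core :refer [rk! dot dim entry!]]
         '[uncomplicate.neanderthal.native :refer [dge]])

(defn elementary-reflector-rk
  "Build I - 2uu'/(u'u) via a rank-1 update on a dense identity matrix."
  [u]
  (let [n (dim u)
        i (dge n n)]
    ;; fill in a dense identity matrix
    (doseq [k (range n)]
      (entry! i k k 1.0))
    ;; in-place rank-1 update: i := (-2/u'u) * u * u' + i
    (rk! (/ -2.0 (dot u u)) u u i)))
```

But I haven't tested this, so it may be off.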

Thanks for all your help and patience Dragan