gonum / matrix

Matrix packages for the Go language [DEPRECATED]
446 stars 51 forks source link

mat64.Dense doesn't implement Transposer #37

Closed jonlawlor closed 9 years ago

jonlawlor commented 9 years ago

Looks like an oversight - it does implement TransposeCopier.

kortschak commented 9 years ago

In place transposer without allocation is difficult.

jonlawlor commented 9 years ago

Could it be done with a "transposed" field, that is then used by all of the various Dense methods? Please pardon my ignorance, I am only beginning to understand the matrix package.

btracey commented 9 years ago

It sounds great, and for the user probably it is, but the problem is that it doubles the coding effort under the hood. Every algorithm for Dense needs two implementations; one for transposed and one for untransposed, and 4 implementations if two matrices are involved. It might be nice to have a method for transpose-multiply (which is very common), but in general there is a lot of extra code that comes along with supporting Transpose. For similar reasons, we removed column-major support from BLAS.

UserAB1236872 commented 9 years ago

We could do some sort of lazy, cached evaluation of a transpose field

Add a field like: type Dense struct { t *DenseTranspose mat RawMatrix }

func (mat *Dense) T() *DenseTranspose {
    return mat.t
}

Where DenseTransponse is

type DenseTransponse struct {
    owner *Dense
    t         *Dense
}

DenseTranspose implements everything in Dense via a wrapper (or at least the read-only methods), if dt.t is nil, the owner's transpose is calculated and stored in t, followed by a call. E.G.

func (dt *DenseTranspose) At(r, c int) float64 {
    if dt.t == nil {
        dt.owner.TCopy(dt.t)
    }
    return dt.t.At(r,c)
}

The only issue is remembering to invalidate the transpose when something changes, which is a simple fix (albeit one that needs a lot of tests to verify correctness). You can do the same trick with the matrix inverse with an I field.

On Fri, Nov 14, 2014 at 9:26 PM, Brendan Tracey notifications@github.com wrote:

It sounds great, and for the user probably it is, but the problem is that it doubles the coding effort under the hood. Every algorithm for Dense needs two implementations; one for transposed and one for untransposed, and 4 implementations if two matrices are involved. It might be nice to have a method for transpose-multiply (which is very common), but in general there is a lot of extra code that comes along with supporting Transpose. For similar reasons, we removed column-major support from BLAS.

— Reply to this email directly or view it on GitHub https://github.com/gonum/matrix/issues/37#issuecomment-63161409.

kortschak commented 9 years ago

I really don't want to add complexity unless there is an overwhelming performance benefit.

jonlawlor commented 9 years ago

Accidentally closed this. Here's a set of papers that discuss the issue. It does look pretty hard.

All of this starts off with a paper on FFTW3, which performs fast fourier transforms. The project itself is on github. Frigo, Matteo, and Steven G. Johnson. "The design and implementation of FFTW3." Proceedings of the IEEE 93.2 (2005): 216-231.

It mentions that the algorithms depend on the input matrix.

if it is square: Frigo, Matteo, et al. "Cache-oblivious algorithms." Foundations of Computer Science, 1999. 40th Annual Symposium on. IEEE, 1999.

if there is a large common factor in the dimensions, or a small difference between the dimensions: Dow, Murray. "Transposing a matrix on a vector computer." Parallel computing 21.12 (1995): 1997-2005.

Otherwise Cate, Esko G., and David W. Twigg. "Algorithm 513: Analysis of in-situ transposition [f1]." ACM Transactions on Mathematical Software (TOMS) 3.1 (1977): 104-110.

jonlawlor commented 9 years ago

As a temporary measure, we could also just use Dgemm to multiply by the identity matrix with the transpose flag set.

kortschak commented 9 years ago

Dgemm requires that C be allocated and different to A and B, and we then also need to allocate I. This doesn't help us.

jonlawlor commented 9 years ago

Yes, you're right. In the mean time, can we change the definition of Transposer from:

// A Transposer can create a transposed view matrix from the matrix represented by the receiver.
// Changes made to the returned Matrix may be reflected in the original.
type Transposer interface {
    T() Matrix
}

to

// A Transposer can transpose a matrix in place.
type Transposer interface {
    T()
}

Because I don't see how we can ever create a transpose view without some gnarly stuff behind the scenes.

kortschak commented 9 years ago

SGTM

btracey commented 9 years ago

SGTM

On Jan 1, 2015, at 7:40 PM, Dan Kortschak notifications@github.com wrote:

SGTM

— Reply to this email directly or view it on GitHub https://github.com/gonum/matrix/issues/37#issuecomment-68503746.

btracey commented 9 years ago

This was instead solved with the Transpose struct. Feel free to reopen if you do not believe so @jonlawlor

jonlawlor commented 9 years ago

Yes that definitely works.

This approach is interesting because it is a little matrix expression - it represents a transformation of a matrix as a type, instead of representing a literal matrix. By coincidence I've been working on a similar line of thought, inspired by some of the ideas in MadLINQ, and also my previous effort with representing relational algebra in go as a set of expressions. The code is still in a preliminary state, and I mostly intend it to be a proof of concept, but you can find it at jonlawlor/matrixexp.

The basic idea is to take the concept you're using in the Transpose struct and turn it up to 11. Instead of evaluating matrix operations, the library builds a graph of expressions, and I intend to write a set of expression compilers that can transform the matrix expressions in various ways, like the automatic use of blas.Transpose flags in DGEMM that you have here, but also more complicated operations like ordering a.Mul(b).Mul(c), and asynchronous computation, such as computing both sides of the addition in (a.Mul(b)).Add(c.Mul(d)) concurrently. The idea would be that the compilers would be run at init with a MustCompile method, so that optimizations could be amortized.

I think something a lot more ambitious lurks in there if this is carried even further, like they did in MadLINQ. I was going to mention it on gonum-dev when it reached a more mature state, but the bones of it are there, so feel free to comment & criticize.

kortschak commented 9 years ago

I'd love to have a look at that, but from what you describe it looks like you are heading in the direction that theano has gone, which is very cool. Note I saw a similar thing on proggit the other day.

btracey commented 9 years ago

Ordering more complicated expressions would be cool. The specific case of multiplying a bunch of matrices can be handled with a single method. (i.e. dense.MulGen(a Matrix ...)). It would be cool to see more complicated expressions like x^T Q^T A^-1 Q x be dealt with in a cool way. The inverse part is tough to fit into the normal paradigm.

Another cool use of composition is what they do in the cvx convex optimization library. CVX defines a set of convex kernels, and forces the user to compose these together to construct functions. This enforces that the user has defined a convex function where gradients and Hessians are known analytically.

jonlawlor commented 9 years ago

Yes this is going to be similar to theano, I think. I'll continue working on it and then write up an overview for gonum-dev.