tel / clatrix

A stupid name for a smart matrix library, because who doesn't love smart matrices?
MIT License

first, map, ... don't work correctly for 1 row matrices #36

Open alexott opened 11 years ago

alexott commented 11 years ago

If I have 2 matrices - a row matrix and a column matrix - then seq, first, map, and other functions behave the same for both of them, although this is incorrect (imho):

```clojure
(def m2 (matrix [[1 2 3]]))
m2
;; A 1x3 matrix
;; 1.00e+00  2.00e+00  3.00e+00

(def m3 (matrix [1 2 3]))
m3
;; A 3x1 matrix
;; 1.00e+00
;; 2.00e+00
;; 3.00e+00

(first m2)  ;; => 1.0
(first m3)  ;; => 1.0

(matrix (seq m2))
;; A 3x1 matrix
;; 1.00e+00
;; 2.00e+00
;; 3.00e+00

(matrix (seq m3))
;; A 3x1 matrix
;; 1.00e+00
;; 2.00e+00
;; 3.00e+00
```

This breaks some functions in Incanter that process matrices on a per-row basis. I think this relates to issue #30.

mikera commented 11 years ago

In core.matrix / generic array functionality the expected behaviour would be:

- first on a row matrix [[1 2 3]] returns a vector [1 2 3]
- first on a column matrix [[1] [2] [3]] returns a length one vector [1]

That is, first is always consistent with taking the first slice of the first (row) dimension.
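For concreteness, a minimal sketch of that behaviour using the generic core.matrix API on plain Clojure data (not Clatrix-specific, so the exact printed types may differ):

```clojure
(require '[clojure.core.matrix :as m])

;; the first slice along the row dimension:
(m/slice [[1 2 3]] 0)       ;; => [1 2 3]  (row matrix -> first row)
(m/slice [[1] [2] [3]] 0)   ;; => [1]      (column matrix -> first row)
```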

alexott commented 11 years ago

Yes - I was also thinking about this approach... I've tried to work around this in the to-list function, but that also requires changes on Incanter's side.

mikera commented 11 years ago

I'm mildly trying to discourage first / map / seq etc. as applied to matrices / arrays BTW: they are convenient sometimes but won't work if/when we introduce new matrix types from the Java world that don't exactly fit Clojure's conventions. In particular, if a class implements Iterable in some way then that's the behaviour you will get, like it or not.

Better, I think, to move to the protocol-backed equivalents in core.matrix so we can guarantee behaviour that is both consistent and extensible.
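As a rough illustration of what the protocol-backed equivalents look like (names from the clojure.core.matrix API; which ones Incanter should adopt is exactly the open question here):

```clojure
(require '[clojure.core.matrix :as m])

(def M (m/matrix [[1 2 3] [4 5 6]]))

(m/slices M)      ;; a seq of row slices, instead of (seq M)
(m/slice M 0)     ;; the first row slice, instead of (first M)
(m/emap inc M)    ;; element-wise map, instead of (map inc ...)
(m/eseq M)        ;; a flat seq of all elements, when that is what you want
```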

alexott commented 11 years ago

We need to discuss this problem more deeply, as it will heavily affect Incanter's functions.

mikera commented 11 years ago

I know - it's a tricky one! The good news is that it doesn't have to be a big bang switchover, we can support both in parallel I think. Anyway, better to continue discussion on the relevant mailing lists.

whilo commented 9 years ago

I also have a related problem:

```
boltzmann.jblas> (seq (matrix [[1 2] [3 4]]))
(A 1x2 matrix
 1.00e+00 2.00e+00
 A 1x2 matrix
 3.00e+00 4.00e+00
)
boltzmann.jblas> (seq (matrix [[1 2]]))
(1.0 2.0) ; seq of doubles
```

To make the single-row case compatible with the other core.matrix/jblas routines, I had to convert the rows back to the matrix type explicitly (because the ISliceWrapper of rows isn't countable and again isn't exchangeable with the matrix type), which is somewhat ugly and probably adds some overhead (since all data has to pass through this copying code for each training epoch):

```clojure
;; convert each row slice back into a proper matrix before further processing
(map (comp mat/matrix vector)
     (rows v-probs-batch))
```

(source) I can probably optimize that, but it also took some time to figure out, and I've had similar problems with the special core.matrix Vector type at times in the past, since most jblas routines expect matrices and the Vector type is not really compatible.

I understand the Iterable interface problem, which probably b0rks the jblas types, so maybe if the ISliceWrapper object behaved like a Matrix I would be fine. I am not sure whether core.matrix.Vector is a good idea with jblas.

mikera commented 9 years ago

@ghubber I assume you mean the clatrix.core.Vector type? core.matrix doesn't have a Vector type specifically.

The clatrix.core.Vector type actually uses a JBlas matrix under the hood, so it should work fine for all the JBlas interop. Having said that, I think a lot of this code hasn't quite received all the testing it needs for all the interoperability cases. If anyone fancies doing some test.check generative testing that would be cool :-)
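A rough sketch of what such a test.check property could look like, checking that the first slice of a Clatrix matrix matches the first row of the input data (the generators and the property itself are my own assumptions, not an existing test suite):

```clojure
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop]
         '[clojure.core.matrix :as m]
         '[clatrix.core :as c])

;; rectangular nested vectors of doubles, 1-5 rows x 1-5 columns
(def gen-matrix-data
  (gen/bind (gen/tuple (gen/choose 1 5) (gen/choose 1 5))
            (fn [[rows cols]]
              (gen/vector (gen/vector (gen/double* {:NaN? false :infinite? false}) cols)
                          rows))))

(def first-slice-consistent
  (prop/for-all [data gen-matrix-data]
    (m/equals (first (m/slices (c/matrix data)))
              (first data))))

(tc/quick-check 100 first-slice-consistent)
```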

whilo commented 9 years ago

Ok, stupid question: why do we need this Vector type then? The problem, if I recall correctly, was that operations like matrix multiplication return 1-column/row matrices, so when I loop the type changes.

mikera commented 9 years ago

Conceptually 1-D vectors are different from 2D matrices even if they have the same elements.

It's the same as the difference between [[1 2 3]] and [1 2 3].

Clatrix doesn't strictly need its own vector type. It could use Clojure vectors or something else in cases where it needs a 1D vector. The main requirement is that it is able to produce and consume 1D arrays where needed by the core.matrix API.
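To make the distinction concrete, using generic core.matrix calls on plain Clojure data (not Clatrix-specific; exact return types may vary slightly between implementations):

```clojure
(require '[clojure.core.matrix :as m])

(m/dimensionality [1 2 3])    ;; => 1, a 1D vector
(m/dimensionality [[1 2 3]])  ;; => 2, a 1x3 matrix
(m/shape [1 2 3])             ;; => [3]
(m/shape [[1 2 3]])           ;; => [1 3]
```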

whilo commented 9 years ago

You are right of course, I got confused by the introduction of a separate Vector type. I am just working on a core.matrix implementation for nd4j, because I want to have fast GPU-based deep learning in Clojure. I am not sure yet how consistent the deeplearning4j code is (I know @mikera had some discussions about ndarrays with them), but I don't see the Clojure community reinventing all the work there at the moment.

> In core.matrix / generic array functionality the expected behaviour would be:
>
> - first on a row matrix [[1 2 3]] returns a vector [1 2 3]
> - first on a column matrix [[1] [2] [3]] returns a length one vector [1]
>
> That is, first is always consistent with taking the first slice of the first (row) dimension.

Yes, that would be consistent iteration over ndarrays. A peculiar thing is to expect unboxed numbers when iterating over vectors, since this turns an Iterator<NDArrayThing> into Iterator<Object> in Java terms. But I really don't understand why the clatrix classes return a scalar on (first (clatrix/matrix [[1 2]])), since Clatrix wraps JBlas' DoubleMatrix.

Also, couldn't you get Iterable added upstream in JBlas to avoid wrapping? (For open-source libs this is at least possible.) This would make seamless interaction with APIs written for JBlas possible, and I am trying to get this for nd4j (maybe it is stupid?).

https://github.com/deeplearning4j/nd4j/pull/374

DoubleMatrix does not implement Iterable yet and can also store vectors. Why do you need a separate vector type? In general ISeq should be a protocol like in cljs; then we wouldn't have all that trouble, but this is a whole new story...

(I think an Iterator over elements for a tensor with dimensions is really weird and should be part of the API, not exposed by Iterable for any Java vector lib. Just my 2 pence.)

mikera commented 9 years ago

@whilo an nd4j core.matrix implementation would be great!

However, did you take a look at Vectorz / vectorz-clj? I think that implementing GPU support for Vectorz (via netlib-java) would actually be a pretty good way to get GPU matrix support in Clojure. And Vectorz has some advantages:

- 1D vector maths, which don't really benefit much from GPUs but are very central to deep learning techniques

I actually started an implementation here and it seems to work:

https://github.com/mikera/vectorz-native

Anyway I'd be interested to compare performance of the nd4j based implementation with vectorz-clj, something along the lines of http://core.matrix.bench-t2.large.s3-website-eu-west-1.amazonaws.com/554c2ae93f357522cca7e383e7ad90fef451c139.html
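Something like the following criterium sketch would be one way to run such a comparison (the :nd4j implementation keyword is a placeholder for whatever clj-nd4j ends up registering; :vectorz is the key used by vectorz-clj):

```clojure
(require '[clojure.core.matrix :as m]
         '[criterium.core :as crit])

(defn bench-mmul
  "Benchmark an n x n matrix multiply on the given core.matrix implementation."
  [impl n]
  (m/set-current-implementation impl)
  (let [a (m/matrix (repeatedly n #(repeatedly n rand)))
        b (m/matrix (repeatedly n #(repeatedly n rand)))]
    (crit/quick-bench (m/mmul a b))))

(bench-mmul :vectorz 512)
(bench-mmul :nd4j 512)    ;; placeholder key, see above
```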

mikera commented 9 years ago

@whilo have you got a public repo yet for your nd4j core.matrix implementation? I'm interested to take a look and try it out!

whilo commented 9 years ago

It was only a weekend hack, because I would like to have a Clojure implementation that is competitive with theano: https://github.com/whilo/clj-nd4j

I stopped working on it once I figured out I had to wrap the ndarray class. But since this is out of the way now, I will try to get the compliance tests passing.
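In case it helps, running the core.matrix compliance suite against an implementation generally looks something like the sketch below; `nd4j-array` is a hypothetical constructor standing in for whatever clj-nd4j exposes:

```clojure
(require '[clojure.test :refer [deftest]]
         '[clojure.core.matrix.compliance-tester :as ct])

(deftest nd4j-compliance
  ;; exercises the core.matrix protocol surface against a sample array
  (ct/compliance-test (nd4j-array [[1 2] [3 4]])))
```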

whilo commented 9 years ago

For deep learning, the absolute bottleneck in my experience is matrix multiplication between batches of training samples and weight matrices. What do you mean by

> 1D vector maths, which don't really benefit much from GPUs but are very central to deep learning techniques

?

mikera commented 9 years ago

I've done quite a bit of deep learning using only 1D vectors and sparse operations (which don't require full matrix x matrix multiplication). Agreed that matrix multiplication will be the bottleneck if you are using big dense matrices, but that isn't always required.
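For illustration, a toy version of that idea (my own sketch, not the actual code): each unit reads a fixed, small subset of the input vector, so a forward pass is a handful of 1D dot products rather than a dense matrix multiply.

```clojure
(require '[clojure.core.matrix :as m])

(defn sparse-activation
  "Activation of one unit whose weights cover only the input indices idxs."
  [input idxs weights]
  (m/dot (mapv #(m/mget input %) idxs) weights))

;; unit connected to inputs 1 and 4 only:
(sparse-activation [0.2 0.9 0.0 0.4 0.7] [1 4] [0.5 -0.3])
;; => 0.24 (approximately; 0.9*0.5 + 0.7*-0.3)
```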

mikera commented 9 years ago

@whilo thanks for sharing! I'll take a look and see if I can get the compliance tests passing. You shouldn't need to wrap the ND4J INDArray; having taken a quick look, I think it has all the functionality required to work as an N-dimensional array.

whilo commented 9 years ago

Good, thank you! I will have a look into the issue you pointed out in the pull request. Most models I have seen so far have dense matrices. Which models have you trained? Backprop then only needs to update non-zero weights, so matrices stay sparse, right?

mikera commented 9 years ago

Correct. It is a very efficient way to have a lot of feature detectors, but without exploding the number of weights.

whilo commented 9 years ago

Interesting. I have used dropout a bit, which is really nice regularization. Do you set the sparsity at the beginning of training, or is it adaptive, pruning small weights and speculatively "forming synapses"? (I work on biologically inspired neuron models for my master's and port deep learning techniques to them, Boltzmann machines so far.) In biology synapses form all the time.

mikera commented 9 years ago

I've played a bit with both... It seems that a fixed sparsity works fine since the nodes just specialise towards whatever they have available as input. The pruning / forming of new synapses also seems to work and performs better in some cases, but I'm not sure if it is always worth the extra complexity and overhead.

Of course, if you are doing stuff like convolutional networks it really helps to have a sensible hypothesis about how nodes should be connected in advance.

whilo commented 9 years ago

Back to the original issue, do you think this is fixable in Clatrix? We probably should not discuss this here :).