mikera / core.matrix

core.matrix : Multi-dimensional array programming API for Clojure
Other
702 stars 113 forks source link

Comparison operations #82

Open mikera opened 10 years ago

mikera commented 10 years ago

We need a set of functions for elementwise array comparisons.

Thoughts:

Potential functions:

anthgur commented 10 years ago

I am somewhat new to Clojure, but I would like to work on this enhancement. I plan to print out the Clojure CA at work tomorrow and send it in when I am able to make it to a post office. I saw a suggestion to look here as a place to contribute via the mailing list.

Can you please expand on the requirements for compare and clamp a bit? I think I generally understand the rest of them. I'm not sure whether I would be better off tackling this or the sorting enhancement. It seems like this would be an easier one to start with.

mikera commented 10 years ago

Great stuff - glad to have your contributions! This is probably easier than sorting, at least the API is simpler / more clearly defined.

compare is sort of like the array equivalent of java.lang.Comparator. In fact, I think the folowing should be equivalent:

i.e. the compare result is just the sign of subtracting the second argument from the first

clamp is supposed to restrict each element in an array to lie between an mininum and maximum value. In fact the following array operations should be equivalent

Here V is the value to be clamped, A is the minimum bound and B is the maximum bound.

The default implementation for both compare and clamp can probably just use the definitions above, i.e. use the underlying protocols for min, max etc.

anthgur commented 10 years ago

That makes more sense now, thanks! Is signum already written?

If not I would have to research more into how to write a procedure for it. After a quick search I found this snippet from a book online. Section 5.3 defines Newton iteration which is something that I think I have seen before. Is that approach to the sign function adequate? It looks like it is not a catch-all.

For sub, would I use the sub function in core.matrix or directly use the underlying protocols (same for min and max)?

mikera commented 10 years ago

Yes, signum is already there.

You'll probably need to use the underlying protocols for all of these, since Clojure doesn't allow circular dependencies and the core.matrix API is defined after the default implementations at present.

mikera commented 10 years ago

Hmm thinking about it it might be worth implementing a "element-wise if" function eif alongside these. This would be a conditional element-wise check that would interpret (x > 0) as true and (x <= 0) as false, so that you could do stuff like:

(eif (le A B) C D)

i.e. test if the element in A is less than or equal to the element in B, and if true use the element from C, else use the element from D.

eif is probably the most fundamental operation: I think all of the others can be implemented in terms of it.

mars0i commented 10 years ago

Fwiw, I like the eif, min/max, and clamp ideas. I already do this sort of thing using emap, but it seems reasonable to have dedicated functons.

anthgur commented 10 years ago

If I understand correctly, eif will take two matrices [A B] as parameters and return a boolean matrix?

For example: (Aij > Bij) true (Aij <= Bij) false For every element in A/B.

So A and B have to have the same shape? It's been a while since I've worked with matrices. Are all of the boolean comparison functions the same general idea?

anthgur commented 10 years ago

I read through your comment again @mikera and I think I misunderstood.

If I understand correctly now, eif should act more like this:

A = [[1 2] [1 2]]
B = [[1 2] [0 0]]
C = [[1 2] [3 4]]

(eif (eq A B) B C) = [[1 2] [3 4]]

Is this the behavior you are looking for with compare?

(defn compare 
  [a b]
  (mp/element-map (mp/matrix-sub a b) #(int (mops/signum %))))

(compare [[10 10] [20 20]] [[20 20] [10 20]])
[[-1 -1] [1 0]]

I've learned a lot about Java interop just by looking through this repository, it's great stuff. Is there more I should do in these functions in terms of input validation?

mikera commented 10 years ago

Perfect - that's the exact behaviour I would expect from both functions.

anthgur commented 10 years ago

A couple of things: ae8a1d3112300c644205d880648bbb6fd8a2e924

I managed to find the right files to put my new code, but I'm not sure where it should go in those files. Am I supposed to attempt to overwrite the compare function provided in clojure.core?

Edit: Here's a better error message https://travis-ci.org/anthonyurena/core.matrix/jobs/16327779#L823

What should I do with the rest of the functions? Should I place them all in a single protocol or split them between multiple?

I was thinking about having a PComparison protocol to house these.

mikera commented 10 years ago

PComparison protocol sounds good for the comparison functions. The min max, clamp functions probably should go in another protoocl - PMinMax perhaps?

Hmmm... I'd forgotten that compare is in clojure.core. I've been trying to avoid core name clashes as it makes useing the core.matrix namespace a pain. The options are:

FWIW, le, 'gtetc. could have<=,>synonymns inclojure.core.matrix.operators`

anthgur commented 10 years ago

I renamed compare to cmp to get the build to pass. I can change it again if that is too ambiguous. What should be the return value of the boolean operators? A boolean array or something different? Example: (le [1 2] [2 1]) => [true false] or [1 0]

max and min are also in the clojure.core namespace, I think those should probably the operators namespace. What do you think? It looks like maximum and minimum will work too.

mikera commented 10 years ago

cmp sounds fine.

boolean return values are a slightly tricky one.

I'm tempted to say that the return values should be 0 / 1 - that way they are usable as masks with multiplication etc. It would also be consistent with an eif that interprets positive values as true. And last but not least, these are allowable values for numerical matrices whereas true / false are not.

However, booleans could also have advantages (integration with regular Clojure code etc.).

I'll fire out a question on the Numerical Clojure list to see if anyone has good arguments either way

keithschulze commented 9 years ago

I think this would be a very useful enhancement, is there still interest in having it implemented? I’m relatively new to Clojure and core.matrix, but I’d be keen to work on this. Following the suggestions above, I have done initial implementations for cmp, clamp, eif, lt, le, gt, ge, ne & eq. I will submit a pull req shortly. The implementations are fairly naive, so I would appreciate some feedback. In particular, I didn't follow when you mentioned that eif could be used to implement all the others.

mikera commented 9 years ago

Thanks for the PR! I've merged it because it looks mostly good though I think still needs some work (on docstrings, edge cases etc.)

keithschulze commented 9 years ago

Great thanks! I agree that docstrings and edge cases still need some work.