jblas-project / jblas

Linear Algebra for Java
http://jblas.org
BSD 3-Clause "New" or "Revised" License
590 stars 149 forks source link

broken equals / hashCode contract #43

Closed m0nzderr closed 9 years ago

m0nzderr commented 10 years ago

Hi, this is realy great lib. I like it so much, especially for almost-"matlab" comfort. But there is one annoying thing: bad implementation of .equals(). Thing is, that current equals() has several potential problems: a) a comparison with a hard-coded tolerance (hey, what if 1e-6 is the NOT a precision I seek?) b) NaNs are just ignored ( [1 2 3] equals [1 NaN 3]) c) equal matrices according to such implementation, will have different hashCodes - and this is a realy bad news for those who is going to use jblas objects in complex systems with caching, hashmaps. Things may go really bad for them.

AFIK, from A.equals(B) in Java we expect A.hashCode()==B.hashCode(), while opposite is not necessarily true.

Since jblas does not provide interfaces or abstract classes (for performance reasons, I guess) there is no elegant way to fix .equals() by simple overriding. So, my proposal is the following:

1) refactor current implementation of equals(Object o) into e.g. compare(SomeMatrix other, double tol)

2) implement equals(Object o) based on exact value (=byte level) comparison (e.g, using Arrays.equals(this.data,other.data).

Thanks!

srowen commented 10 years ago

I think this may have been addressed in https://github.com/mikiobraun/jblas/commit/2bfb810d67a7b08c69476a90cc7889e9fe46b27d ? Although the NaN issue is possibly still oustanding -- it's actually the opposite, that NaN is not equal to itself.

m0nzderr commented 10 years ago

It looks like it was not addressed yet, since current master has this code (DoubleMatrix.java):

   /**
     * Compare two matrices. Returns true if and only if other is also a
     * DoubleMatrix which has the same size and the maximal absolute
     * difference in matrix elements is smaller thatn 1e-6.
     */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DoubleMatrix)) {
            return false;
        }

        DoubleMatrix other = (DoubleMatrix) o;

        if (!sameSize(other)) {
            return false;
        }

        DoubleMatrix diff = MatrixFunctions.absi(sub(other));

        return diff.max() / (rows * columns) < 1e-6;
    }

Note, that NaN!==NaN is absolutely right, but only from the algebraic standpoint, so Matrix.compare(Matrix) should rely on this fact as well. But within context of data representation, Double.NaN is a certain sequence of bytes, so Matrix.equals(Matrix) must to be able to distinguish between NaNs and any other values.

mikiobraun commented 9 years ago

Ok, I fixed this as suggested. There's a new function compare now (with tolerance), and equals uses Arrays.equals on the byte arrays.