cmu-phil / tetrad

Repository for the Tetrad Project, www.phil.cmu.edu/tetrad.
GNU General Public License v2.0

Get rid of unnecessary matrix libraries. #53

Closed jdramsey closed 8 years ago

jdramsey commented 8 years ago

This is a wish-list item, maybe doable. Currently we are using the following matrix libraries:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-math3</artifactId>
        <version>3.5</version>
    </dependency>
    <dependency>
        <groupId>colt</groupId>
        <artifactId>colt</artifactId>
        <version>1.2.0</version>
    </dependency>
    <dependency>
        <groupId>gov.nist.math</groupId>
        <artifactId>jama</artifactId>
        <version>1.0.2</version>
    </dependency>
    <dependency>
        <groupId>com.googlecode.matrix-toolkits-java</groupId>
        <artifactId>mtj</artifactId>
        <version>1.0.1</version>
    </dependency>

Much of this is overlapping functionality. We have a class, TetradMatrix, that wraps the Apache matrix library. Can we remove some of the other matrix libraries and use TetradMatrix instead?

jdramsey commented 8 years ago

Well, for this one:

    <dependency>
        <groupId>com.googlecode.matrix-toolkits-java</groupId>
        <artifactId>mtj</artifactId>
        <version>1.0.1</version>
    </dependency>

it's in Ling, Lingam, KernelUtils, and IndTestHsic. The only missing functionality in TetradMatrix is matrix square root, which can be added.

Matrix square root can be calculated using SVD, which Apache has. So we can get rid of mtj.
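For reference, a sketch of why the SVD route works. It assumes the matrix $A$ is symmetric positive semidefinite (as a covariance or kernel matrix would be); the thread does not say which matrices are involved, so treat that as an assumption. In that case the SVD can be written with the same orthogonal factor on both sides:

```latex
\begin{aligned}
A &= V \Sigma V^{\top}, \qquad \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_n), \quad \sigma_i \ge 0 \\
A^{1/2} &= V \Sigma^{1/2} V^{\top}, \qquad \Sigma^{1/2} = \operatorname{diag}(\sqrt{\sigma_1}, \dots, \sqrt{\sigma_n}) \\
\left(A^{1/2}\right)^{2} &= V \Sigma^{1/2} \left(V^{\top} V\right) \Sigma^{1/2} V^{\top} = V \Sigma V^{\top} = A
\end{aligned}
```

In commons-math3 the factors would presumably come from `SingularValueDecomposition` (`getV()` and `getSingularValues()`); how that would be wired into TetradMatrix is not shown here.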

jdramsey commented 8 years ago

Jama, this one:

    <dependency>
        <groupId>gov.nist.math</groupId>
        <artifactId>jama</artifactId>
        <version>1.0.2</version>
    </dependency>

is used mainly for its SVD and EVD decompositions. It's also used in a couple of other places that can easily be translated into TetradMatrix.

Apache has these decompositions; they can be substituted directly. So we can get rid of Jama.

jdramsey commented 8 years ago

The main distinguishing feature for colt, this one:

    <dependency>
        <groupId>colt</groupId>
        <artifactId>colt</artifactId>
        <version>1.2.0</version>
    </dependency>

is that it can create abstract views of matrices and operate on them. There are some places in the code where this is heavily used. Not sure if we want to implement that in TetradMatrix with another type of matrix as backing, but we could. Need a high pain threshold though.

jdramsey commented 8 years ago

This one is now gone:

    <dependency>
        <groupId>com.googlecode.matrix-toolkits-java</groupId>
        <artifactId>mtj</artifactId>
        <version>1.0.1</version>
    </dependency>

See Issue #59.

jdramsey commented 8 years ago

This one I apparently got rid of too:

    <dependency>
        <groupId>gov.nist.math</groupId>
        <artifactId>jama</artifactId>
        <version>1.0.2</version>
    </dependency>

In any case, there is no trace of it anymore. It was only being used for matrix decompositions, all of which existed in the Apache library.

jdramsey commented 8 years ago

So now we're down to:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-math3</artifactId>
        <version>3.5</version>
    </dependency>
    <dependency>
        <groupId>colt</groupId>
        <artifactId>colt</artifactId>
        <version>1.2.0</version>
    </dependency>

Ideally we'd get rid of Colt too and just use Apache. That may take some doing.

jdramsey commented 8 years ago

The problem is that the Colt library is much easier to use for translating Matlab code, cumbersome though it is, since it allows views of matrix rows, columns, and subsections. I suppose we could easily build these for TetradMatrix. I think that would be a prerequisite. We can also delete classes or methods that have no references in the code, which will help some.
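To make the idea of views concrete, here is a minimal sketch of a row/column/subsection view over a shared backing array, loosely in the spirit of Colt's `viewRow`, `viewColumn`, and `viewPart`. The class `MatrixView` and all of its method names are hypothetical; nothing here is TetradMatrix's or Colt's actual code, and a plain `double[][]` stands in for whatever backing TetradMatrix would really use. Row and column views are returned as 1×n and n×1 matrices to keep the sketch short.

```java
/** Hypothetical sketch: a rectangular window onto a shared backing array. */
final class MatrixView {
    private final double[][] data; // shared with the caller, never copied
    private final int rowOffset;
    private final int colOffset;
    private final int rows;
    private final int cols;

    private MatrixView(double[][] data, int rowOffset, int colOffset, int rows, int cols) {
        this.data = data;
        this.rowOffset = rowOffset;
        this.colOffset = colOffset;
        this.rows = rows;
        this.cols = cols;
    }

    /** Wraps a rectangular array as a view of the whole matrix. */
    static MatrixView of(double[][] data) {
        int cols = data.length == 0 ? 0 : data[0].length;
        return new MatrixView(data, 0, 0, data.length, cols);
    }

    int rows() { return rows; }

    int columns() { return cols; }

    /** Reads element (i, j) of this view from the backing array. */
    double get(int i, int j) {
        check(i, j);
        return data[rowOffset + i][colOffset + j];
    }

    /** Writes through to the backing array, so every overlapping view sees it. */
    void set(int i, int j, double value) {
        check(i, j);
        data[rowOffset + i][colOffset + j] = value;
    }

    /** A height x width sub-block whose top-left corner is (row, col) of this view. */
    MatrixView part(int row, int col, int height, int width) {
        if (row < 0 || col < 0 || height < 0 || width < 0
                || row + height > rows || col + width > cols) {
            throw new IndexOutOfBoundsException("sub-block outside the view");
        }
        return new MatrixView(data, rowOffset + row, colOffset + col, height, width);
    }

    /** Row i as a 1 x columns() view. */
    MatrixView row(int i) { return part(i, 0, 1, cols); }

    /** Column j as a rows() x 1 view. */
    MatrixView column(int j) { return part(0, j, rows, 1); }

    private void check(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ") outside the view");
        }
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        MatrixView m = MatrixView.of(a);
        MatrixView block = m.part(1, 1, 2, 2); // covers {{5, 6}, {8, 9}}
        block.set(0, 0, 50.0);                 // writes to a[1][1]
        System.out.println(a[1][1]);                // prints 50.0
        System.out.println(m.column(2).get(1, 0));  // prints 6.0
    }
}
```

Because views only store offsets and sizes, creating one is cheap and no data is copied; the trade-off is that every element access pays for the offset arithmetic and bounds check.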

jdramsey commented 8 years ago

It seems we should leave Colt in the project. It is faster than Apache and makes it easier to translate Matlab code. OK, two matrix libraries. Closing.