vkostyukov / la4j

Linear Algebra for Java
http://la4j.org
Apache License 2.0
373 stars 108 forks source link

An internal GC for sparse entries #87

Open vkostyukov opened 11 years ago

vkostyukov commented 11 years ago

A colleague told me one interesting thing that I can use in la4j's sparse enties (matrices and vectors). To support the idea of lazy clearing zero cells by implementing an internal GC for sparse entries. By other words, it doesn't sound efficient to clear a cell each time the the user puts a 0.0 there (how it realy works now). It might be a good idea just put a real zero in in the cell, without any shifts/removing and contunue working. When the cardinality reaches some threshold value we can start reclaiming process and clear all zeros. This will give us a great amortized performance.

SamoylovMD commented 11 years ago

The current implementations of some methods, i.e. eachNonZero, need all non-zero values in sparse entry. Are you sure it is worth while?

vkostyukov commented 10 years ago

Right. With such approach we should be able to handle these intrussive zeros in iterators and nonZero functions. I'm marking this as a question and moving to the next milestone.

sylvia43 commented 9 years ago

How about a gc() method in SparseVector and SparseMatrix? That way the user (or the library) can manually clean it up.

SamoylovMD commented 9 years ago

I have some idea about GC: instead of gc() in SparseVector/SparseMatrix we could implement common gc() in LinearAlgebra. So, we have to keep links to any SparseVector and SparseMatrix in LinearAlgebra (i.e. in List) and iterate on these links when we want to clean up our sparse objects. Possible implementation:

//in LinearAlgebra:
public static List<SparseVector> sparseVectorCache;
public static List<SparseMatrix> sparseMatrixCache;
...
public static void gc() {
    for (SparseVector v : sparseVectorCache) {
        v.clean();
    }
    for (SparseMatrix m : sparseMatrixCache) {
        m.clean();
    }
}

when clean() is a method to clean up sparse entry from zeros.

sylvia43 commented 9 years ago

Sounds good. We'd add them to the cache in the constructor? Or is that already being done?

vkostyukov commented 9 years ago

The main issue here is that we might end up rewriting all non-zero routines like nonZeroInterator, eachNonZero and foldNonZero. In fact, we should be able to skip zero elements in those methods, which sounds like a runtime overhead by an additional if check on every iteration.