Open danielkorzekwa opened 8 years ago
This function implements the functionality described above, but is it consistent with Breeze idioms and syntax patterns?
I think "filter matrix by row" refers to filtering rows from a matrix based on the value(s) in a given column, so the function below takes three arguments (the DenseMatrix, a column ID, and a filter function) and returns a DenseMatrix whose rows are a subset of the one passed in. I therefore named this function filterRows rather than filterByRow.
What's more, while Daniel describes using a bit vector to extract rows from a DenseMatrix, the vector returned by calling findAll is not a bit vector but an integer vector of row numbers. (In other array libraries I have used, e.g. NumPy, Julia, R, the vector returned would indeed be a bit vector.) I assume this doesn't matter, because the primary interest seems to be getting a subset of rows by specifying a column in the original matrix and a filter function to apply to it.
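To make the distinction concrete, here is a small sketch (REPL or script style) contrasting the index vector that findAll returns with a boolean mask over the same column. The mask is built with plain Scala collections rather than any Breeze mask type, and the names `col`, `rowIds`, and `mask` are only illustrative.

```scala
import breeze.linalg._

// Column 1 of the example matrix used later in this comment.
val col = DenseVector(7, 1, 7)

// findAll returns the positions where the predicate holds: here rows 0 and 2.
val rowIds = col.findAll(_ > 5)

// A boolean mask over the same column, built with plain Scala (an assumption
// for illustration, not a Breeze BitVector): true, false, true.
val mask = col.toArray.map(_ > 5)
```

Both carry the same information; the index form is simply what row slicing with a sequence of integers consumes directly.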
Lastly, how should it be made generic? Clearly Breeze uses spire, but not extensively (usually just cfor); other DenseMatrix and DenseVector methods are @specialized, but what is the current Breeze standard for parameterized types?
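// spire to make the function generic:
import breeze.linalg._
import spire.implicits._
import spire.math._
import spire.algebra._
// Keep the rows of M whose entry in column colId satisfies the predicate f.
def filterRows[A: Numeric](M: breeze.linalg.DenseMatrix[A], colId: Int, f: A => Boolean) = {
  val col = M(::, colId)   // the column to test
  val idx = col.findAll(f) // row indices where f holds
  M(idx, ::).toDenseMatrix // copy the selected rows into a new DenseMatrix
}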
Call it like so:
val m = DenseMatrix((5, 7, 8), (7, 1, 3), (6, 7, 4))
def f(x: Int): Boolean = x < 6
filterRows(m, 1, f)
Happy to help any way I can.
Hi! I actually just issued a pull request which addresses this functionality (among other things). It allows for consistent slicing operations across Vector, DenseMatrix, and SliceMatrix. If accepted, you will be able to use a BitVector as the rowSlice, the colSlice, or both.
The use case here is to filter a matrix by some column. Currently, I first use findAll on a particular column and then filter the matrix like this (see below), but the above would be more compact.
Ideally I would like to filter the matrix by row like this (or something like that):
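For readers following along, the two-step approach described above might look roughly like the sketch below (REPL or script style). This is not the original snippet from the issue; the matrix values, the predicate, and the names `m`, `idx`, and `filtered` are assumptions chosen for illustration.

```scala
import breeze.linalg._

val m = DenseMatrix((5, 7, 8), (7, 1, 3), (6, 7, 4))

// Step 1: find the row indices where column 1 satisfies the predicate.
val idx = m(::, 1).findAll(_ > 5)

// Step 2: slice those rows out and copy them into a new DenseMatrix.
val filtered = m(idx, ::).toDenseMatrix
```

With these values, column 1 is (7, 1, 7), so rows 0 and 2 are kept.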