fbreuer / analytic-feature-selection

Paper: Analytic Feature Selection for Support Vector Machines

check whether zero vector is in affine hull of rows #2

Open fbreuer opened 11 years ago

fbreuer commented 11 years ago

To calculate the dimension of the affine hull, we need to check whether the origin is in the affine hull of the rows. This is currently implemented in zero_vector_present in

https://github.com/stambizzle/thesis/blob/master/thesis_code/general_feature_selection.py

However, zero_vector_present only checks whether one of the rows is itself the zero vector, which is not the same thing. I will fix this.

fbreuer commented 11 years ago

Checking whether the zero vector is in the affine hull of the rows involves at least one more rank computation. However, it is not necessary to compute two ranks to get the dimension of the affine hull. Instead, everything can be done with a single rank computation as follows.

Let v_1, ..., v_d denote the rows of the matrix. The dimension of the affine hull of the v_i can be computed via

dim(aff(v_1, ..., v_d)) = rank(v_1 - v_d, ..., v_{d-1} - v_d),

that is, we just need to compute the rank of the matrix with rows v_i - v_d for i=1,...,d-1.
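In numpy this is a one-liner thanks to broadcasting. A minimal sketch of a rewritten get_poly_dim along these lines (assuming the input is a 2-D numpy array whose rows are the samples):

```python
import numpy as np

def get_poly_dim(mat):
    """Dimension of the affine hull of the rows of mat.

    Uses the single-rank formula
    dim(aff(v_1, ..., v_d)) = rank(v_1 - v_d, ..., v_{d-1} - v_d).
    """
    mat = np.asarray(mat)
    # Subtract the last row from every other row (broadcasting),
    # then take the rank of the resulting (d-1) x n matrix.
    diffs = mat[:-1] - mat[-1]
    return np.linalg.matrix_rank(diffs)
```

For example, three collinear points such as (0,0), (1,1), (2,2) give dimension 1, while three points in general position give dimension 2.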

Of course I don't know how cumbersome it is to create this matrix of differences in python. But if we were to implement the zero_vector_present function properly, we would need to create such a matrix anyway.

stambizzle commented 11 years ago

So, just to be sure, you are suggesting that we get rid of the zero_vector_present function altogether and rewrite the get_poly_dim function using the above logic instead?

That should not be too difficult to implement. I will mess around with it a bit later this afternoon.

fbreuer commented 11 years ago

Yes, exactly. Rewrite get_poly_dim to use just one formula instead of two cases and get rid of zero_vector_present altogether.

stambizzle commented 11 years ago

here:

https://gist.github.com/759ae510185602e8d842

fbreuer commented 11 years ago

Looks good! Modulo the fact that I don't know some numpy conventions:

1) Why do we need to transpose?

The rows of mat correspond to the samples, right? And numpy indexes by rows, not columns, right? Then mat[-1] would mean the last row and mat[:-1] would mean everything except the last row. If this is correct, we do not need to transpose.
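A quick sanity check of those indexing assumptions, on a small hypothetical array:

```python
import numpy as np

mat = np.array([[1, 2],
                [3, 4],
                [5, 6]])

last_row = mat[-1]       # indexing is by rows: this is [5, 6]
all_but_last = mat[:-1]  # every row except the last
```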

2) Vector matrix subtraction

If k is a matrix and d a vector, is k-d the matrix where d is subtracted from every row? If yes, the code is correct.

stambizzle commented 11 years ago

I misread your explanation. I am so used to the v_i's being columns that I just read it that way, even though it clearly says rows. :)

In that case, we don't need to transpose.

Yes, k - d will subtract the vector d from every row in the matrix k, as seen here: https://gist.github.com/e174992b1a18681c3343
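A minimal check of that broadcasting behavior, with small made-up values:

```python
import numpy as np

k = np.array([[1, 2, 3],
              [4, 5, 6]])
d = np.array([1, 1, 1])

# Broadcasting: d is subtracted from every row of k.
result = k - d
```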