Closed tingwl0122 closed 1 month ago
Tested on influence_function_noisy_label task on mnist+lr
| Checked Data Sample | Found flipped Sample |
| ------------------- | -------------------- |
| 0 | 0 |
| 100 | 69 |
| 200 | 100 |
| 300 | 106 |
| 400 | 108 |
| 500 | 108 |
| 600 | 108 |
| 700 | 108 |
| 800 | 108 |
| 900 | 108 |
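For context, a detection curve like the one above can be produced by ranking training samples by an influence-based score and counting known flipped labels among the top-k checked samples. Here is a toy sketch of that idea (the score model and all numbers are made up for illustration; this is not dattri's benchmark code):

```python
import numpy as np

# Hypothetical sketch: rank training samples by an influence-based score
# and count how many known flipped samples appear among the top-k checked.
rng = np.random.default_rng(0)
n, n_flipped = 1000, 108
flipped = np.zeros(n, dtype=bool)
flipped[:n_flipped] = True

# Assume flipped samples tend to receive higher (self-)influence scores.
scores = rng.normal(0.0, 1.0, n) + flipped * 2.0

order = np.argsort(-scores)  # check highest-scored samples first
for k in range(0, n, 100):
    found = int(flipped[order[:k]].sum())
    print(k, found)
```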
We may remove `dattri.func.hessian.ihvp_arnoldi` and `dattri.func.hessian.ihvp_at_x_arnoldi` later? Currently only the old IFAttributor is using them, and we are defining arnoldi as a projector now. @jiaqima
Good point! I think we could remove them when we get rid of the old IFAttributor.
Another point is that we might have to handle negative eigenvalues if we want to do this sqrt thing. We can definitely cope with complex numbers, but it might be a bit messy.
How about we project to min(proj_dim, # positive eigvals)? If # positive eigvals < proj_dim, throw a warning about that. In this case, we are always projecting onto the eigenspace with positive eigenvalues. I believe in practice, # positive eigvals should usually be larger than proj_dim?
I am pretty unsure about this; I did use quite a large proj_dim (close to the param size, actually) during some experiments for the dattri rebuttal. The newest version can now handle complex numbers with minimal modification.
I'm afraid that this would mess up some subsequent computation, as everything is complex-typed after the projection?
We just need to convert it back to float, since there will be no imaginary part after the inner product. So as long as we are still under `BaseInnerProductAttributor`, we should be ok.
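The "no imaginary part after the inner product" claim can be checked with a toy sketch (the vectors and eigenvalues below are made up for illustration; this is not dattri's implementation): the imaginary factors from `1/sqrt` of a negative eigenvalue cancel in the product, since `(i*a)*(i*b) = -a*b` is real.

```python
import numpy as np

# Hypothetical sketch: even if sqrt of a negative eigenvalue introduces
# an imaginary component, the inner product of two projected representations
# has no imaginary part, so it can safely be cast back to float.
eigvals = np.array([2.0, -1.0])              # one negative eigenvalue
inv_sqrt = 1.0 / np.sqrt(eigvals.astype(complex))

x = np.array([1.0, 3.0])
y = np.array([2.0, 0.5])

px = inv_sqrt * x                            # complex-valued projected reps
py = inv_sqrt * y
ip = np.sum(px * py)                         # equals sum(x_i * y_i / eigval_i)
print(ip.imag)                               # 0.0
score = float(ip.real)                       # safe conversion back to float
print(score)                                 # -0.5
```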
Now we will throw a warning if `proj_dim` is larger than the number of positive eigvals and will automatically adjust `proj_dim` to be equal to this number if that happens. Test cases will be adjusted accordingly.
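The adjusted behavior could be sketched roughly as follows (the function name and toy matrix are hypothetical, for illustration only; the actual dattri code may differ):

```python
import warnings
import numpy as np

# Hypothetical sketch of the proposed behavior: keep at most proj_dim
# eigenpairs, restricted to those with positive eigenvalues, and warn
# when fewer positive eigenvalues than proj_dim are available.
def positive_eig_projection(eigvals, eigvecs, proj_dim):
    n_pos = int((eigvals > 0).sum())
    if n_pos < proj_dim:
        warnings.warn(
            f"Only {n_pos} positive eigenvalues available; "
            f"reducing proj_dim from {proj_dim} to {n_pos}."
        )
        proj_dim = n_pos
    # sort eigenvalues in descending order and keep the top proj_dim
    idx = np.argsort(-eigvals)[:proj_dim]
    return eigvals[idx], eigvecs[:, idx]

# toy symmetric matrix with one negative eigenvalue
H = np.diag([3.0, 1.0, -0.5])
vals, vecs = np.linalg.eigh(H)
v, V = positive_eig_projection(vals, vecs, proj_dim=3)  # warns, keeps 2
print(v)  # [3. 1.]
```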
Description
1. Motivation and Context
Re-define the arnoldi projector as `1.0 / sqrt(eigvals) * eigvecs.T @ features`.
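This redefinition means that inner products of projected features approximate the Hessian-inverse bilinear form. A minimal sketch, using a toy diagonal positive-definite "Hessian" (not dattri's actual `arnoldi_project` interface):

```python
import numpy as np

# Hypothetical sketch of the redefined projector:
#   projected = (1.0 / sqrt(eigvals))[:, None] * (eigvecs.T @ features)
# so that projected.T @ projected == features.T @ inv(H) @ features.
rng = np.random.default_rng(0)

H = np.diag([4.0, 1.0, 0.25])               # toy positive-definite "Hessian"
eigvals, eigvecs = np.linalg.eigh(H)

features = rng.normal(size=(3, 5))          # 5 feature vectors of dim 3
projected = (1.0 / np.sqrt(eigvals))[:, None] * (eigvecs.T @ features)

lhs = projected.T @ projected
rhs = features.T @ np.linalg.inv(H) @ features
print(np.allclose(lhs, rhs))                # True
```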
2. Summary of the change
- `arnoldi_project`
- `transform_train_rep` inside the arnoldi attributor
- `dattri/benchmark/datasets/mnist/mnist_mlp`
3. What tests have been added/updated for the change?