Closed Tron-x closed 5 years ago
This uses stochastic power iteration with acceleration. I reference the paper this is based on in the Acknowledgements
section of the README. Stochastic power iteration uses a mini-batch Hessian estimate for each iteration.
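To make this concrete, here is a minimal sketch of stochastic power iteration (without the acceleration/momentum term) under the assumption that the repo is PyTorch-based. The names `hessian_vec_prod` and `stochastic_power_iteration` are illustrative only, not the repo's actual API; the point is that each power-iteration step forms the Hessian-vector product on a fresh mini-batch, so the operator being applied is a noisy per-batch estimate of the full-dataset Hessian rather than one fixed matrix.

```python
import torch

def hessian_vec_prod(loss, params, vec):
    """Hessian-vector product via double backprop on a single mini-batch loss."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    hvp = torch.autograd.grad(flat_grad @ vec, params)
    return torch.cat([h.reshape(-1) for h in hvp])

def stochastic_power_iteration(model, criterion, loader, num_steps=20):
    """Estimate the dominant Hessian eigenvalue/eigenvector of the loss."""
    params = [p for p in model.parameters() if p.requires_grad]
    n = sum(p.numel() for p in params)
    v = torch.randn(n)
    v /= v.norm()
    eigenvalue = 0.0
    data_iter = iter(loader)
    for _ in range(num_steps):
        try:
            inputs, targets = next(data_iter)      # fresh mini-batch each step
        except StopIteration:
            data_iter = iter(loader)
            inputs, targets = next(data_iter)
        loss = criterion(model(inputs), targets)   # mini-batch loss
        hv = hessian_vec_prod(loss, params, v)     # stochastic estimate of H @ v
        eigenvalue = torch.dot(v, hv).item()       # Rayleigh quotient estimate
        v = hv / hv.norm()                         # normalize for the next step
    return eigenvalue, v
```

The per-step Hessian-vector products are unbiased estimates of the full-data product, so across many steps the iterate still converges (in expectation) toward the dominant eigenvector of the full Hessian, which is what distinguishes this from classical power iteration on a fixed matrix.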
Your code computes a dominant eigenvalue/eigenvector.
In every step, why do you need to call .apply() on a new mini-batch of inputs (where apply uses prepare_grad() to feed a new batch) to update hessian_vec_prod? Does this mean the Hessian matrix changes on each iteration of the loop? As far as I know, when we use power iteration on a matrix, the matrix is fixed, but in your code it looks like a new batch of inputs changes the Hessian matrix at every iteration. I can't figure out your method; maybe it is a different method. Looking forward to your reply~