amirgholami / PyHessian

PyHessian is a PyTorch library for second-order based analysis and training of Neural Networks
MIT License

computational (time) cost #6

Closed: xinyueshen closed this issue 4 years ago

xinyueshen commented 4 years ago

Thanks for sharing this very interesting package! I'm trying to use it on some very simple objective functions such as $\frac{1}{N}\sum_{i=1}^N\log(x_i^T \theta + \epsilon)$, but the time cost seems to be high. The dimension of the variable $\theta$ is about 100, and the number of samples $N$ is about 1e6. To get the top 50 eigenvalues, it took about 45 seconds on a GPU. Could you please give some comments on whether such timing is as expected? Thanks!

yaozhewei commented 4 years ago

Hi Xinyue,

Yes, this is expected. Computing the top 50 eigenvalues with hessian.eigenvalues() is very costly. Since your problem dimension is only about 100, I would suggest using hessian.density() with niter set equal to the problem dimension. That should finish very quickly and give you the distribution of the full set of eigenvalues.
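To illustrate why setting the iteration count equal to the problem dimension recovers the full spectrum, here is a minimal NumPy sketch of a Lanczos iteration with full reorthogonalization. It is not PyHessian's code: the function name `lanczos_ritz_values`, the random test matrix, and the dimension of 20 are all assumptions chosen for illustration. In exact arithmetic, running as many Lanczos steps as the dimension on a symmetric matrix with distinct eigenvalues yields a tridiagonal matrix with the same eigenvalues.

```python
import numpy as np


def lanczos_ritz_values(matvec, dim, num_iter, seed=0):
    """Run up to num_iter Lanczos steps and return the eigenvalues of the
    resulting tridiagonal matrix (the Ritz values).

    matvec: function computing H @ v for a symmetric matrix H (hypothetical
    stand-in for a Hessian-vector product).
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    basis = [v]
    alphas, betas = [], []
    for k in range(num_iter):
        w = matvec(basis[-1])
        alphas.append(float(basis[-1] @ w))
        # Full reorthogonalization against every previous Lanczos vector,
        # done twice to limit loss of orthogonality in floating point.
        for _ in range(2):
            for u in basis:
                w = w - (u @ w) * u
        beta = float(np.linalg.norm(w))
        # Stop on the last step or if the Krylov space is exhausted.
        if k == num_iter - 1 or beta < 1e-10:
            break
        betas.append(beta)
        basis.append(w / beta)
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return np.linalg.eigvalsh(T)


# Small symmetric test matrix standing in for a Hessian (assumption).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 20))
H = (A + A.T) / 2

ritz = lanczos_ritz_values(lambda x: H @ x, dim=20, num_iter=20)
exact = np.linalg.eigvalsh(H)
max_err = float(np.max(np.abs(ritz - exact)))
print("largest deviation from exact eigenvalues:", max_err)
```

Each step costs one matrix-vector product, so for a dimension of about 100 the full spectrum needs roughly 100 such products, which is why this route can be much cheaper than extracting 50 eigenpairs one at a time.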

xinyueshen commented 4 years ago

Thanks for the response! But does hessian.density() compute and return the eigenvectors at the same time?

xinyueshen commented 4 years ago

I see that hessian.density() computes the eigenvalues, not the eigenvectors, but it is much faster. Thanks!
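If the eigenvectors are also needed, one option for a problem this small is to bypass PyHessian and form the Hessian explicitly. For the objective in the question, the Hessian with respect to theta is -(1/N) * sum_i x_i x_i^T / (x_i^T theta + epsilon)^2, a matrix of only about 100 x 100. The NumPy sketch below is an illustration under assumptions, not something proposed in the thread: the helper name `log_objective_hessian`, the synthetic data, and the reduced sample count are all hypothetical.

```python
import numpy as np


def log_objective_hessian(X, theta, eps):
    """Hessian of (1/N) * sum_i log(x_i^T theta + eps) with respect to theta."""
    z = X @ theta + eps              # shape (N,), assumed strictly positive
    weights = 1.0 / z**2             # per-sample curvature weights
    return -(X * weights[:, None]).T @ X / X.shape[0]


# Synthetic data (assumption): positive entries keep the log argument positive.
rng = np.random.default_rng(0)
N, d = 10_000, 100                   # the thread reports N of about 1e6
X = rng.uniform(0.0, 1.0, size=(N, d))
theta = rng.uniform(0.5, 1.0, size=d)
eps = 1e-3

H = log_objective_hessian(X, theta, eps)

# eigh returns eigenvalues in ascending order plus orthonormal eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(H)

# Pick the 50 eigenpairs with the largest magnitude.
order = np.argsort(-np.abs(eigenvalues))[:50]
top_vals = eigenvalues[order]
top_vecs = eigenvectors[:, order]
print(top_vals[:5])
```

This only makes sense while the parameter dimension is small enough to store a dense Hessian. For large networks, matrix-free methods such as those in PyHessian remain the practical choice.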