Closed noahgolmant closed 4 years ago
(a) and (c) solved by #26
In my experience (b) depends heavily on the architecture, but overall I've been getting stable estimates on ResNet18 with as few as 10 steps, so I'm going to call it good for now.
(a) Test the Hessian computation on small networks where we can check the result against np.linalg.eig. A single hidden layer should be fine.
(b) Measure the convergence rate vs. batch size and number of power iteration steps. Inject noise to see how variance affects this. Vary the width of the hidden layer, too.
(c) Test averaging the eigenvalue estimate to see if that helps.
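A minimal sketch of what the check in (a) could look like, assuming a tiny tanh network with a squared-error loss and finite-difference Hessian-vector products. None of the names here (`loss_grad`, `hvp`, `power_iteration`, `dense_hessian`) come from this repo; they are hypothetical stand-ins for illustration. It uses `np.linalg.eigvalsh` rather than `np.linalg.eig` because the symmetrized Hessian is real symmetric.

```python
import numpy as np


def loss_grad(theta, X, y, d, h):
    """Gradient of 0.5 * mean((tanh(X @ W1) @ w2 - y)**2) w.r.t. the flat vector [W1, w2]."""
    W1 = theta[: d * h].reshape(d, h)
    w2 = theta[d * h:]
    a = np.tanh(X @ W1)          # hidden activations, shape (n, h)
    r = a @ w2 - y               # residuals, shape (n,)
    n = X.shape[0]
    g_w2 = a.T @ r / n
    g_W1 = X.T @ (r[:, None] * w2[None, :] * (1.0 - a ** 2)) / n
    return np.concatenate([g_W1.ravel(), g_w2])


def hvp(theta, v, grad_fn, eps=1e-5):
    """Hessian-vector product approximated by a central difference of the gradient."""
    return (grad_fn(theta + eps * v) - grad_fn(theta - eps * v)) / (2.0 * eps)


def power_iteration(theta, grad_fn, steps, rng):
    """Estimate the largest-magnitude Hessian eigenvalue via its Rayleigh quotient."""
    v = rng.standard_normal(theta.size)
    v /= np.linalg.norm(v)
    eig = 0.0
    for _ in range(steps):
        hv = hvp(theta, v, grad_fn)
        eig = v @ hv                      # Rayleigh quotient for the current unit vector
        v = hv / np.linalg.norm(hv)
    return eig


def dense_hessian(theta, grad_fn):
    """Build the full Hessian column by column; only feasible for tiny networks."""
    H = np.stack([hvp(theta, e, grad_fn) for e in np.eye(theta.size)], axis=1)
    return 0.5 * (H + H.T)                # symmetrize away finite-difference noise


rng = np.random.default_rng(0)
d, h, n = 3, 4, 32                        # input dim, hidden width, batch size (arbitrary)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
theta = rng.standard_normal(d * h + h)
grad_fn = lambda t: loss_grad(t, X, y, d, h)

estimate = power_iteration(theta, grad_fn, steps=1000, rng=rng)
eigvals = np.linalg.eigvalsh(dense_hessian(theta, grad_fn))
reference = eigvals[np.argmax(np.abs(eigvals))]   # power iteration targets the largest |eigenvalue|
print(f"power iteration: {estimate:.6f}, dense reference: {reference:.6f}")
```

Lowering `steps`, shrinking `n`, or changing `h` in this sketch gives a rough way to poke at (b) on a problem where the exact answer is available.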
Should address #17