kohpangwei / influence-release

MIT License

Why is inverse_hvp divided by scale after the iteration in get_inverse_hvp_lissa? #20

Closed: cyyever closed this issue 4 years ago

cyyever commented 4 years ago

From #4, I understand why scale is used in the HVP, since the real loss function is scaled. But I don't understand why the result is divided by scale again at the end, on lines 507-510.

kohpangwei commented 4 years ago

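In the loop, we scale the Hessian down by scale, so the loop actually estimates the inverse of the scaled Hessian applied to the vector. Since (H / scale)^-1 v = scale * H^-1 v, the estimate of the inverse Hessian-vector product comes out scaled up by scale. The last division corrects for this scaling.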
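
To make the scaling argument concrete, here is a minimal NumPy sketch. It is not the repository's code: the function name lissa_inverse_hvp, the small explicit matrix H, and the iteration count are all assumptions chosen for illustration, and it leaves out any damping or minibatch sampling the real function may use. It only shows that a recursion using H / scale converges to scale * H^-1 v, so one final division by scale is needed.

```python
import numpy as np


def lissa_inverse_hvp(H, v, scale, num_iters=200):
    """Hypothetical, simplified recursion (no damping, no sampling).

    Iterates est <- v + est - (H @ est) / scale, whose fixed point
    satisfies (H / scale) @ est = v, i.e. est = scale * H^{-1} v.
    Returns both the corrected and the uncorrected estimate.
    """
    est = v.copy()
    for _ in range(num_iters):
        # The Hessian-vector product is scaled DOWN by `scale` here.
        est = v + est - (H @ est) / scale
    # The raw estimate is scaled UP by `scale`; the division undoes that.
    return est / scale, est


# Small symmetric positive-definite matrix standing in for the Hessian.
H = np.array([[4.0, 1.0], [1.0, 3.0]])
v = np.array([1.0, 2.0])
scale = 10.0  # chosen so the eigenvalues of H / scale are below 1

inverse_hvp, raw_estimate = lissa_inverse_hvp(H, v, scale)
exact = np.linalg.solve(H, v)

print("corrected estimate:", inverse_hvp)
print("exact H^-1 v:      ", exact)
print("raw / exact ratio: ", raw_estimate / exact)
```

The recursion only converges when the eigenvalues of H / scale lie strictly between 0 and 2, which is the usual reason for scaling the Hessian down in the first place. The price is that the fixed point is off by exactly the factor scale.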