half Hessian diagonal instead of doubling diagonal in SPSAHessian

zapata-engineering / orqviz

Python package for visualizing the loss landscape of parameterized quantum algorithms.

Apache License 2.0

half Hessian diagonal instead of doubling diagonal in SPSAHessian #57

Closed MSRudolph closed 1 year ago

MSRudolph commented 1 year ago

Reworking a change from this PR. A comparison against Hessians computed with automatic differentiation showed that the SPSA Hessian implementation was correct, but the full Hessian implementation had a doubled diagonal.
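For readers unfamiliar with this class of bug, here is a minimal sketch of one way a doubled diagonal can arise. It is an illustration under assumptions, not the orqviz code: the names `loss`, `second_difference`, `hessian_upper`, and `symmetrize` are hypothetical. The sketch fills only the upper triangle of a finite-difference Hessian and then mirrors it with `upper + upper^T`. That sum counts every diagonal entry twice, so the diagonal has to be halved afterwards.

```python
import math


def loss(params):
    """Hypothetical test loss f(x, y) = sin(x) * cos(y)."""
    x, y = params
    return math.sin(x) * math.cos(y)


def second_difference(f, params, i, j, eps=1e-4):
    """Central finite-difference estimate of d^2 f / (d p_i d p_j)."""
    def shifted(si, sj):
        p = list(params)
        p[i] += si * eps
        p[j] += sj * eps
        return f(p)

    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps ** 2)


def hessian_upper(f, params):
    """Fill only the upper triangle (including the diagonal)."""
    n = len(params)
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            upper[i][j] = second_difference(f, params, i, j)
    return upper


def symmetrize(upper, halve_diagonal):
    """Mirror the upper triangle via upper + upper^T.

    The sum adds each diagonal entry to itself, so the diagonal
    must be halved to recover the true second derivatives.
    """
    n = len(upper)
    full = [[upper[i][j] + upper[j][i] for j in range(n)] for i in range(n)]
    if halve_diagonal:
        for i in range(n):
            full[i][i] /= 2
    return full


params = [0.3, 0.7]
upper = hessian_upper(loss, params)
doubled = symmetrize(upper, halve_diagonal=False)   # diagonal is 2x too large
corrected = symmetrize(upper, halve_diagonal=True)  # matches the true Hessian
```

Comparing `corrected` and `doubled` against a reference Hessian (analytic or from automatic differentiation) exposes the factor of two on the diagonal, while the off-diagonal entries agree in both cases.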

mstechly commented 1 year ago

@MSRudolph Is there a test case you could add to catch this issue in the future if someone makes a change that breaks it again?

MSRudolph commented 1 year ago

Hi @anilagca, thanks for your release. However, this PR contains a fix that we should release ASAP. @mstechly We could write a test where we know the Hessian analytically. I can work on that soon.

mstechly commented 1 year ago

Hey @MSRudolph! I added a preliminary function for testing.

If you could see if this makes sense as a test case and if not, edit it so it does, that would be great. I understand that calculating things at (0,0) might be too much of a trivial/special case.
I noticed that the approximate Hessian calculation has terrible precision. Did you know about it / is it something that we should be concerned about?

MSRudolph commented 1 year ago

Thanks for picking this back up, @mstechly.

I think it would be best to pick a simple sin/cos function for which we can calculate the Hessian with pen and paper. It also makes the computational cost almost non-existent.
The stochastic approximation of the Hessian is very imprecise, yes. The error per entry scales as 1/sqrt(M), where M is the number of reps. If the problem is simple enough, one could actually run as many reps as needed to reach a given precision.
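As an illustration of both points, the sketch below uses f(x, y) = sin(x) cos(y), whose Hessian is easy to derive by hand, and compares it with an averaged stochastic estimate. The estimator is a generic SPSA-style second-order estimate with random ±1 perturbations, written from scratch for this example. It is an assumption about the general technique, not the orqviz implementation, and the names `loss`, `analytic_hessian`, `stochastic_hessian`, and `max_abs_error` are hypothetical.

```python
import math
import random


def loss(params):
    """Test loss f(x, y) = sin(x) * cos(y)."""
    x, y = params
    return math.sin(x) * math.cos(y)


def analytic_hessian(params):
    """Pen-and-paper Hessian of sin(x) * cos(y)."""
    x, y = params
    fxx = -math.sin(x) * math.cos(y)  # equals f_yy for this function
    fxy = -math.cos(x) * math.sin(y)
    return [[fxx, fxy], [fxy, fxx]]


def stochastic_hessian(f, params, n_reps, eps=1e-3, seed=0):
    """Average n_reps SPSA-style Hessian samples.

    Each sample draws two random +/-1 direction vectors d1 and d2,
    estimates d1^T H d2 from four loss evaluations, and spreads it
    over the symmetrized outer product of d1 and d2.
    """
    rng = random.Random(seed)
    n = len(params)
    total = [[0.0] * n for _ in range(n)]
    for _ in range(n_reps):
        d1 = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        d2 = [rng.choice((-1.0, 1.0)) for _ in range(n)]

        def shifted(s1, s2):
            return f([p + eps * (s1 * a + s2 * b)
                      for p, a, b in zip(params, d1, d2)])

        delta = (shifted(1, 1) - shifted(1, 0)
                 - shifted(-1, 1) + shifted(-1, 0))
        scale = delta / (2 * eps ** 2)  # approximates d1^T H d2
        for i in range(n):
            for j in range(n):
                total[i][j] += scale * 0.5 * (d1[i] * d2[j] + d2[i] * d1[j])
    return [[value / n_reps for value in row] for row in total]


def max_abs_error(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


params = [0.3, 0.7]
exact = analytic_hessian(params)
estimate = stochastic_hessian(loss, params, n_reps=10_000)
error = max_abs_error(estimate, exact)
```

Because each sample costs only four evaluations of a cheap function, a test can afford enough reps to bring the error below a chosen tolerance. Under the 1/sqrt(M) scaling, tightening the tolerance by a factor of 10 needs roughly 100 times more reps.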

mstechly commented 1 year ago

@MSRudolph I've updated the tests and added a cost function and an analytical Hessian. I made the function a bit less trivial. If you say this looks good, I think we can merge. @max-radin, something seems to be wrong with CI/CD, or perhaps you need to trigger it manually? I don't know, but please take a look.

MSRudolph commented 1 year ago

This looks great, thanks :)