Closed Michael-T-McCann closed 2 years ago
IIRC it’s a problem with the astra algorithm itself.
I want to say that the forward projector is always pixel based, but the GPU adjoint is ray based? Don’t quote me on that.
So if it is due to the inaccuracy of the astra projector, should we increase the tolerance of the test or just disable them for gpu?
To clarify: we know there is at least a problem with the ASTRA GPU backprojection algorithm itself. The question here why is the new adjoint test failing with a relative error approaching one. I didn't think the ASTRA backprojection was that inaccurate, but I don't have a good intuition for our new adjoint test.
If you are convinced there is no bug in the interface or test, I learn toward increasing the tolerance rather than disabling the test, because it might allow us to catch regressions in the interface in the future.
rel err of 1 is larger than i'd expect; iirc it was something like 10%?
has the astra version changed? it looks like they finally shipped v2 (was on 1.9.9 beta for a looong time)
has the jax/jaxlib version changed? might be worth seeing if something has been modified in the host_callback interface; it is marked as "experimental" and subject to breaking changes
Is the adjoint test new? Did it ever pass on GPU for astra?
No, it isn't new. It passed with very, very loose tolerances; like a relative error of 20% or something.
Sounds like that's no longer the case (and I have no gpu machine to test)
I did pytests on old commits to see where it broke on GPU. It seems the tests pass on commit d86b51f and fail after commit 656f59d (Improve LinearOperator adjoint tests (#72)).
That commit (656f59d) seems to be the one where a new adjoint test (adjoint_test
) was introduced. The old adjoint_AAt_test
seems to pass on GPU.
The new test seems to be testing the adjoint accuracy in a different way than the old test. I don't understand the tests well enough to comment on why one passes and the other fails. Any ideas?
Seems clear that what is needed is just a looser tolerance. The tolerance values for the new and old tests mean different things, so this is not too surprising.
On Thu, Nov 18, 2021, 14:29 Soumendu Majee @.***> wrote:
I did pytests on old commits to see where it broke on GPU. It seems the tests pass on commit d86b51f https://github.com/lanl/scico/commit/d86b51f764ba3641354ae255421d43fdd514d0e7 and fail after commit 656f59d https://github.com/lanl/scico/commit/656f59d784b46c9dcc0e4b888443aa4f836ad1f8 (Improve LinearOperator adjoint tests (#72 https://github.com/lanl/scico/pull/72)).
That commit seems to be the one where a new adjoint test (adjoint_test) was introduced. The old adjoint_AAt_test seems to pass on GPU.
I don't understand the tests well enough to comment on where why one passes and the other fails. Any ideas?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/lanl/scico/issues/103#issuecomment-973286954, or unsubscribe https://github.com/notifications/unsubscribe-auth/ANUBO7CNQY46JLAD4H4E7JTUMVV4RANCNFSM5IFZBO5Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
That sounds like a reasonable solution to me.
It seems currently all the astra tests use the same tolerance irrespective of the test. I found that I can make the test pass by making the tolerance of adjoint_test
15 times. I have pushed the change in a (draft) pull request. Let me know your comments on the pull request.
The ASTRA Toolbox GPU back projector is known to not be a highly accurate adjoint to the forward operator. In the old days, we handled this with a generous tolerance on our adjoint test. After some improvements to that adjoint test, the ASTRA back projector fails very hard (relative error 0.78).
The question: is this a problem with the test, our interface to ASTRA, or ASTRA itself? If it's the latter, let's just skip these tests on GPU.
First step: get a GPU install of SCICO and run
pytest scico/test/linop/test_radon_astra.py
to see if the failures occur for you.