fancompute / workshop-invdesign

📐 Workshop material for optical inverse design and automatic differentiation
MIT License

Is automatic differentiation the same as the adjoint method? #3

Closed · maweibest closed this 4 years ago

maweibest commented 4 years ago

Very good tutorial. It's amazing that inversely designed devices with such non-intuitive structures can realize these optical functionalities. In your implementation, it seems that you simply use automatic differentiation to track all the operations involving the design parameters (the epsilon distribution), including the FDFD simulation, and backpropagate to find the derivatives. How is this connected to the adjoint method? In the slides, you mentioned that the adjoint equation is set up similarly to the EM simulation but with a different source. Where is the adjoint equation in your implementation?

twhughes commented 4 years ago

Thanks for the question! The adjoint method is equivalent to 'reverse-mode' automatic differentiation, which is the default gradient calculation in ceviche. Some of our earlier packages, such as angler, define the adjoint explicitly, but this is quite tedious, especially for complex problems. In the automatic differentiation approach, we hard-code the adjoint for each elementary operation, and then at run time the program constructs one large adjoint problem behind the scenes, based on how we've called each of these operations in our code.
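To make that concrete, here is a minimal sketch (a toy objective, not the workshop's FDFD solver) of what requesting a gradient looks like with autograd:

```python
import autograd.numpy as np   # thin wrapper: same numpy API, but traceable
from autograd import grad

def objective(eps):
    # stand-in for a simulation: a chain of elementary numpy operations
    field = np.sin(eps) ** 2
    return np.sum(field / (1.0 + eps ** 2))

# grad() records each operation at run time, then sweeps the recorded graph
# in reverse, applying each operation's pre-defined adjoint rule.
d_objective = grad(objective)
print(d_objective(np.linspace(0.0, 1.0, 5)))
```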

To give a more specific example: when we import the numpy package, which provides most of the elementary operations, we actually import a numpy wrapper from the autograd package. In autograd, the adjoint equations for most numpy operations are pre-defined in the source code. In ceviche, we've also added the adjoint equations for solving sparse linear systems (solving Ax=b for x), which are needed for the frequency-domain solvers. These are the adjoints most commonly seen in EM simulation papers on the adjoint method.

When a user wants the gradient of an objective function that uses an electromagnetic field solution, the code records each operation that goes into the calculation, looks up the adjoint equation for each one, and uses them to construct a large 'computational graph', which essentially specifies the 'adjoint equation' for that objective function. When we evaluate the gradient at a specific set of parameters, we plug the values from the forward pass into this automatically constructed adjoint equation, which gives us the actual gradient.
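A minimal sketch of registering such an adjoint rule (illustrative only; the names here are made up, a dense solve stands in for ceviche's sparse one) uses autograd's `primitive`/`defvjp` API:

```python
import numpy as onp                     # "raw" numpy for the solve itself
import autograd.numpy as np
from autograd import grad
from autograd.extend import primitive, defvjp

@primitive
def solve(A, b):
    """Forward pass: x = A^{-1} b (dense here for illustration)."""
    return onp.linalg.solve(A, b)

# Vector-Jacobian products: given v = dF/dx from downstream operations, the
# adjoint of the solve is another solve with the transposed system.
defvjp(
    solve,
    # w.r.t. A: the contribution is -(A^{-T} v) x^T
    lambda ans, A, b: lambda v: -onp.outer(onp.linalg.solve(A.T, v), ans),
    # w.r.t. b: just the transposed solve, A^{-T} v
    lambda ans, A, b: lambda v: onp.linalg.solve(A.T, v),
)

# Usage: gradients of any objective built on `solve` now work automatically.
A = onp.array([[2.0, 1.0], [0.0, 3.0]])
objective = lambda b: np.sum(solve(A, b) ** 2)
print(grad(objective)(onp.array([1.0, 2.0])))
```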

While the automatic differentiation approach makes things much easier and simpler from a programming standpoint, the physical interpretation of the adjoint problem (for example, the new source) is still there; it's just buried deeper in the code. As an example, here is where we define the adjoint equation for solving Ax=b for x. If we were differentiating a function f(x), then v in this line would be ∂f/∂x. You can see that, just like in the typical adjoint case from electromagnetics, the adjoint problem involves solving the same system (now transposed), where the right-hand side is replaced by -∂f/∂x = -v. So the physics of the adjoint source is still valid; it's just abstracted into these low-level adjoint 'primitives'.
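For reference, the standard derivation behind that statement (written here as a sketch, in one common sign convention) goes:

```latex
% Forward problem, objective, and downstream sensitivity
A(p)\, x = b, \qquad f = f(x), \qquad v \equiv \frac{\partial f}{\partial x}

% Differentiate the constraint with respect to a parameter p
\frac{\partial A}{\partial p}\, x + A\, \frac{\partial x}{\partial p} = 0
\;\;\Rightarrow\;\;
\frac{\partial x}{\partial p} = -A^{-1}\, \frac{\partial A}{\partial p}\, x

% Chain rule, grouping the solve against v into an adjoint field x_adj
\frac{df}{dp} = v^{\top} \frac{\partial x}{\partial p}
             = x_{\mathrm{adj}}^{\top}\, \frac{\partial A}{\partial p}\, x,
\qquad \text{where } A^{\top} x_{\mathrm{adj}} = -v
```

That last line is exactly the adjoint simulation from the EM literature: the transposed system driven by the source -v.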

For more detail, I'd recommend a recent paper from our group that discusses the connection between automatic differentiation and the adjoint method more deeply, in the context of photonic crystal design. The paper is here.

Hopefully that gives some more context on the connection between the adjoint method and automatic differentiation. TL;DR: they are the same thing, but with automatic differentiation we just specify the adjoint for each individual operation, and the program figures out how to combine them into one big adjoint equation for the whole problem, without needing a human to derive all of that by hand.