Closed by MatteoRobbiati 19 hours ago
Performance is not going to be drastically different: the frameworks are backed by the same (or almost the same) linear algebra, and compiling the graph doesn't take that long.
I wouldn't treat performance as the deciding factor in which one to choose.
We should benchmark the automatic differentiation performance of the three frameworks (TF, PyTorch, JAX) against the alternative of keeping all three only as frontends while using a single one of them to compute the symbolic gradients.
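As a starting point for such a benchmark, here is a minimal sketch of a timing harness. It is not taken from this repository: the helper names (`time_gradient`, `available_backends`, `main`) and the test function `sum(sin(x)^2)` are assumptions made for illustration, and it only times each framework's native gradient, not the "frontend plus single backend" conversion path.

```python
import timeit


def time_gradient(grad_fn, x, repeats=5, number=10):
    """Return the best per-call wall time (seconds) of grad_fn(x)."""
    grad_fn(x)  # warm-up call, so one-off costs such as graph compilation are excluded
    timings = timeit.repeat(lambda: grad_fn(x), repeat=repeats, number=number)
    return min(timings) / number


def available_backends():
    """Build {name: grad_fn} for d/dx sum(sin(x)^2); frameworks not installed are skipped."""
    backends = {}
    try:
        import jax
        import jax.numpy as jnp

        jax_grad = jax.jit(jax.grad(lambda v: jnp.sum(jnp.sin(v) ** 2)))
        # block_until_ready() waits for the asynchronous computation to finish
        backends["jax"] = lambda x: jax_grad(jnp.asarray(x)).block_until_ready()
    except ImportError:
        pass
    try:
        import torch

        def torch_grad(x):
            t = torch.tensor(x, requires_grad=True)
            torch.sum(torch.sin(t) ** 2).backward()
            return t.grad

        backends["torch"] = torch_grad
    except ImportError:
        pass
    try:
        import tensorflow as tf

        def tf_grad(x):
            t = tf.constant(x)
            with tf.GradientTape() as tape:
                tape.watch(t)  # constants are not tracked by default
                y = tf.reduce_sum(tf.sin(t) ** 2)
            return tape.gradient(y, t)

        backends["tensorflow"] = tf_grad
    except ImportError:
        pass
    return backends


def main(size=1000):
    # Input is a plain list of floats, so each timing includes the
    # list-to-tensor conversion (a stand-in for the frontend overhead).
    x = [0.001 * i for i in range(size)]
    for name, grad_fn in available_backends().items():
        print(f"{name}: {time_gradient(grad_fn, x):.2e} s per gradient call")
```

Calling `main()` prints one line per installed framework. The numbers would only be indicative: a real comparison should use gradients of actual circuits and also measure the cost of converting tensors between the frontend and the chosen gradient backend.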