The main speedup comes from the CACG log_pdf computation (~1.5x). The original implementation also contracted along the optimal path, but it recomputed that path at every iteration, which is very time-consuming (see here). We now fix the path instead.
Since we roughly know the shapes of the tensors, we can fix the einsum_path instead of computing the optimal path each time.
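As a minimal sketch of the idea (not the code in this PR), the path can be computed once with `np.einsum_path` and passed to `np.einsum` through `optimize=`. The subscripts, the tensor sizes, and the names `y`, `B_inv` and `quadratic_form` are assumptions made up for illustration; the actual contraction in the CACG log_pdf may look different.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T frames, D channels, K mixture components.
T, D, K = 200, 6, 3
y = rng.standard_normal((T, D)) + 1j * rng.standard_normal((T, D))
B_inv = rng.standard_normal((K, D, D)) + 1j * rng.standard_normal((K, D, D))

# Hypothetical quadratic form y^H B^{-1} y for every component and frame.
subscripts = "td,kde,te->kt"

# Search for the contraction path once, outside the iteration loop.
# `path` is a list that starts with the string 'einsum_path';
# `info` is a printable report of the chosen contraction order.
path, info = np.einsum_path(subscripts, y.conj(), B_inv, y, optimize="optimal")


def quadratic_form(y, B_inv):
    # Reuse the precomputed path instead of searching for it on every call.
    # This assumes the number and order of operands never change.
    return np.einsum(subscripts, y.conj(), B_inv, y, optimize=path)


result = quadratic_form(y, B_inv)

# Reference value from the default (unoptimized) einsum.
reference = np.einsum(subscripts, y.conj(), B_inv, y)
```

Printing `info` shows NumPy's own estimate of the naive and optimized FLOP counts for the chosen path. A fixed path stays valid when the shapes change, but it is only optimal for shapes close to the ones it was computed for.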
Using the optimal path gives the following FLOP speedup in the CACG `_log_pdf`: