There's still a lot that could be done here, but this turned out to be a fairly complete zero-cost compile-time abstraction for Einstein summation.
It lacks any kind of clever optimization, but when used inside a manual tiling loop it is still very expressive and should produce good performance with no overhead. See multiply_einsum_tiles for an example, although it doesn't actually produce good performance today, hopefully due only to https://bugs.llvm.org/show_bug.cgi?id=45863.
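
To make the "einsum inside a manual tiling loop" pattern concrete, here is a minimal sketch in plain C++. It does not use this library's API: einsum_tile_ij_ik_kj and multiply_tiled are hypothetical names, the row-major flat layout and the tile sizes are assumptions, and the hand-written inner loop nest only stands in for what an einsum expression like C(i,j) += A(i,k) * B(k,j) would express over one tile.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an einsum of the form C(i,j) += A(i,k) * B(k,j),
// restricted to the tile [i0, i1) x [j0, j1). All matrices are row-major:
// a is M x K, b is K x N, c is M x N.
template <typename T>
void einsum_tile_ij_ik_kj(const T* a, const T* b, T* c,
                          std::size_t K, std::size_t N,
                          std::size_t i0, std::size_t i1,
                          std::size_t j0, std::size_t j1) {
  for (std::size_t i = i0; i < i1; ++i) {
    for (std::size_t k = 0; k < K; ++k) {
      const T a_ik = a[i * K + k];
      // Innermost loop runs over j so that b and c are read contiguously.
      for (std::size_t j = j0; j < j1; ++j) {
        c[i * N + j] += a_ik * b[k * N + j];
      }
    }
  }
}

// Manual tiling loop: the outer loops pick a tile of the output, and the
// einsum-like kernel above computes the full reduction for that tile.
// The tile sizes are arbitrary assumptions, not tuned values.
template <typename T>
std::vector<T> multiply_tiled(const std::vector<T>& a, const std::vector<T>& b,
                              std::size_t M, std::size_t K, std::size_t N,
                              std::size_t tile_m = 4, std::size_t tile_n = 8) {
  assert(a.size() == M * K && b.size() == K * N);
  assert(tile_m > 0 && tile_n > 0);
  std::vector<T> c(M * N, T(0));
  for (std::size_t i0 = 0; i0 < M; i0 += tile_m) {
    for (std::size_t j0 = 0; j0 < N; j0 += tile_n) {
      // Clamp the tile bounds so partial tiles at the edges are handled.
      einsum_tile_ij_ik_kj(a.data(), b.data(), c.data(), K, N,
                           i0, std::min(i0 + tile_m, M),
                           j0, std::min(j0 + tile_n, N));
    }
  }
  return c;
}
```

The point of the split is that the tiling strategy stays in ordinary loops the caller controls, while the per-tile computation is a single declarative reduction that the compiler can, in principle, keep in registers.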