Linux-cpp-lisp closed this issue 3 years ago
A single dimension changing while leaving the contraction order unchanged isn't true in general, but it does occur in many scenarios. A popular counterexample is the following:
A_ij B_jk C_kl -> D_il
With i, j and l fixed, if k is smaller or larger than the rest of the indices, the optimal order will switch between A(BC) and (AB)C.
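As a rough check of this counterexample, the sketch below counts scalar multiply-adds for the two orders. It assumes dense tensors and a naive cost model in which a pairwise contraction costs the product of the sizes of all indices it touches. The sizes and the helper names chain_costs and best_order are made up for illustration, and this is not opt_einsum's own cost function.

```python
def chain_costs(i, j, k, l):
    """Naive multiply-add counts for the two orders of A_ij B_jk C_kl -> D_il."""
    ab_then_c = i * j * k + i * k * l  # (AB)C: build AB (i x k), then contract with C
    a_then_bc = j * k * l + i * j * l  # A(BC): build BC (j x l), then contract with A
    return ab_then_c, a_then_bc


def best_order(i, j, k, l):
    """Return the cheaper order under the naive cost model."""
    ab_then_c, a_then_bc = chain_costs(i, j, k, l)
    return "(AB)C" if ab_then_c <= a_then_bc else "A(BC)"


# Assumed sizes: i = j = l = 10 stay fixed, only the contracted index k changes.
small_k = best_order(10, 10, 2, 10)
large_k = best_order(10, 10, 50, 10)
print(small_k, large_k)
```

With these assumed sizes the two orders cost the same at k = 10, so any k below that favours (AB)C and any k above it favours A(BC).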
I'll just add that if one takes 'batch dimension' to mean an index of size d that appears on all the inputs and in the output, then in terms of contraction cost, it scales every pairwise contraction cost by d, and so yes, the path optimization is completely independent.
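The scaling argument can be checked numerically. The sketch below adds a batch index z to a three-matrix chain (zij,zjk,zkl->zil, an equation chosen here only for illustration) and compares both contraction orders with and without batching. It uses the same naive cost model as an assumption: each pairwise contraction costs the product of all index sizes involved. pair_cost and order_costs are hypothetical helpers, not part of opt_einsum.

```python
from math import prod


def pair_cost(a, b, sizes):
    """Naive cost of one pairwise contraction: product of every index size involved."""
    return prod(sizes[ix] for ix in set(a) | set(b))


def order_costs(sizes):
    """Costs of (AB)C and A(BC) for zij,zjk,zkl->zil, with z on every tensor."""
    ab_then_c = pair_cost("zij", "zjk", sizes) + pair_cost("zik", "zkl", sizes)
    a_then_bc = pair_cost("zjk", "zkl", sizes) + pair_cost("zij", "zjl", sizes)
    return ab_then_c, a_then_bc


base = {"i": 4, "j": 5, "k": 6, "l": 7}  # assumed fixed sizes
unbatched = order_costs({**base, "z": 1})

for d in (2, 16, 1024):
    batched = order_costs({**base, "z": d})
    # Every candidate path is scaled by the same factor d, so the ranking cannot change.
    assert batched == (d * unbatched[0], d * unbatched[1])

print(unbatched)
```

Because both totals are multiplied by the same d, whichever order is cheaper at d = 1 stays cheaper for every batch size.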
In some of your equations the index z doesn't appear on all inputs, but the statement will probably mostly hold still if the index appears on all the 'expensive' inputs / contractions.
@dgasmith thanks for the example — it seems there, though, that you are contracting over the index whose size you are changing.
@jcmgray thanks for confirming this. My quick-and-dirty reasoning suggests that this should also hold for the einsums I showed above:
uvw,ijk,zuvij->zwk
zuvw,ijk,zuvij->zwk
because z [...]. Does that seem right? I think this reasoning should hold for any einsum where the batch dimension is either on all but one of the operands, or on all but two and those two share no indices.
It will hold when z appears in every contraction - like you have found, if the terms that it doesn't appear on (could be many more than 2) don't share any indices, they will (almost) never be contracted together first - either because the path optimizers don't search outer products, or because outer products rarely appear in good contraction paths.
There are some very perverse cases where the optimal path involves an outer product, e.g. I think when u, v, w, i, j, k are all size 1, but I don't think you need to worry about those.
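To make the outer-product argument concrete, here is a small brute-force sketch over the three possible first contractions of a three-operand einsum. It is an illustration under stated assumptions, not opt_einsum's optimizer: the cost model is the naive product of index sizes, the sizes in fixed are invented, and best_first_pair is a hypothetical helper.

```python
from itertools import combinations
from math import prod


def pair_cost(a, b, sizes):
    """Naive cost of one pairwise contraction: product of every index size involved."""
    return prod(sizes[ix] for ix in set(a) | set(b))


def best_first_pair(equation, sizes):
    """Cost all three pairwise orders of a three-operand einsum; return the cheapest."""
    inputs, output = equation.split("->")
    terms = inputs.split(",")
    assert len(terms) == 3, "this sketch only handles three operands"
    totals = {}
    for p, q in combinations(range(3), 2):
        r = 3 - p - q  # the operand left over for the second step
        a, b, c = terms[p], terms[q], terms[r]
        # The intermediate keeps only the indices that are still needed later.
        kept = "".join(sorted(ix for ix in set(a) | set(b) if ix in c or ix in output))
        totals[(p, q)] = pair_cost(a, b, sizes) + pair_cost(kept, c, sizes)
    return min(totals, key=totals.get), totals


fixed = {"u": 2, "v": 3, "w": 4, "i": 5, "j": 6, "k": 7}  # assumed sizes
for eq in ("uvw,ijk,zuvij->zwk", "zuvw,ijk,zuvij->zwk"):
    winners = {best_first_pair(eq, {**fixed, "z": z})[0] for z in (1, 8, 64, 4096)}
    print(eq, winners)  # one winning first pair per equation, whatever z is

# The perverse case: every non-batch index has size 1.
ones = dict.fromkeys("uvwijk", 1)
perverse_pair, perverse_totals = best_first_pair("uvw,ijk,zuvij->zwk", {**ones, "z": 100})
print(perverse_pair, perverse_totals)
```

With the assumed sizes, contracting uvw with ijk first is an outer product whose second step is as expensive as the whole problem, so it never wins. When every non-batch index has size 1, the outer product costs 1 and only the second step scales with z, which is why it comes out cheapest in that corner case under this model.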
That makes sense @jcmgray, thanks for your help!
i,j,k for us is always >1 (we handle 1 in our own special case). I guess in the case where all of those dimensions are 1 almost any contraction will run fast anyway.
This appears to be closed, thank you for the discussion!
Hi all,
Thanks for your work on this great project!
I'm working on applying opt_einsum in e3nn, an application where I need to predetermine a good contraction path for later use. Typical examples of einsums I need to do this for are:
where all dimensions except z (the batch dimension) are of fixed, known size.
I've run a small experiment to see how the optimal contraction varies with z (using the 'optimal' optimization strategy) and got the result that the optimal contraction is completely unaffected:
Here each line is for a different einsum. "Path taken" just indexes unique (to each einsum) paths returned by opt_einsum.contract_path.
Is this universally true? That is, when all shapes except a batch dimension that is not contracted over are fixed, is the optimal path invariant to the size of the batch dimension?
Thanks for your time!