dgasmith / opt_einsum

⚡️Optimizing einsum functions in NumPy, Tensorflow, Dask, and more with contraction order optimization.
https://dgasmith.github.io/opt_einsum/

Possible bug in RandomGreedy path #137

Closed emprice closed 4 years ago

emprice commented 4 years ago

I have found an interesting quirk of the RandomGreedy optimizer that may not have been intended. In my application, I need to repeat two contractions several times, so I use the same optimizer to compute both paths in advance and cache the results. If I use RandomGreedy, I get the wrong shape for the second contraction, which leads me to believe that some internal state of RandomGreedy persists into the computation of the second path. Below is a minimal working example.

import torch
import opt_einsum as oe

# Two different contractions over small tensors; both reduce to a scalar.
eqn1 = 'abc,bef,ahi,ehk,uvc,vwf,uxi,wxk->'
eqn2 = 'abd,beg,ahj,ehl,mndo,npgq,mrjs,prlt,uvo,vwq,uxs,wxt->'
shapes1 = [(1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2),
           (1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2)]
shapes2 = [(1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2, 2), (1, 1, 2, 2),
           (1, 1, 2, 2), (1, 1, 2, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2)]
tens1 = [torch.randn(*sh) for sh in shapes1]
tens2 = [torch.randn(*sh) for sh in shapes2]

for Opt in [oe.DynamicProgramming, oe.RandomGreedy]:
    # A single optimizer instance is reused for both equations.
    opt = Opt(minimize='flops')

    expr1 = oe.contract_expression(eqn1, *[ten.shape for ten in tens1], optimize=opt)
    expr2 = oe.contract_expression(eqn2, *[ten.shape for ten in tens2], optimize=opt)

    print(expr1(*tens1))
    print(expr2(*tens2))

The output of this program, using the latest PyPI release, is:

tensor(-0.0187)
tensor(0.4303)
tensor(-0.0187)
tensor([[[[[-0.4418]],

          [[ 1.2737]]]]])

The first and third results should match (and do), but so should the second and fourth (and they don't).

jcmgray commented 4 years ago

Ah yes, this is maybe not clear in the docs, but a PathOptimizer instance is not generally reusable except on the same equation and shapes. What's happening here is that opt finds a better contraction path on one of the equations, so when it searches on the other it doesn't find anything better and supplies the old 'best' path for that equation. DynamicProgramming doesn't keep any state relating to the actual contraction, so it happens to work here.
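
To see that state directly, here is a small sketch (note that opt.best is internal RandomOptimizer state rather than documented API, so this is an assumption about the current internals):

import opt_einsum as oe

opt = oe.RandomGreedy(minimize='flops')

# The first search populates the optimizer's internal best-so-far record.
oe.contract_path(eqn1, *tens1, optimize=opt)
print(opt.best['flops'])  # cost of the best path found for eqn1

# Searching eqn2 with the same instance compares its trials against
# eqn1's cached best; if none beat it, eqn1's path is handed back for eqn2.
oe.contract_path(eqn2, *tens2, optimize=opt)
print(opt.best['flops'])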

If you put another opt = Opt(minimize='flops') in between the expr1 = ... and expr2 = ... lines, or use separate opt1 and opt2 instances, everything should work as expected.
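
For instance, a minimal sketch of the two-instance version of the loop above:

for Opt in [oe.DynamicProgramming, oe.RandomGreedy]:
    # One optimizer instance per equation, so no state leaks between them.
    opt1 = Opt(minimize='flops')
    opt2 = Opt(minimize='flops')

    expr1 = oe.contract_expression(eqn1, *[ten.shape for ten in tens1], optimize=opt1)
    expr2 = oe.contract_expression(eqn2, *[ten.shape for ten in tens2], optimize=opt2)

    print(expr1(*tens1))
    print(expr2(*tens2))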

The answer here might be to save the first equation encountered and raise an informative error if a different one is then supplied.

dgasmith commented 4 years ago

We should definitely check this in the __call__. It is simple enough to record the expression and shapes on the first call and check them on subsequent calls. I don't like how the API feels with this, since in some cases reuse works and in others it doesn't, but I'm not sure how else to do it beyond keeping per-contraction description state within the class, which feels messy.
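
A rough sketch of that guard (the subclass name, the _signature attribute, and the exact key are illustrative only, assuming the PathOptimizer __call__(inputs, output, size_dict, memory_limit) interface):

import opt_einsum as oe

class CheckedRandomGreedy(oe.RandomGreedy):

    _signature = None  # recorded on first call, checked on every later call

    def __call__(self, inputs, output, size_dict, memory_limit=None):
        # Describe the contraction by its inputs, output, and index sizes.
        signature = (tuple(map(frozenset, inputs)), frozenset(output),
                     tuple(sorted(size_dict.items())))
        if self._signature is None:
            self._signature = signature
        elif signature != self._signature:
            raise ValueError(
                "This optimizer instance was already used on a different "
                "contraction; create a fresh instance instead.")
        return super().__call__(inputs, output, size_dict, memory_limit)

With a guard like this, the MWE above would raise an informative error on the second contract_expression call instead of silently reusing eqn1's path.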