dgasmith / opt_einsum

⚡️Optimizing einsum functions in NumPy, TensorFlow, Dask, and more with contraction order optimization.
https://dgasmith.github.io/opt_einsum/
MIT License

Thoughts on making the numpy dependency optional? #203

Open janeyx99 opened 1 year ago

janeyx99 commented 1 year ago

Proposal

Make the numpy dependency optional, if possible.

Why?

Minimizing dependencies is a general goal, as it allows a bigger audience to reap the benefits of this library. More specifically, some of us are interested in making opt_einsum a hard dependency for torch, but we would like to keep numpy optional. (If you're curious why torch does not have a hard dependency on numpy, see https://github.com/pytorch/pytorch/issues/60081, the tl;dr being the last comment.)

A hard dependency would mean all torch users would get the benefits of opt_einsum right away without thinking too hard/needing to manually install opt_einsum themselves.

Alternatives

We could also have torch vendor in opt_einsum, but that is increased complexity/maintenance + we would like to automatically subscribe to improvements in opt_einsum!

dgasmith commented 1 year ago

I think this is entirely possible as we have a fairly weak dependence on NumPy beyond testing. Feel free to take a crack at it; I can look into removing NumPy this weekend.

jcmgray commented 1 year ago

Yes, I agree this would be a nice thing to do. From what I can tell, the minor problem points where one can't just import/mock numpy lazily are:

jcmgray commented 1 year ago

Here's an alternative pure python implementation of ssa_to_linear:

def ssa_to_linear_A(ssa_path):
    # Convert an SSA-style contraction path, where tensors are referred to
    # by their unique SSA id, into a linear path, where each contraction
    # refers to positions in the current list of remaining tensors.
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        # locate each SSA id in the list of live tensors
        # (list.index is a linear scan, so this is quadratic overall)
        con = sorted(map(ids.index, scon))
        # drop the contracted tensors, highest position first ...
        for j in reversed(con):
            ids.pop(j)
        # ... and record the new intermediate under its SSA id
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
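
For example (a made-up three-tensor SSA path to illustrate the conversion, not taken from the issue):

# SSA ids 0-2 are the inputs and 3 is the first intermediate; the linear
# path re-indexes into the shrinking list of remaining tensors.
assert ssa_to_linear_A([(1, 2), (0, 3)]) == [[1, 2], [0, 1]]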

It's actually faster until one gets to quite large numbers of inputs:

[benchmark plot: ssa_to_linear runtime vs. number of inputs on the x-axis]

At the largest size here the main greedy algorithm itself takes ~7 s, so the extra slowdown is not ideal but is still only a small part of the whole path finding. Maybe the implementation can be improved too.

dgasmith commented 1 year ago

Nice check!

n is unlikely to go over 1000 except in extreme cases, so we could also keep two algorithms and switch between them:

if len(n) > 1000 and has_numpy:
    return _numpy_impl(*args)
else:
    return _python_impl(*args)
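
For reference, a minimal sketch of how a has_numpy flag like the one above could be set without making NumPy a hard requirement (this detection logic is an assumption, not existing opt_einsum code):

# Probe for NumPy once at import time; fall back to the pure Python
# implementation whenever it is unavailable.
try:
    import numpy  # noqa: F401
    has_numpy = True
except ImportError:
    has_numpy = False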

The library isn't strongly type-hinted yet (still plenty of Any running around~). I would vote we replace np.ndarray with some library-agnostic Protocol type.
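
A rough sketch of what such a library-agnostic Protocol could look like (the name TensorLike and the attributes chosen here are illustrative assumptions, not an agreed interface):

from typing import Protocol, Tuple

class TensorLike(Protocol):
    # Minimal structural type: anything exposing shape and ndim
    # (numpy arrays, torch tensors, dask arrays, ...) would satisfy it.
    @property
    def shape(self) -> Tuple[int, ...]: ...

    @property
    def ndim(self) -> int: ...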

jcmgray commented 10 months ago

I actually realized you can implement ssa_to_linear in $O(n \log n)$ time:

import bisect

def ssa_to_linear_bis(ssa_path, N=None):
    # Same conversion as above, but `ids` always stays sorted (pops preserve
    # order and every appended SSA id is larger than anything already in the
    # list), so each lookup can be a binary search instead of a linear scan.
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        con = sorted([bisect.bisect_left(ids, s) for s in scon])
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

This is significantly faster than the numpy version throughout the whole range.
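
As a quick sanity check that the two pure-Python versions agree (a sketch assuming both functions above are in scope; the random-path helper is just for illustration):

import random

def random_ssa_path(n):
    # Build a random pairwise SSA contraction path over n inputs, purely
    # to cross-check the two conversions against each other.
    remaining = list(range(n))
    next_id = n
    path = []
    while len(remaining) > 1:
        i, j = sorted(random.sample(remaining, 2))
        remaining.remove(j)
        remaining.remove(i)
        path.append((i, j))
        remaining.append(next_id)
        next_id += 1
    return path

path = random_ssa_path(100)
assert ssa_to_linear_A(path) == ssa_to_linear_bis(path)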

dgasmith commented 15 hours ago

@janeyx99 Can you test torch against current main?

@jcmgray This code snippet was an awesome replacement for the previous numpy version.