dgasmith / opt_einsum

⚡️Optimizing einsum functions in NumPy, Tensorflow, Dask, and more with contraction order optimization.
https://dgasmith.github.io/opt_einsum/
MIT License

Thoughts on making the numpy dependency optional? #203

Closed: janeyx99 closed this issue 2 months ago

janeyx99 commented 2 years ago

Proposal

Make the numpy dependency optional, if possible.

Why?

Minimizing dependencies is a general goal, as it allows a bigger audience to reap the benefits of this library. More specifically, some of us are interested in making opt_einsum a hard dependency for torch, but we would like to keep numpy optional. (If you're curious why torch does not have a hard dependency on numpy, see https://github.com/pytorch/pytorch/issues/60081; the tl;dr is in the last comment.)

A hard dependency would mean all torch users would get the benefits of opt_einsum right away, without having to think about it or manually install opt_einsum themselves.

Alternatives

We could also have torch vendor opt_einsum, but that adds complexity and maintenance burden, and we would like to automatically pick up improvements in opt_einsum!

dgasmith commented 2 years ago

I think this is entirely possible, as we have a fairly weak dependence on NumPy beyond testing. Feel free to take a crack at it; I can look into removing NumPy this weekend.

jcmgray commented 2 years ago

Yes, I agree this would be a nice thing to do. From what I can tell, the minor problem points where one can't just import/mock numpy lazily are:

jcmgray commented 2 years ago

Here's an alternative pure python implementation of ssa_to_linear:

def ssa_to_linear_A(ssa_path):
    # number of original inputs: every contraction consumes len(scon) ids
    # and produces exactly one new id
    N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        # map each SSA id to its current position; ids.index is a linear scan
        con = sorted(map(ids.index, scon))
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path
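
To make the SSA-to-linear conversion concrete, here is a small made-up example (the path below is purely illustrative): SSA ids are never reused, since each intermediate gets a fresh id, while the linear path indexes into the shrinking list of remaining tensors.

print(ssa_to_linear_A([(0, 1), (2, 3), (4, 5)]))
# -> [[0, 1], [0, 1], [0, 1]]  (each step contracts the first two remaining tensors)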

It's actually faster until one gets to quite large numbers of inputs (x-axis):

[plot: runtime of the ssa_to_linear implementations vs. number of inputs]

At the largest size here the actual main greedy algorithm takes ~7 sec, so the extra slowdown is not ideal but still only a small part of the whole path finding. Maybe the implementation can be improved, too.

dgasmith commented 2 years ago

Nice check!

n is unlikely to go over 1000 except in extreme cases, so we could also keep two algorithms and switch between them:

if n > 1000 and has_numpy:  # n: number of input tensors
    return _numpy_impl(*args)
else:
    return _python_impl(*args)
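
For completeness, has_numpy in a sketch like this would typically come from a guarded import along these lines (illustrative only, not opt_einsum's actual code):

try:
    import numpy  # noqa: F401
    has_numpy = True
except ImportError:
    has_numpy = False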

The library isn't strongly type hinted yet (still plenty of Any running around~). I would vote we replace np.ndarray with some library-agnostic Protocol type.
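
As a rough sketch of what such a Protocol could look like (the attribute set here is an assumption for illustration, not what the library settled on):

from typing import Protocol, Tuple

class ArrayLike(Protocol):
    # structural typing: anything exposing a numpy-style .shape matches,
    # whether it comes from numpy, torch, tensorflow, dask, etc.
    @property
    def shape(self) -> Tuple[int, ...]: ...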

jcmgray commented 1 year ago

I actually realized you can implement ssa_to_linear in $n\log(n)$ time:

import bisect

def ssa_to_linear_bis(ssa_path, N=None):
    if N is None:
        N = sum(map(len, ssa_path)) - len(ssa_path) + 1
    ids = list(range(N))
    path = []
    ssa = N
    for scon in ssa_path:
        # ids stays sorted (pops keep order, appended ids only increase),
        # so a binary search replaces the linear ids.index scan
        con = sorted([bisect.bisect_left(ids, s) for s in scon])
        for j in reversed(con):
            ids.pop(j)
        ids.append(ssa)
        path.append(con)
        ssa += 1
    return path

This is significantly faster than the numpy version throughout the range.
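
As a quick sanity check (a toy, made-up path, not a benchmark), the two pure-python variants can be compared directly:

ssa_path = [(0, 2), (1, 3)]  # made-up 3-input SSA path
assert ssa_to_linear_A(ssa_path) == ssa_to_linear_bis(ssa_path) == [[0, 2], [0, 1]]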

dgasmith commented 4 months ago

@janeyx99 Can you test torch against current main?

@jcmgray This code snippet was an awesome replacement for the previous numpy version.

dgasmith commented 2 months ago

Added in 3.4.

janeyx99 commented 1 month ago

This is epic! Thanks for doing this! This way, we can make opt-einsum an actual dependency of torch :D I just installed your release and, as desired, it does not pull in numpy.

dgasmith commented 1 month ago

@janeyx99 Awesome, let us know if we can do anything else.