I have an application where there is a large tensor contraction in the inner loop, with some arrays that are fixed and others that are updated between iterations. Operations only involving the constant terms could be done ahead of time (outside the loop), but this may or may not actually improve performance.
It could be nice if opt-einsum had a way to mark some arguments as constant, in which case the cost of contracting them could be neglected when computing the cost of paths only involving these arguments.
@shoyer Interesting, can you write up a small example that I can hack on?
I don't know how interesting an example it is, but here's the one that turned up in our project:
sizes = {'a': 29, 'b': 29, 'c': 4, 'd': 4, 'f': 3, 'i': 3, 'j': 10, 'o': 3, 's': 50}
einsum_string = 'sabj,sfab,ifo,cdj,siac->sobd'
The first three terms are fixed; the last two vary in each iteration.
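For anyone who wants to hack on it, a minimal setup (assuming NumPy and opt_einsum are available) would be:

import numpy as np
import opt_einsum as oe

sizes = {'a': 29, 'b': 29, 'c': 4, 'd': 4, 'f': 3, 'i': 3, 'j': 10, 'o': 3, 's': 50}
einsum_string = 'sabj,sfab,ifo,cdj,siac->sobd'

# build a random array for each input term, matching the index sizes
terms = einsum_string.split('->')[0].split(',')
arrays = [np.random.rand(*[sizes[idx] for idx in term]) for term in terms]

# inspect the optimized path and its cost estimates
path, info = oe.contract_path(einsum_string, *arrays)
print(info)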
Sorry it took me a bit to get back to this. If you look at the full contraction we see:
  Complete contraction:  sabj,sfab,ifo,cdj,siac->sobd
         Naive scaling:  9
     Optimized scaling:  6
      Naive FLOP count:  1.453e+09
  Optimized FLOP count:  1.482e+07
   Theoretical speedup:  98.028
  Largest intermediate:  5.046e+05 elements
--------------------------------------------------------------------------------
scaling   BLAS           current                        remaining
--------------------------------------------------------------------------------
   6      TDOT      siac,ifo->focas         sabj,sfab,cdj,focas->sobd
   6      False     focas,sfab->ocabs       sabj,cdj,ocabs->sobd
   6      False     ocabs,sabj->ocjbs       cdj,ocjbs->sobd
   6      TDOT      ocjbs,cdj->sobd         sobd->sobd
The possible intermediates to form are:

sabj,sfab,ifo->objia    (N^7 scaling, 13x cost)
sabj,sfab->abjf         (N^7 scaling, 12x cost)
sfab,ifo->sabio         (N^6 scaling, 0.9x cost)
sabj,ifo->sabjifo       (N^9 scaling, ...)

So it doesn't look quite worth it for this particular example; I think I know how to build something that automatically detects this, however.
> So it doesn't look quite worth it for this particular example
Sorry, I should have clarified -- we already did this analysis by hand and came to the same conclusion :). But yes, in general this would be nice to have. I suspect the pattern of updating only some tensors inside an optimization and/or simulation loop is pretty common.
Well, at least we are coming up with the same thing! Any thoughts on the API if we have a function that looks like the following:

constant_contract(expression, constant_tensors, tensors)

where the order in the expression would be constant_tensors + tensors. This could return:

(new_expression, temporaries)

so you could call contract(new_expression, *(temporaries + args)), or the whole thing could be wrapped in a closure so that you simply call func(tensors).
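As a rough illustration of the intended usage pattern (the names and the choice of constant-only subexpression here are hypothetical, picked by hand), the manual equivalent today would be:

import numpy as np
import opt_einsum as oe

# two constant tensors and one tensor that changes every iteration
A = np.random.rand(10, 10)  # constant
B = np.random.rand(10, 10)  # constant

# contract the constant-only piece once, outside the loop; this plays
# the role of the returned temporaries, and 'ik,k->i' the role of
# new_expression
temporary = oe.contract('ij,jk->ik', A, B)

for _ in range(100):
    x = np.random.rand(10)  # varying tensor
    out = oe.contract('ik,k->i', temporary, x)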
I think this is also possible now, to an extent, using dask as the backend with #17, since it will reuse intermediates if the arrays are kept the same.
Good point. I wonder if it would be worth revisiting this in a more explicit manner, however. Thinking about the ML libraries, it would be much better if we could reuse parts that stay on a GPU.
Indeed. I have a bit of my library that can view any tensor network as a scipy.LinearOperator, allowing for various nice iterative decompositions just involving matvec. Obviously between matvec calls the tensors are all constant apart from the new vector. I think the contractions generally start with the vector and contract everything into it (and so there are no reusables), but it would be nice to automatically detect when this is not the case.
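A generic sketch of the pattern (illustrative only, not the actual library code): wrap the fixed tensors in a scipy.sparse.linalg.LinearOperator whose matvec contracts them with each incoming vector.

import numpy as np
import scipy.sparse.linalg as spla
import opt_einsum as oe

# constant tensors defining the operator; only the vector changes
# between matvec calls
A = np.random.rand(8, 8)
B = np.random.rand(8, 8)

def matvec(x):
    # contract the fixed tensors with the (reshaped) incoming vector;
    # this computes (A kron B) @ x as a single tensor contraction
    return oe.contract('ij,kl,jl->ik', A, B, x.reshape(8, 8)).ravel()

op = spla.LinearOperator(shape=(64, 64), matvec=matvec, dtype=np.float64)

# iterative routines now only ever call matvec
w = spla.eigs(op, k=1, return_eigenvectors=False)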
My suggestion for an API would be:
>>> # assume we have four shapes and matching arrays with 2 & 4 constant
>>> expr = contract_expression("abc,bcd,cde,ef", shape1, array2, shape3, array4)
>>> out = expr(array1, array3)
I.e. if you supply arrays to a contract_expression it internalizes them, and then to the expression itself you only supply arrays matching the remaining shapes.
EDIT: I think this API would be pretty easy to implement, but short of just using dask internally, working out the reusables would be a bit more involved.
What if someone does contract_expression("i,i->", (1, 2, 3), np.array([1, 2, 3])) expecting to take an inner product? I think I was stumped for a clean API last time this came up.
Ah yes. I think you'd have to explicitly enforce an 'if it doesn't have a shape, it's not an array' policy. Producing helpful errors to that effect should be possible, since shapes are only 1D and wouldn't match the expression if viewed as arrays, but I see your point.
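To make that rule concrete, a hypothetical check could be as simple as:

def is_constant_array(arg):
    # policy: anything exposing a .shape attribute is a concrete
    # (constant) array; a bare tuple/list is taken to be a shape
    return hasattr(arg, 'shape')

Under this rule the (1, 2, 3) above is unambiguously a shape, even if the user meant it as a vector, which is exactly the pitfall being pointed out.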
Another option would just be:
contract_expression("abc,bcd,cde,ef", shape0, array1, shape2, array3, constant=[1, 3])
I could be more comfortable with that. shape0/shape2 could also be arrays with shape attributes as well.
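A rough sketch of how that signature could behave, layered on the current machinery (constant= is the proposal, not an implemented keyword, and no constant-only pre-contraction is attempted here):

import numpy as np
import opt_einsum as oe

def contract_expression_const(expression, *ops, constant=()):
    # sketch: positions listed in `constant` are concrete arrays to
    # internalize; every other position is just a shape
    constant = set(constant)

    def expr(*arrays):
        it = iter(arrays)
        operands = [op if i in constant else next(it)
                    for i, op in enumerate(ops)]
        return oe.contract(expression, *operands)

    return expr

# arrays at positions 1 and 3 are fixed; calls then supply only the
# arrays matching shape0 and shape2
shape0, shape2 = (2, 3, 4), (4, 5, 6)
array1 = np.random.rand(3, 4, 5)  # term 'bcd'
array3 = np.random.rand(6, 7)     # term 'ef'

expr = contract_expression_const("abc,bcd,cde,ef",
                                 shape0, array1, shape2, array3,
                                 constant=[1, 3])
out = expr(np.random.rand(*shape0), np.random.rand(*shape2))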