aesara-devs / aesara

Aesara is a Python library for defining, optimizing, and efficiently evaluating mathematical expressions involving multi-dimensional arrays.
https://aesara.readthedocs.io
Other
1.18k stars 155 forks source link

Remove theano.dot #52

Closed brandonwillard closed 3 years ago

brandonwillard commented 4 years ago

The __init__.py-level theano.dot function is awkward and—as far as I can tell—completely unnecessary. What it appears to be doing is supposed to be done in class-level implementations of __dot__ and __rdot__ (e.g. an interface/mixin class that checks both objects for the appropriate methods).

This function should be removed and the classes that implement __dot__ and __rdot__ (e.g. theano.tensor.var._tensor_py_operators) should handle this.

twiecki commented 4 years ago

But there is still tt.dot(x, y) then, right? Or only x.dot(y)?

brandonwillard commented 4 years ago

There's already a distinct theano.tensor.basic.dot function on top of the class-level _tensor_py_operators.dot method, which makes this theano.dot all the more confusing.

twiecki commented 4 years ago

Yep it really does.

On Mon, Sep 28, 2020, 19:45 Brandon T. Willard notifications@github.com wrote:

There's already a distinct theano.tensor.basic.dot function on top of the class-level _tensor_py_operators.dot method, which makes this theano.dot all the more confusing.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pymc-devs/Theano-PyMC/issues/52#issuecomment-700184173, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFETGBETVYWRAA5FGPDWELSIDDUBANCNFSM4R32JGEQ .

eigenfoo commented 3 years ago

I'd be interested in taking this on, but am unclear on some things.

What it appears to be doing is supposed to be done in class-level implementations of __dot__ and __rdot__ (e.g. an interface/mixin class that checks both objects for the appropriate methods).

Please correct me if I'm wrong: Theano tensors derive a lot of operators (addition, comparisons, etc.) via mixin classes, for example the theano.tensor.var._tensor_py_operators class. theano.dot is barely used, but once it's gone the only way to take dot products will be with these mixins.

If my understanding is right, I have two questions so that I can check that I'm not inadvertently deprecating dot products (links would be appreciated!):

  1. How many mixin classes are there that supply tensor operators?
  2. How many different kinds/classes of Theano tensors are there? Do they all use the same mixins?
brandonwillard commented 3 years ago

I think that's the mixin, other than its subset clone: theano.sparse.basic._sparse_py_operators.

While I'm thinking about it, we should either use a single mixin for both—with some overrides, if necessary—or break theano.tensor.var._tensor_py_operators apart so that its shared methods can be used by both the sparse and non-sparse types. Really, the sparse tensors should use exactly the same interfaces/types (and much of the same code) as the dense tensors, since we're supposed to be modeling high-level tensors with these interfaces, and not their implementation details. (This needs to become its own issue.)

Anyway, my impression is that theano.dot can be removed/replaced with versions of theano.tensor.var.[_tensor_py_operators | _sparse_py_operators].__dot__ that worked like the built-in dunder methods (e.g. have them return NotImplemented and add logic that checks __rdot__ in such cases).