namedtensor / notation


RNN problem #29

Closed by davidweichiang 3 years ago

davidweichiang commented 3 years ago

Section 6.2 deals with the problem, most visible in RNNs, where you want a linear transformation from an axis to itself, which currently requires renaming. There are four solutions proposed:

and a fifth solution would be:

I think I'd like to cut this down to just one, and to make it part of the main document.
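The renaming problem can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: a "named tensor" here is an array plus a list of axis names, and `contract`, `rename`, and the axis name `hidden_out` are hypothetical helpers and names made up for this sketch, not the notation from the document.

```python
import numpy as np

def contract(a, a_names, b, b_names, name):
    """Contract the axis called `name`, which must appear in both operands."""
    out = np.tensordot(a, b, axes=(a_names.index(name), b_names.index(name)))
    # Remaining axes of `a` come first, then remaining axes of `b`.
    out_names = [n for n in a_names if n != name] + [n for n in b_names if n != name]
    return out, out_names

def rename(names, old, new):
    """Return a copy of `names` with `old` replaced by `new`."""
    return [new if n == old else n for n in names]

hidden = 4
rng = np.random.default_rng(0)
h = rng.standard_normal(hidden)            # axes: hidden
W = rng.standard_normal((hidden, hidden))  # axes: hidden, hidden_out

h_names = ["hidden"]
# W maps the hidden axis to itself, but its two axes cannot share one name,
# so the output axis gets a second (hypothetical) name.
W_names = ["hidden", "hidden_out"]

new_h, new_names = contract(h, h_names, W, W_names, "hidden")
# The result is named hidden_out and must be renamed before the next RNN step.
new_names = rename(new_names, "hidden_out", "hidden")
```

The final `rename` call is the step the proposals in this issue are trying to avoid.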

davidweichiang commented 3 years ago

Branch https://github.com/namedtensor/notation/tree/dual shows what some of these options look like for RNN, self-attention, and MVN.

I think the two-name contraction operator might have won. It's simple and it eliminates the most renamings (all but one). It just doesn't look very nice, but hopefully it wouldn't be used too much.

davidweichiang commented 3 years ago

Do any of these look better?

[image]

@srush @boazbk

srush commented 3 years ago

The first is definitely my favorite.

Some people like that section though :)

https://twitter.com/ocramz_yo/status/1339985255185477632

boazbk commented 3 years ago

I also prefer the first!

davidweichiang commented 3 years ago

First meaning matrix-shaped axes, or the two-name contraction?

boazbk commented 3 years ago

I mean that [image] is a good choice for the dot product, where we match the ax1 axis of A with the ax2 axis of B.
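As a rough illustration of a contraction that matches the ax1 axis of A with the ax2 axis of B, here is a minimal NumPy sketch. The helper `contract2`, the list-of-names representation, and the extra axis names `batch` and `out` are assumptions made for this example; they are not part of the proposed notation.

```python
import numpy as np

def contract2(a, a_names, ax1, b, b_names, ax2):
    """Sum over axis `ax1` of `a` paired with axis `ax2` of `b`."""
    out = np.tensordot(a, b, axes=(a_names.index(ax1), b_names.index(ax2)))
    # Surviving axes keep their original names: those of `a`, then those of `b`.
    out_names = [n for n in a_names if n != ax1] + [n for n in b_names if n != ax2]
    return out, out_names

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # axes: batch, ax1
B = rng.standard_normal((4, 5))  # axes: ax2, out

C, C_names = contract2(A, ["batch", "ax1"], "ax1", B, ["ax2", "out"], "ax2")
```

Because the two contracted axes are named separately, neither operand has to be renamed before the product is taken.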

srush commented 3 years ago

(also what I meant)

davidweichiang commented 3 years ago

Ah, ok. And among the different solutions, do you like that two-name contraction the best too?

srush commented 3 years ago

Yes. It's not the most interesting, but I think it fits our goals.