Trying to write function definitions in a consistent way

davidweichiang commented 3 years ago

The usual way using \rightarrow and \mapsto is not ideal because:

- with more than a couple of arguments, it's difficult to align arguments with types
- including the arguments after the semicolon into the signature makes it even longer
- several functions (e.g., conv2d) use intermediate variables
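
For reference, a minimal sketch of what the \rightarrow / \mapsto style looks like for a three-argument function. The function chosen, the axis names (seq, key, val), and the typesetting are assumptions for illustration, not copied from the PR:

```latex
% Hypothetical illustration of the "usual" two-line style.
% Axis names seq, key, val are assumptions, not taken from the PR.
\begin{align*}
\text{Attention} &\colon
  \mathbb{R}^{\mathsf{key}} \times
  \mathbb{R}^{\mathsf{seq} \times \mathsf{key}} \times
  \mathbb{R}^{\mathsf{seq} \times \mathsf{val}}
  \rightarrow \mathbb{R}^{\mathsf{val}} \\
(Q, K, V) &\mapsto
  \underset{\mathsf{seq}}{\operatorname{softmax}}
  \left( \frac{Q \underset{\mathsf{key}}{\odot} K}{\sqrt{|\mathsf{key}|}} \right)
  \underset{\mathsf{seq}}{\odot} V
\end{align*}
```

Even with three arguments, the reader has to count positions to match Q, K, V against their types on the line above, and there is no natural place for an intermediate variable.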

davidweichiang commented 3 years ago

This scheme presents a function as a chronological sequence of type assertions and equations. I don't like that there's nothing to introduce the definition, and the placement of the return type at the end is also a little weird looking.

@srush @boazbk

srush commented 3 years ago

Unfortunately I don't like the type at the end, particularly if it is only on some functions. I also find the "U =..." before the function definition to be strange and a bit incorrect? This is hard.

I know Boaz wasn't a fan, but I still think the function type signature is the best way to go. It seems concise and correct.
Perhaps if you indent the local variable "U=" under the function it will look less strange. Or maybe have a mandatory "where "?
I think the ";" was misguided on my part. We could write those as just functions.
I am worried the complexity of MultiAttention is causing problems. I feel like that case is rare (4 different weights) and it is okay for that function to be a special case. Or we could just cheat and write it as one tensor W^{mode[4], hidden, hidden'} and do renames of the outputs.

davidweichiang commented 3 years ago

OK, I can restore the ordering so that the function is back on top; maybe adding "where" will make the ordering easier to follow.

I think the semicolon is good, but am not sure if there's an accepted way of writing a function signature with semicolons. The case of maxpool2d is particularly tricky, because the return type depends on parameters kh and kw.
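
To make the maxpool2d difficulty concrete, here is a minimal sketch of how the output axis sizes depend on kh and kw. It assumes non-overlapping windows (stride equal to the kernel size) and axes named height and width; the helper name is hypothetical and none of this is taken from the PR itself:

```python
def maxpool2d_output_axes(input_axes, kh, kw):
    """Return the named axis sizes of a max-pooled tensor.

    Assumption (not from the PR): windows do not overlap, so the
    height axis shrinks by a factor of kh and the width axis by kw.
    """
    height = input_axes["height"]
    width = input_axes["width"]
    if height % kh != 0 or width % kw != 0:
        raise ValueError("kh and kw must divide the height and width sizes")
    # Copy so that all other axes (e.g. channels) pass through unchanged.
    output_axes = dict(input_axes)
    output_axes["height"] = height // kh
    output_axes["width"] = width // kw
    return output_axes


# The same input type yields different output types for different kh, kw.
image_axes = {"height": 32, "width": 32, "chans": 3}
print(maxpool2d_output_axes(image_axes, 2, 2))
print(maxpool2d_output_axes(image_axes, 4, 8))
```

Because the result type changes with kh and kw, a single fixed signature cannot describe the function unless those two parameters appear somewhere in it.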

boazbk commented 3 years ago

It might be better to use a subscript for parameters such as kh and kw that are not really learned parameters but more like hardwired constants.

srush commented 3 years ago

I agree with all of the above and subscripts for small params. For weights I don't see why they wouldn't be in the signature. But I would be okay with those being above in the old style.

davidweichiang commented 3 years ago

Edited: OK, we can put kh and kw as part of the function name. Let me think about other cases.
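
One possible rendering of kh and kw as part of the function name, as a hedged sketch. The axis names, the size annotations in brackets, and the assumption of non-overlapping windows (h divisible by kh, w divisible by kw) are illustrative choices, not the PR's final notation:

```latex
% Hypothetical sketch: kh and kw are subscripts on the function name,
% so the return type can mention them without a semicolon in the signature.
\[
\text{maxpool2d}_{kh,kw} \colon
  \mathbb{R}^{\mathsf{height}[h] \times \mathsf{width}[w]}
  \rightarrow
  \mathbb{R}^{\mathsf{height}[h/kh] \times \mathsf{width}[w/kw]}
\]
```

Written this way, each choice of kh and kw names a distinct function with its own fixed type.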

davidweichiang commented 3 years ago

Here's conv2d with the full signature:

boazbk commented 3 years ago

I wonder if it's possible to have stacked superscripts for the various axes. That is, put weight, height, etc. on top of each other. This way the function signature won't stretch so much in the horizontal direction.

davidweichiang commented 3 years ago

This version is pretty close to how I was writing functions previously. Examples in 3.5 not updated yet.

srush commented 3 years ago

I like this version a lot. It keeps the main part (above the where) clear and concise, but doesn't hide the important details (below the where).
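
As a rough illustration of that layout, here is a sketch with the signature and equation above the "where" and the parameter types below it. The two-layer feedforward example, the axis names (in, hidden, out), and the typesetting are assumptions for illustration, not the version in the PR:

```latex
% Hypothetical sketch of the "definition on top, details under where" layout.
% Requires amsmath for align* and \intertext.
\begin{align*}
\text{FFN} &\colon \mathbb{R}^{\mathsf{in}} \rightarrow \mathbb{R}^{\mathsf{out}} \\
\text{FFN}(X) &= W^{2} \underset{\mathsf{hidden}}{\odot}
  \operatorname{relu}\bigl( W^{1} \underset{\mathsf{in}}{\odot} X + b^{1} \bigr) + b^{2} \\
\intertext{where}
W^{1} &\in \mathbb{R}^{\mathsf{in} \times \mathsf{hidden}}, \quad
  b^{1} \in \mathbb{R}^{\mathsf{hidden}} \\
W^{2} &\in \mathbb{R}^{\mathsf{hidden} \times \mathsf{out}}, \quad
  b^{2} \in \mathbb{R}^{\mathsf{out}}
\end{align*}
```

The signature stays short because the weights are not arguments, yet their types remain visible directly under the definition.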

davidweichiang commented 3 years ago

Great. Can you quickly check the beam search example -- I think there was a bug before, and I also changed it to update the states as well as the scores, so it's easy to imagine I introduced new bugs.

srush commented 3 years ago

This looks like an improvement to me. Merging. Should we call this v0.2?

Couple comments:

I think we should go through a full 2 layer neural network first to introduce the ^L / captured parameter notation. Transformer is hard enough that having the new notation makes it a sharp transition.
Did you mean to have Affine still have the type in the function?
We can make Sudoku simpler. 3 of the sums can be over X and Y can now be row[3] x col[3] x subgrid[9] x assign[9]

davidweichiang commented 3 years ago

I agree, I want to make the feedforward section into a mini tutorial on some of the tricks we use when writing modules.
I'll need to think about whether Affine still wants to have an axis name under it.
OK, can you make the change to Sudoku?

boazbk commented 3 years ago

I'll be happy to make a pass as well once things stabilize (not so much on changing notation but on flow of writing)

(Sent by email in reply to David Chiang's comment of Wed, Dec 16, 2020, 2:55 PM.)

srush commented 3 years ago

sounds great. after that mini-tutorial, sudoku, and a @boazbk pass I'll tweet about it again to get some more feedback.

namedtensor / notation

Trying to write function definitions in a consistent way #27