Closed davidweichiang closed 3 years ago
This scheme presents a function as a chronological sequence of type assertions and equations. I don't like that there's nothing to introduce the definition, and the placement of the return type at the end is also a little weird looking.
@srush @boazbk
Unfortunately I don't like the type at the end, particularly if it is only on some functions. I also find the "U =..." before the function definition to be strange and a bit incorrect? This is hard.
I know Boaz wasn't a fan, but I still think the function type signature is the best way to go. It seems concise and correct.
Perhaps if you indent the local variable "U=" under the function it will look less strange. Or maybe have a mandatory "where "?
I think the ";" was misguided on my part. We could write those as just functions.
I am worried the complexity of MultiAttention is causing problems. I feel like that case is rare (4 different weights) and it is okay for that function to be a special case. Or we could just cheat and write it as one tensor W^{mode[4], hidden, hidden'} and do renames of the outputs.
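To make the "one tensor" idea concrete, here is a minimal sketch in plain Python. It assumes the four weights share the same `hidden` and `hidden'` sizes; the mode labels and the helper `apply_mode` are hypothetical names for illustration, not part of the notation under discussion.

```python
# Hypothetical labels for the four entries of the mode[4] axis.
MODES = ("query", "key", "value", "output")


def apply_mode(W, mode, x):
    """Slice W[mode][hidden][hidden'] at one mode, then contract over hidden.

    W is a nested list with axes (mode, hidden, hidden'); x has length hidden.
    Returns a list of length hidden'.
    """
    W_m = W[MODES.index(mode)]  # axes: hidden x hidden'
    return [
        sum(x[i] * W_m[i][j] for i in range(len(x)))
        for j in range(len(W_m[0]))
    ]
```

The caller would then rename the `hidden'` axis of each result to whatever the attention function expects, which is the "renames of the outputs" step.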
OK, I can restore the ordering so that the function is back on top; maybe adding "where" will make the ordering easier to follow.
I think the semicolon is good, but am not sure if there's an accepted way of writing a function signature with semicolons. The case of maxpool2d is particularly tricky, because the return type depends on parameters kh and kw.
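A small sketch of why maxpool2d is tricky: the output axis sizes are computed from kh and kw, so the return type cannot be stated without them. This assumes non-overlapping pooling (stride equal to the kernel size) and kernel sizes that divide the input sizes evenly; the function names and the dict-of-axes representation are illustrative only.

```python
def maxpool2d_shape(shape, kh, kw):
    """Return the named output shape of a non-overlapping max pool.

    `shape` maps axis names to sizes, e.g. {"height": 6, "width": 4}.
    Assumes stride == kernel size and that kh, kw divide the input sizes.
    """
    h, w = shape["height"], shape["width"]
    if h % kh or w % kw:
        raise ValueError("kernel size must divide the input size")
    return {"height": h // kh, "width": w // kw}


def maxpool2d(x, kh, kw):
    """Max-pool a list-of-rows grid `x` over kh-by-kw blocks."""
    out = maxpool2d_shape({"height": len(x), "width": len(x[0])}, kh, kw)
    return [
        [
            max(
                x[i * kh + di][j * kw + dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out["width"])
        ]
        for i in range(out["height"])
    ]
```

Under these assumptions the signature would read something like height[h] x width[w] to height[h/kh] x width[w/kw], which is why kh and kw have to appear somewhere in the function's name or type.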
It might be better to use a subscript for parameters such as kh and kw that are not really learned parameters but more like hardwired constants.
I agree with all of the above and subscripts for small params. For weights I don't see why they wouldn't be in the signature. But I would be okay with those being above in the old style.
Edited: OK, we can put kh and kw as part of the function name. Let me think about other cases.
Here's conv2d with the full signature:
I wonder if it's possible to have stacked superscripts for the various axes. That is, put weight, height, etc. on top of each other. This way the function signature won't stretch so much in the horizontal direction.
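One possible way to typeset stacked axes, as a rough sketch: `\substack` from amsmath works inside a superscript. The axis names and the `\mathsf` styling here are placeholders, not the macros the document actually uses.

```latex
% Requires \usepackage{amsmath,amssymb}. Axis names are illustrative only.
\[
  X \in \mathbb{R}^{\substack{\mathsf{height}[h] \\ \mathsf{width}[w] \\ \mathsf{chans}[c]}}
\]
```

This trades horizontal space for vertical space, so it may look cramped inline but could work in displayed signatures.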
This version is pretty close to how I was writing functions previously. Examples in 3.5 not updated yet.
I like this version a lot. It keeps the main part (above the where) clear and concise, but doesn't hide the important details (below the where).
Great. Can you quickly check the beam search example -- I think there was a bug before, and I also changed it to update the states as well as the scores, so it's easy to imagine I introduced new bugs.
This looks like an improvement to me. Merging. Should we call this v0.2?
Couple comments:
row[3] x col[3] x subgrid[9] x assign[9]
I'll be happy to make a pass as well once things stabilize (not so much on changing notation but on flow of writing)
On Wed, Dec 16, 2020 at 2:55 PM David Chiang notifications@github.com wrote:
- I agree, I want to make the feedforward section into a mini tutorial on some of the tricks we use when writing modules.
- I'll need to think about whether Affine still wants to have an axis name under it.
- OK, can you make the change to Sudoku?
Sounds great. After that mini-tutorial, Sudoku, and a @boazbk pass, I'll tweet about it again to get some more feedback.
The usual way using \rightarrow and \mapsto is not ideal because: