Generating Adoption Ideas : Writing Docs

srush commented 3 years ago

Once we converge on some of the last details, I think we may need to think about an adoption strategy. It seems unlikely to me that people will naturally pick up this notation, so we should do something to encourage usage. Some ideas:

Idea 1: Rewrite the Keras / PyTorch Docs using our notation.

I think a lot of people learn about NNs using library docs. However, these are extremely confusing and ambiguous, and we could fix them right away.

LayerNorm versus BatchNorm: https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html and https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html
Conv2d: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
Attention: https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html
RNN: https://pytorch.org/docs/stable/generated/torch.nn.RNN.html

I'm not sure of the best strategy here, but I know the PyTorch folks. Maybe we send them a PR? Alternatively, we could host our own version of the docs.

Idea 2: Blog post that walks through Transformer with the new notation.
Idea 3: Some sort of beta call to action among people we know, i.e., at version 1, send an email asking people to give it a try.

davidweichiang commented 3 years ago

These all sound like good ideas to me. Idea 1 seems like a lot of work, but speaking of the PyTorch docs, I wonder if it would be good for adoption to try to make our notation look a little more like theirs, e.g., with indexing as A(foo_1) instead of A_{foo(1)}.

srush commented 3 years ago

Yeah, it would be a lot of work. Maybe we could crowdsource some of it. It's kind of worrisome that this is how people are learning ML.

Their notation is really inconsistent. They use A(foo_1) for Conv2d, but RNN uses matrix notation:

https://pytorch.org/docs/stable/generated/torch.nn.RNN.html

LogSoftmax is just wrong:

https://pytorch.org/docs/stable/generated/torch.nn.LogSoftmax.html#torch.nn.LogSoftmax

srush commented 3 years ago

This one uses square brackets :( And it also has a magical index that works on arbitrary shapes:

https://pytorch.org/docs/stable/generated/torch.nn.SoftMarginLoss.html#torch.nn.SoftMarginLoss

davidweichiang commented 3 years ago

Although namedtensor.sty is really minimal, we could submit it to CTAN. I am not sure what TeX Live's vetting process is, but if they pick up namedtensor.sty, then most people would have it installed already.

srush commented 3 years ago

I tried converting a couple, but working with Sphinx was pretty annoying. We first need to figure out how to get our macros ported over.

Given that this will be some work, I emailed the PyTorch PMs to see if they would be interested first.
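
For reference, one possible way to port LaTeX macros into a Sphinx build is through the MathJax configuration in `conf.py`. The fragment below is only a sketch under stated assumptions: it assumes Sphinx 4 or later rendering math with MathJax 3 (the PyTorch docs build may well use a different renderer), and the macro names and bodies (`name`, `nidx`) are hypothetical placeholders, not the actual definitions from namedtensor.sty.

```python
# conf.py fragment (sketch). Assumes Sphinx >= 4 with MathJax 3 as the math renderer.
extensions = ["sphinx.ext.mathjax"]

# Hypothetical macros for illustration only; the real definitions would
# have to be copied over by hand from namedtensor.sty.
mathjax3_config = {
    "tex": {
        "macros": {
            # \name{foo}: typeset an axis name (takes 1 argument)
            "name": [r"\mathsf{#1}", 1],
            # \nidx{foo}{1}: axis name followed by an index (takes 2 arguments)
            "nidx": [r"\mathsf{#1}(#2)", 2],
        }
    }
}
```

With something like this in place, a docstring could write `:math:` expressions using the ported macros, but every macro from the style file would need its own entry, which is part of why the conversion is tedious.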

srush commented 3 years ago

Python PEP for TSP / named-tensor-like syntax:

from typing import Generic, TypeVar

Axis1 = TypeVar('Axis1')
Axis2 = TypeVar('Axis2')

class Array1(Generic[Axis1]): ...

class Array2(Generic[Axis1, Axis2]): ...

https://github.com/mrahtz/peps/blob/11511ad43dcde142f2c0b8987feca59aa9bd735f/pep-0646.rst#syntax-proposal
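
To make the connection to named axes concrete, here is a minimal sketch of how such generic array types might be used. It relies only on the standard `typing` module; the axis label classes `Batch` and `Hidden`, the list-of-lists storage, and the `transpose` helper are all hypothetical, introduced purely for illustration and not taken from the PEP draft.

```python
from typing import Generic, TypeVar

Axis1 = TypeVar("Axis1")
Axis2 = TypeVar("Axis2")


class Array2(Generic[Axis1, Axis2]):
    """Toy two-axis array; the type parameters act as axis names."""

    def __init__(self, rows):
        self.rows = rows  # nested lists, outer list indexed by the first axis


# Hypothetical axis labels (empty marker classes).
class Batch: ...
class Hidden: ...


def transpose(x: Array2[Axis1, Axis2]) -> Array2[Axis2, Axis1]:
    # The signature records that the two axes swap places.
    return Array2([list(col) for col in zip(*x.rows)])


x: Array2[Batch, Hidden] = Array2([[1, 2, 3], [4, 5, 6]])
y = transpose(x)  # a static type checker can infer Array2[Hidden, Batch]
```

The annotations are not enforced at runtime; the point is that a static checker could flag code that passes an `Array2[Hidden, Batch]` where an `Array2[Batch, Hidden]` is expected, which is close in spirit to naming axes in the notation.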

davidweichiang commented 3 years ago

Also https://www.python.org/dev/peps/pep-0637/

namedtensor / notation

Generating Adoption Ideas : Writing Docs #24