ofnote / tsalib

Tensor Shape Annotation Library (numpy, tensorflow, pytorch, ...)
Apache License 2.0

Is code in openai_transformer.py runnable? #18

Open wojtekcz opened 5 years ago

wojtekcz commented 5 years ago

Hi,

I'm trying to instantiate the OpenaiTransformer class from the openai_transformer.py code in a transformer toy-problem setting, to learn your tsalib a bit. I'm curious how useful tensor shape annotations would be in practice.

However, I've noticed some syntax-level errors in the code:

1. In the body of Conv1D.forward() there is a self.nx attribute referenced that is never defined.
2. Is `def forward(self, x: (B, T, self.nx)) -> torch.Tensor:` valid syntax? It causes a crash.
3. `Attention.forward(self, x: (B, T, self.nx)) -> torch.Tensor:` crashes with "NameError: name 'self' is not defined" (a minimal reproduction follows this list).
4. There is no TransformerConfig class in the source file.
5. In `OpenaiTransformer.forward(self, x: (B, T)) -> List[B, T, D]:` the return annotation `List[B, T, D]` crashes with "TypeError: Parameters to generic types must be types. Got Batch."
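For reference, Python evaluates annotations at function-definition time, when `self` is not yet bound, which explains the crash in (3). A minimal sketch (the dim-var declarations and the workaround are my own, not from the repo):

```python
from tsalib import dim_vars

B, T = dim_vars('Batch SeqLength')

class Attention:
    # def forward(self, x: (B, T, self.nx)) -> torch.Tensor: ...
    #   -> NameError: name 'self' is not defined, because the annotation
    #      is evaluated while the class body runs, before any instance exists

    # one hypothetical workaround: use a plain symbolic name instead
    def forward(self, x: (B, T, 'nx')):
        return x
```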

How would you go about correcting these problems?

Best, Wojtek

ekshaks commented 5 years ago

Thanks @wojtekcz for trying out tsalib. I've pushed a fix for the above syntax errors in d7d151e1ab27ae12004e2ef66e293d4d77eb1824 and included the TransformerConfig class -- not sure why I deleted it earlier.

I included openai_transformer.py mainly to illustrate how one could annotate full models and rewrite complex shape transformations as one-liners. I figured getting it to "run" would involve bringing in too many dependencies from allennlp, so I avoided that. If you can put together a more complete example, please send a pull request. The code can also be improved much more, e.g., split_heads can be a crisp two-liner (see the sketch below) and the Conv1D code can be cleaned up.
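For illustration, a rough sketch of what that two-liner could look like (the dim-var names, sizes, and the exact view/permute here are illustrative, not the repo's actual code):

```python
import torch
from tsalib import dim_vars

# named dimension variables with illustrative default sizes
B, T, D, H = dim_vars('Batch(b):32 SeqLength(t):128 ModelDim(d):768 NumHeads(h):12')

def split_heads(x: (B, T, D), n_heads: int = 12) -> (B, H, T, D // H):
    # (B, T, D) -> (B, T, H, D//H) -> (B, H, T, D//H)
    return x.view(x.size(0), x.size(1), n_heads, -1).permute(0, 2, 1, 3)

x = torch.randn(32, 128, 768)
assert split_heads(x).shape == (32, 12, 128, 64)
```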

Please also check out the BERT model (in TensorFlow), which is runnable and relies on very similar transformations to openai_transformer.

I suspect you may have more questions and comments as you dive deeper into tsalib -- would be happy to discuss.

wojtekcz commented 5 years ago

Thanks for correcting the syntax errors. I can confirm that it is possible to instantiate the OpenaiTransformer model after adding a few files from the allennlp/common directory, namely: checks.py, file_utils.py, from_params.py, params.py and tqdm.py.

But it turned out that I had mistaken openai_transformer.py for the transformer from the 'Attention Is All You Need' paper, so my toy-problem setup won't be of any help to you.

I can't use your annotated TensorFlow BERT model either, because I mainly work with PyTorch nowadays :(

For now I'm working with harvardnlp's Annotated Transformer implementation updated for PyTorch 1.0, and I'm still looking for a better implementation that includes a decoder.

I'm still keen on trying tsalib out.

Best, Wojtek

wojtekcz commented 5 years ago

BTW, I know about the huggingface/pytorch-pretrained-BERT implementation and use it at work. Have you thought about annotating this one?

ekshaks commented 5 years ago

The PyTorch BERT version is structurally very similar to the TF version, so I think most of the annotations/modifications in the TF BERT can be lifted directly to the PyTorch version. We can add it to the roadmap.
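To show the idea (hypothetical module and names, not the actual huggingface code), the same tuple annotations read identically on a PyTorch layer:

```python
import torch
from tsalib import dim_vars

B, T, D = dim_vars('Batch(b):8 SeqLength(t):64 HiddenDim(d):768')

class SelfOutput(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dense = torch.nn.Linear(768, 768)

    # the (B, T, D) annotations carry over unchanged from the TF version
    def forward(self, hidden: (B, T, D), residual: (B, T, D)) -> (B, T, D):
        return self.dense(hidden) + residual

out: (B, T, D) = SelfOutput()(torch.randn(8, 64, 768), torch.randn(8, 64, 768))
```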

Do you have any thoughts on how you want to use the annotated versions?

wojtekcz commented 5 years ago

Yes, I have ;-)

My current use case for the Transformer is a multi-modal sequence-to-sequence problem. It is a fairly elaborate setup for mapping language to human motion, with Gaussian Mixture Model sampling. I'm hoping tensor annotations may save me lots of headaches.

I've already spent too much time modifying harvardnlp's Annotated Transformer to support multi-modal use.

If an annotated PyTorch BERT becomes available, I'd try to use that implementation and add the decoder it lacks.

ekshaks commented 5 years ago

Thanks @wojtekcz . I'm working on this item -- will get back.

ekshaks commented 5 years ago

Hey @wojtekcz, sorry for the long wait.

I've added the annotated version of the BERT model to the repository here, with a '_tsa' suffix on the original file name to indicate the annotated version. Hope this helps. I'm going to annotate the other files in the repository too.

These annotations are generated "mostly" automatically from the tests by a code tracing and transformation tool. The tool is a WIP, so the annotations may be incomplete in some places. I plan to release the annotation tool in the near future as well.