This is my Trax implementation of GPT-2 (a decoder-only Transformer) for a Natural Language Generation task: abstractive summarization. A minimal model-construction sketch appears below.
Paper: Language Models are Unsupervised Multitask Learners.
Library: Trax, a deep learning library built on JAX, actively used and maintained by the Google Brain team.
Dataset: https://www.kaggle.com/shashichander009/inshorts-news-data
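A GPT-2-style model can be instantiated with Trax's built-in decoder-only Transformer, `trax.models.TransformerLM`. The hyperparameter values in this sketch are illustrative (roughly GPT-2 small scale) and are assumptions, not necessarily the values used in this repo:

```python
import trax

# GPT-2-style decoder-only language model built with Trax's TransformerLM.
# All hyperparameter values below are illustrative (approximating GPT-2
# small); the actual configuration used in this project may differ.
model = trax.models.TransformerLM(
    vocab_size=50257,  # assumption: GPT-2's BPE vocabulary size
    d_model=768,       # embedding / hidden dimension
    d_ff=3072,         # width of the feed-forward sublayer
    n_layers=12,       # number of decoder blocks
    n_heads=12,        # attention heads per block
    max_len=1024,      # maximum sequence length
    mode='train',      # 'train', 'eval', or 'predict' (for decoding)
)
```

Since the model is decoder-only, abstractive summarization is typically framed as language modeling: each article and its summary are concatenated into a single sequence (article, separator token, summary), and the model is trained to predict the next token over that stream.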