tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0
15.56k stars 3.51k forks

Transformer-XL architecture support in tensor2tensor #1604

Open ashu5644 opened 5 years ago

ashu5644 commented 5 years ago

Is there any model in tensor2tensor that is based on or supports the Transformer-XL architecture? The architecture is described in the paper "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context": https://arxiv.org/abs/1901.02860

Mr-wang2016 commented 5 years ago

+1

mark-radai commented 5 years ago

https://github.com/tensorflow/tensor2tensor/commit/e5e7d4babf9c57d943a12f10124439fc50d5e2d5 - "Transformer with memory in the style of Transformer-XL"

mark-radai commented 5 years ago

afaik it doesn't support GPU out of the box right now, but that's nothing that can't be hacked around :P

ashu5644 commented 5 years ago

@mark-radai, thanks for the answer!

aleksas commented 5 years ago

+1

vagarwal87 commented 5 years ago

+1. Though it seems it may already be implemented, it's not easy to decipher precisely how to use it without a clear example or documentation.

ashu5644 commented 4 years ago

Is there support for a Transformer-XL-style data pipeline in the current tensor2tensor version? The data pipeline for Transformer-XL is quite different from the normal one.
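
For context, here is a minimal sketch of why the pipeline differs. This is plain Python and not tensor2tensor's API; the names `batchify` and `segment_batches` are hypothetical. The assumption, following the Transformer-XL paper, is that row `i` of each batch must continue exactly where row `i` of the previous batch stopped, so that the hidden states cached as memory line up with the text that follows. A normal pipeline that shuffles independent examples breaks that ordering.

```python
def batchify(token_ids, batch_size):
    """Split one long token stream into `batch_size` contiguous streams.

    Tokens that do not fit evenly are dropped from the end.
    """
    stream_len = len(token_ids) // batch_size
    return [token_ids[i * stream_len:(i + 1) * stream_len]
            for i in range(batch_size)]


def segment_batches(token_ids, batch_size, seg_len):
    """Yield (inputs, targets) batches in order, without shuffling.

    Row i of every batch continues row i of the previous batch, which is
    what a segment-level memory needs. Targets are the inputs shifted by
    one token (next-token prediction).
    """
    streams = batchify(token_ids, batch_size)
    stream_len = len(streams[0])
    # Stop one token early so every input position has a target.
    for start in range(0, stream_len - 1, seg_len):
        end = min(start + seg_len, stream_len - 1)
        inputs = [s[start:end] for s in streams]
        targets = [s[start + 1:end + 1] for s in streams]
        yield inputs, targets


if __name__ == "__main__":
    for inputs, targets in segment_batches(list(range(20)), 2, 4):
        print(inputs, targets)
```

The batches have to be consumed in the order they are produced, and the model's memory has to be carried over (or reset) accordingly. Whether the current tensor2tensor input pipeline can guarantee this ordering is exactly the open question here.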