huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Routing Transformers / Add Google PG-19 Models #11686

Open GenTxt opened 3 years ago

GenTxt commented 3 years ago

🌟 New model addition - Google PG-19 Models

Model description

Model checkpoints have finally been released for the models discussed in "Efficient Content-Based Sparse Attention with Routing Transformers" by Aurko Roy, Mohammad Saffar, Ashish Vaswani, and David Grangier (https://arxiv.org/abs/2003.05997).

Open source status

Note: These TF2 models require proper conversion to PyTorch, and the scripts need modifications to enable training and inference.

vblagoje commented 3 years ago

There is already an open-source PyTorch implementation (https://github.com/lucidrains/routing-transformer). Can't we adapt the Routing Transformer that @lucidrains wrote to HF?

GenTxt commented 3 years ago

I've checked that repo before and was hoping that the release of the models would make this possible.

The original models may be in TF1 rather than TF2 format, which would require a custom script to convert them to PyTorch.

Perhaps coders with advanced Python skills will take an interest in solving the above issues.
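As a starting point for the conversion script mentioned above, here is a minimal sketch of how a TensorFlow checkpoint could be read into a PyTorch-style state dict. It is not tested against the released PG-19 checkpoints: the renaming rules (`kernel`, `gamma`, `beta`) and the example variable names are assumptions based on common TF naming, and the helper names `tf_name_to_torch_name`, `should_transpose`, and `convert_checkpoint` are made up for illustration. Only `tf.train.list_variables`, `tf.train.load_variable`, and `torch.tensor` are real library calls.

```python
import re

# Hypothetical renaming rules. The actual variable names in the released
# checkpoints have not been inspected, so these would need adjusting.
_RENAMES = [
    (r"/kernel$", "/weight"),  # dense layer weights
    (r"/gamma$", "/weight"),   # layer norm scale
    (r"/beta$", "/bias"),      # layer norm offset
]


def tf_name_to_torch_name(tf_name: str) -> str:
    """Map a slash-separated TF variable name to a dotted PyTorch-style key."""
    name = tf_name
    for pattern, replacement in _RENAMES:
        name = re.sub(pattern, replacement, name)
    return name.replace("/", ".")


def should_transpose(tf_name: str, ndim: int) -> bool:
    """TF dense kernels are stored as (in, out); torch.nn.Linear uses (out, in)."""
    return tf_name.endswith("/kernel") and ndim == 2


def convert_checkpoint(ckpt_path: str) -> dict:
    """Read every variable in a TF checkpoint into a dict of torch tensors."""
    # Imported lazily so the name-mapping helpers work without TF or PyTorch.
    import tensorflow as tf
    import torch

    state_dict = {}
    for tf_name, shape in tf.train.list_variables(ckpt_path):
        array = tf.train.load_variable(ckpt_path, tf_name)
        if should_transpose(tf_name, len(shape)):
            array = array.T
        # torch.tensor copies the data, so transposed arrays and scalars are fine.
        state_dict[tf_name_to_torch_name(tf_name)] = torch.tensor(array)
    return state_dict
```

Printing the output of `tf.train.list_variables(ckpt_path)` first would show whether the variable names follow the TF1 name-based layout assumed here or the longer object-based TF2 layout. The resulting keys would still have to be matched by hand to the parameter names of whichever PyTorch Routing Transformer implementation is used.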

