GenTxt commented 3 years ago

🌟 New model addition - Google PG-19 Models

Model description

Model checkpoints have finally been released for the models described in "Efficient Content-Based Sparse Attention with Routing Transformers" by Aurko Roy, Mohammad Saffar, Ashish Vaswani, and David Grangier (https://arxiv.org/abs/2003.05997).

Open source status

[x] the model implementation is available: (same link as below)
[x] the model weights are available: (https://github.com/google-research/google-research/tree/master/routing_transformer)
[x] who are the authors: (see above)

Note: These TF2 models need to be converted to PyTorch, and the scripts need modifications to enable training and inference.

vblagoje commented 3 years ago

There is already an open-source PyTorch implementation: https://github.com/lucidrains/routing-transformer. Can't we adapt the Routing Transformer that @lucidrains wrote to HF?

GenTxt commented 3 years ago

I've checked that repo before and was hoping that, with the release of the models, this would be possible.

The original models may be in TF1 rather than TF2 format, which would require a custom script to convert them to PyTorch.
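For anyone picking this up, here is a rough sketch of what such a conversion script might look like. It is not based on the actual Routing Transformer checkpoints: the variable naming scheme, the optimizer-slot filter, and the helper names (`tf_name_to_torch_key`, `needs_transpose`, `convert_checkpoint`) are all assumptions for illustration. The only library calls used are `tf.train.list_variables`, `tf.train.load_variable`, and `torch.as_tensor`. The real mapping would have to be worked out by comparing the checkpoint's variable names against the target PyTorch model's `state_dict()` keys.

```python
import re


def tf_name_to_torch_key(tf_name: str) -> str:
    """Map a TF variable name to a PyTorch-style key (hypothetical naming scheme)."""
    key = tf_name.replace("/", ".")
    # Assumed convention: "layer_3" in TF becomes "layer.3" in PyTorch.
    key = re.sub(r"_(\d+)(?=\.|$)", r".\1", key)
    # TF dense layers call their matrix "kernel"; PyTorch calls it "weight".
    key = re.sub(r"\.kernel$", ".weight", key)
    # Layer-norm scale/offset are commonly "gamma"/"beta" in TF checkpoints.
    key = re.sub(r"\.gamma$", ".weight", key)
    key = re.sub(r"\.beta$", ".bias", key)
    return key


def needs_transpose(tf_name: str) -> bool:
    """TF dense kernels are [in, out]; torch.nn.Linear weights are [out, in]."""
    return tf_name.endswith("/kernel")


def convert_checkpoint(ckpt_path: str) -> dict:
    """Read every variable from a TF checkpoint into a PyTorch-style dict."""
    # Imported lazily so the name-mapping helpers work without TF or PyTorch.
    import tensorflow as tf
    import torch

    state_dict = {}
    for name, _shape in tf.train.list_variables(ckpt_path):
        # Assumption: optimizer slots and the step counter are not needed.
        if name == "global_step" or "Adam" in name or "Adafactor" in name:
            continue
        tensor = torch.as_tensor(tf.train.load_variable(ckpt_path, name))
        if needs_transpose(name) and tensor.dim() == 2:
            tensor = tensor.t().contiguous()
        state_dict[tf_name_to_torch_key(name)] = tensor
    return state_dict
```

Calling `convert_checkpoint` on a checkpoint prefix and then printing the resulting keys next to the PyTorch model's own keys would be a reasonable first step to see how far the naming assumptions are off.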

Perhaps coders with advanced python skills will show interest in solving the above issues.


huggingface / transformers

Routing Transformers / Add Google PG-19 Models #11686
