vilmarzti / long_context_transformers

This is the repository for my Master Thesis where I analyse transformer architectures for long contexts
GNU General Public License v3.0

Routing Transformer weights available if useful #11

Open GenTxt opened 2 years ago

GenTxt commented 2 years ago

I posted a new model request last May which provides links to Google's Routing Transformer models trained on the PG-19 corpus.

Perhaps these can be converted to run with Hugging Face Transformers?

https://github.com/google-research/google-research/tree/master/routing_transformer
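If a conversion is attempted, the usual first step is mapping the TensorFlow checkpoint's variable paths to PyTorch `state_dict` keys. A minimal sketch of that renaming step is below; the variable naming scheme shown (`layer_0/attention/query/kernel`) is a hypothetical example, and the real Routing Transformer checkpoints would need to be inspected (e.g. with `tf.train.list_variables`) to learn the actual names.

```python
import re

def tf_to_hf_name(tf_name: str) -> str:
    """Map a TensorFlow variable path to a PyTorch-style state_dict key.

    The path layout here is illustrative, not the actual Routing
    Transformer checkpoint layout.
    """
    parts = []
    for part in tf_name.split("/"):
        match = re.fullmatch(r"layer_(\d+)", part)
        if match:
            # "layer_3" -> "layers.3" (indexing into an nn.ModuleList)
            parts.append(f"layers.{match.group(1)}")
        elif part == "kernel":
            # TF dense kernels correspond to PyTorch Linear weights
            parts.append("weight")
        else:
            parts.append(part)
    return ".".join(parts)

print(tf_to_hf_name("transformer/layer_0/attention/query/kernel"))
# transformer.layers.0.attention.query.weight
```

Note that renaming alone is not enough: TensorFlow stores dense kernels as `(in, out)` arrays while PyTorch's `nn.Linear` expects `(out, in)`, so kernel tensors must also be transposed during conversion.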

https://github.com/huggingface/transformers/issues/11686

New model addition - Google PG-19 Models

Model description

Model checkpoints finally released as discussed in "Efficient Content-Based Sparse Attention with Routing Transformers", Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier (https://arxiv.org/abs/2003.05997)

Open source status

[x] the model implementation is available: (same link as below)
[x] the model weights are available: (https://github.com/google-research/google-research/tree/master/routing_transformer)
[x] who are the authors: (see above)

Good luck with your project.

vilmarzti commented 2 years ago

Hi, thanks, this is useful information :) I have some other things to fix first, but this will definitely go on the list.