thunlp / PLMpapers

Must-read Papers on pre-trained language models.
MIT License
3.33k stars 436 forks

Added ALBERT Model #14

Open artisanbaggio opened 4 years ago

artisanbaggio commented 4 years ago

Added the ALBERT model to the ppt and jpeg files.

DonaldTsang commented 4 years ago

Are there any other models you want to make note of? Also, is it possible to add a description of each model, or a comparison between them?

artisanbaggio commented 4 years ago

At the moment, many of the entries on the SQuAD leaderboard are ALBERT models, so there is no other model I would like to add from there. On the GLUE benchmark, T5 ("Text-to-Text Transfer Transformer") has been ranked first for a while, and I am interested in it. I'd like to watch its progress a little longer before deciding whether it is worth mentioning.

DonaldTsang commented 4 years ago

What about Reformer from https://www.youtube.com/watch?v=rNG_hpSyZcE or https://ai.googleblog.com/2020/01/reformer-efficient-transformer.html ?

artisanbaggio commented 4 years ago

Thanks for the suggestion.

I understand that Reformer is a model that trains with less memory. I think it's a good idea to group words whose vectors are close into chunks with LSH (locality-sensitive hashing) attention.
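As a rough illustration of the chunking idea, here is a minimal Python sketch of the angular LSH scheme described in the Reformer paper, where a vector x is hashed to argmax([xR; -xR]) for a random matrix R. The function names `lsh_buckets` and `chunk_by_bucket` are hypothetical, and this is a simplification rather than Reformer's actual implementation: the real model also sorts positions by bucket, splits them into fixed-size chunks, and repeats the hashing over several rounds.

```python
import random


def lsh_buckets(vectors, n_buckets, seed=0):
    """Assign each vector a bucket id using random projections (angular LSH).

    Hypothetical helper for illustration only.
    """
    if n_buckets % 2 != 0:
        raise ValueError("n_buckets must be even")
    rng = random.Random(seed)
    dim = len(vectors[0])
    half = n_buckets // 2
    # Random projection matrix R with shape (dim, n_buckets / 2).
    proj = [[rng.gauss(0.0, 1.0) for _ in range(half)] for _ in range(dim)]
    buckets = []
    for vec in vectors:
        rotated = [sum(vec[i] * proj[i][j] for i in range(dim)) for j in range(half)]
        # Concatenate [xR, -xR]; the index of the largest entry is the bucket id.
        scores = rotated + [-r for r in rotated]
        buckets.append(max(range(n_buckets), key=lambda k: scores[k]))
    return buckets


def chunk_by_bucket(buckets):
    """Group token positions by bucket id.

    Attention would then be computed only among positions in the same group.
    """
    groups = {}
    for pos, bucket in enumerate(buckets):
        groups.setdefault(bucket, []).append(pos)
    return groups


if __name__ == "__main__":
    # Toy vectors: the first two point in the same direction, the third is opposite.
    vecs = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
    print(chunk_by_bucket(lsh_buckets(vecs, n_buckets=8)))
```

Vectors pointing in the same direction always land in the same bucket, and vectors pointing in nearby directions usually do, so each position only needs to attend within its own group instead of over the whole sequence. That is where the memory saving comes from.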

Because Reformer can train with less memory than the Transformer, it may become a foundational technique for future models. I don't know its exact accuracy, so I'll try it on my native Japanese first.

It is important that accuracy is high for languages written without spaces, such as Chinese and Japanese. (This should not be a problem, because Reformer's tokenizer is SentencePiece.)

DonaldTsang commented 4 years ago

So what about the other entries on https://gluebenchmark.com/leaderboard and https://rajpurkar.github.io/SQuAD-explorer/ ? Are they just variations of models already included in this repo? Also, what about the other benchmarks in https://www.tensorflow.org/datasets/catalog/overview (the "Text" section)?

artisanbaggio commented 4 years ago

No. Reformer has not yet entered the SQuAD or GLUE rankings. It is described as "The Efficient Transformer", so I think it will either replace the Transformer or be used alongside it.

DonaldTsang commented 4 years ago

@artisanbaggio I was referring to other entries in the benchmark, not Reformer.

artisanbaggio commented 4 years ago

OK. As I said in my first answer, the model I am interested in on SQuAD and GLUE is T5.