huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Consider adding ALBERT? #1370

Closed · fengzuo97 closed this 4 years ago

fengzuo97 commented 5 years ago

🚀 Feature

Motivation

Additional context

gooofy commented 5 years ago

Would definitely love to see an implementation of ALBERT added to this repository. Just for completeness:

That said, it could be even more interesting to implement ALBERT's core improvements (factorized embedding parameterization, cross-layer parameter sharing) in some or all of the other transformers as optional features?
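For readers unfamiliar with those two ideas, here is a minimal PyTorch sketch. The class, dimensions, and use of `nn.TransformerEncoderLayer` are illustrative choices of mine, not ALBERT's actual implementation:

```python
import torch
import torch.nn as nn

class FactorizedSharedEncoder(nn.Module):
    """Toy encoder illustrating ALBERT's two core ideas:
    (1) factorized embedding parameterization: a V x E table plus an
        E x H projection instead of a single V x H table, with E << H;
    (2) cross-layer parameter sharing: one layer's weights reused for
        every pass through the stack.
    """
    def __init__(self, vocab_size=30000, embed_dim=128,
                 hidden_dim=768, num_layers=12, num_heads=12):
        super().__init__()
        # Factorization: small embedding matrix plus a projection.
        self.token_embed = nn.Embedding(vocab_size, embed_dim)
        self.embed_proj = nn.Linear(embed_dim, hidden_dim)
        # A single layer stands in for the whole stack.
        self.shared_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.num_layers = num_layers

    def forward(self, input_ids):
        hidden = self.embed_proj(self.token_embed(input_ids))
        for _ in range(self.num_layers):
            hidden = self.shared_layer(hidden)  # same weights every pass
        return hidden
```

The factorization shrinks the embedding parameters from V×H to V×E + E×H, and the sharing means the encoder stack costs one layer's worth of weights regardless of depth.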

BramVanroy commented 5 years ago

Knowing how fast the team works, I would expect ALBERT to be implemented quite soon. That being said, I haven't had time to read the ALBERT paper yet, so it might be more difficult than previous BERT iterations such as DistilBERT and RoBERTa.

ghost commented 5 years ago

I think ALBERT is very cool! Expect...

wassname commented 5 years ago

And in PyTorch (using code from this repo and weights from brightmart): https://github.com/lonePatient/albert_pytorch

sarim-zafar commented 5 years ago

Any update on the progress?

BramVanroy commented 5 years ago

The ALBERT paper will be presented at ICLR in April 2020. From what I last heard, the Hugging Face team has been talking with the people over at Google AI to share the details of the model, but I can imagine that the researchers would rather wait until the paper has been presented. One reason is that they want to get citations from their ICLR talk rather than from an arXiv preprint, which, in the field, is "worth less" than a major conference proceeding.

For now, just be patient. I am sure that the Hugging Face team will have a big announcement (follow their Twitter/LinkedIn channels) with a new version bump. No need to keep bumping this topic.

roccqqck commented 5 years ago

https://github.com/interviewBubble/Google-ALBERT

tholor commented 5 years ago

The official code and models got released 🙂 https://github.com/google-research/google-research/tree/master/albert

kamalkraj commented 5 years ago

[WIP] ALBERT in tensorflow 2.0 https://github.com/kamalkraj/ALBERT-TF2.0

lonePatient commented 5 years ago

https://github.com/lonePatient/albert_pytorch

| Dataset | Model          | Dev accuracy |
|---------|----------------|--------------|
| MNLI    | ALBERT_BASE_V2 | 0.8418       |
| SST-2   | ALBERT_BASE_V2 | 0.926        |

stefan-it commented 5 years ago

A PR has been created, see here:

https://github.com/huggingface/transformers/pull/1683
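Once that PR lands, loading the model should look roughly like this. This is sketched against the current `transformers` API; the `"albert-base-v2"` checkpoint name follows the official release's naming and is an assumption here:

```python
from transformers import AlbertModel, AlbertTokenizer

# Checkpoint name assumed from the official ALBERT release naming.
tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
model = AlbertModel.from_pretrained("albert-base-v2")

inputs = tokenizer("ALBERT shares parameters across layers.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```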

kamalkraj commented 5 years ago

[WIP] ALBERT in tensorflow 2.0 https://github.com/kamalkraj/ALBERT-TF2.0

Version 2 weights added. Support for SQuAD 1.1 and 2.0 added. Reproduces the results from the paper. From my experiments, the ALBERT model is very sensitive to hyperparameters like batch size. Fine-tuning uses AdamW by default, as in the original repo. AdamW performs better than LAMB for model fine-tuning.
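For context, a minimal sketch of an AdamW fine-tuning loop in PyTorch. The toy model, learning rate, weight decay, and batch size below are illustrative assumptions, not the settings from ALBERT-TF2.0:

```python
import torch
from torch import nn
from torch.optim import AdamW

# Toy stand-in for a pretrained encoder plus classification head;
# in practice this would be an ALBERT model.
model = nn.Sequential(nn.Linear(768, 768), nn.Tanh(), nn.Linear(768, 2))

# Small learning rate with explicit weight decay, as is typical for
# transformer fine-tuning. The batch size matters here because the
# thread reports ALBERT being sensitive to it.
optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # Random features/labels stand in for real encoded batches.
    features = torch.randn(32, 768)        # batch_size = 32
    labels = torch.randint(0, 2, (32,))
    logits = model(features)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```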

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.