google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Could tensor2tensor support bert? #3

Closed daiwk closed 5 years ago

daiwk commented 5 years ago

Could the official tensor2tensor repository (https://github.com/tensorflow/tensor2tensor) support BERT?

loretoparisi commented 5 years ago

My 2 cents: follow the "keep it simple" rule, like fastText. You get an all-in-one binary solution, no framework: an ELF file compiled from C++ sources (the TensorFlow core), statically linked, nothing else. No Boost library, no TensorFlow, no tensor2tensor Python dependencies. This is a win-win solution. I vote for that.

guotong1988 commented 5 years ago

Maybe. I'm working on it: https://github.com/guotong1988/BERT-tensorflow

jacobdevlin-google commented 5 years ago

tensor2tensor is a great library but has more layers of abstraction than what we want for BERT. BERT is designed to be very lightweight and standalone with minimal abstraction (the core library is 3 files: modeling.py, tokenization.py, and optimization.py, with no dependencies and simple APIs for using each).
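For a sense of how minimal that API is, here is a rough sketch of driving modeling.py directly (TF 1.x graph mode). The hyperparameter values and the small classification head are illustrative only, not a recommended configuration:

```python
import tensorflow as tf
import modeling  # the modeling.py from this repo, assumed to be on the path

# Toy inputs that have already been converted to WordPiece ids.
input_ids = tf.constant([[31, 51, 99], [15, 5, 0]])
input_mask = tf.constant([[1, 1, 1], [1, 1, 0]])
token_type_ids = tf.constant([[0, 0, 1], [0, 1, 0]])

# Illustrative (small) configuration; real BERT-Base/Large values differ.
config = modeling.BertConfig(
    vocab_size=32000,
    hidden_size=512,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=1024)

model = modeling.BertModel(
    config=config,
    is_training=True,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=token_type_ids)

# [batch_size, hidden_size] summary of each sequence, usable for classification.
pooled_output = model.get_pooled_output()

# Hypothetical 3-way classification head on top of the pooled output.
label_embeddings = tf.get_variable(
    "label_embeddings",
    shape=[512, 3],
    initializer=tf.truncated_normal_initializer(stddev=0.02))
logits = tf.matmul(pooled_output, label_embeddings)
```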

loretoparisi commented 5 years ago

@jacobdevlin-google I love this approach, very close to my initial idea!!! Thanks a lot!

stefan-falk commented 5 years ago

Please, somebody correct me if I am wrong, but shouldn't it be possible to implement BERT with tensor2tensor, since BERT's core is just a Transformer model?

From the BERT README:

We then train a large model (12-layer to 24-layer Transformer) on a large corpus (Wikipedia + BookCorpus) for a long time (1M update steps), and that's BERT.

Basically: lots of data + lots of training + lots of transformer = BERT

Shouldn't it be enough to "just" create a dataset accordingly and start training on it?
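Something along these lines, as a very rough sketch of registering such a pretraining dataset with tensor2tensor's Problem API. The class name, the masking helper, and the in-line corpus are hypothetical, and this text2text formulation only approximates BERT's masked-LM objective (it predicts the whole sentence rather than just the masked positions, and omits next-sentence prediction):

```python
# Hypothetical sketch: masked sentences as inputs, original sentences as targets.
import random

from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


def mask_tokens(tokens, mask_token="[MASK]", prob=0.15):
  """Replace roughly `prob` of the tokens with a mask symbol."""
  return [mask_token if random.random() < prob else t for t in tokens]


@registry.register_problem
class BertPretrainProblem(text_problems.Text2TextProblem):  # name is made up
  """Masked-sentence -> original-sentence pretraining data."""

  @property
  def approx_vocab_size(self):
    return 2 ** 15  # ~32k subwords

  @property
  def is_generate_per_split(self):
    return False  # let t2t shard train/dev automatically

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    # In practice this would iterate over Wikipedia + BookCorpus text.
    corpus = ["the quick brown fox jumps over the lazy dog"]
    for line in corpus:
      tokens = line.split()
      yield {
          "inputs": " ".join(mask_tokens(tokens)),
          "targets": line,
      }
```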