My 2 cents: follow the "keep it simple" rule, like fastText. You get an all-in-one binary solution: a statically linked ELF file compiled from C++ sources (the TensorFlow core), with no framework and nothing else required. No Boost library, no TensorFlow or tensor2tensor Python dependencies. This is a win-win solution. I vote for that.
Maybe; I'm working on it: https://github.com/guotong1988/BERT-tensorflow
tensor2tensor is a great library, but it has more layers of abstraction than what we want for BERT. BERT is designed to be very lightweight and standalone, with minimal abstraction (the core library is 3 files: modeling.py, tokenization.py, and optimization.py, with no dependencies and simple APIs for using each).
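To see how small that surface area is, here is a minimal sketch of building the encoder from modeling.py alone. It assumes TensorFlow 1.x and that the three core files are importable; the batch size and sequence length below are placeholders, not recommendations.

```python
import tensorflow as tf
import modeling  # BERT's core model file; assumed to be on PYTHONPATH

# vocab_size is required; the remaining BertConfig defaults match BERT-Base.
config = modeling.BertConfig(vocab_size=30522)

input_ids = tf.placeholder(tf.int32, shape=[None, 128])
input_mask = tf.placeholder(tf.int32, shape=[None, 128])
segment_ids = tf.placeholder(tf.int32, shape=[None, 128])

model = modeling.BertModel(
    config=config,
    is_training=False,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids)

sequence_output = model.get_sequence_output()  # [batch, seq_len, hidden] per-token encodings
pooled_output = model.get_pooled_output()      # [batch, hidden] for classification heads
```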
@jacobdevlin-google I love this approach, very close to my initial idea!!! Thanks a lot!
Please, somebody correct me if I am wrong, but shouldn't BERT be possible with tensor2tensor, since BERT's core is just a Transformer model?
From the BERT README:
We then train a large model (12-layer to 24-layer Transformer) on a large corpus (Wikipedia + BookCorpus) for a long time (1M update steps), and that's BERT.
Basically: lots of data + lots of training + lots of transformer = BERT
Shouldn't it be enough to "just" create a dataset accordingly (see the sketch below) and start training on it?
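To make the "create a dataset accordingly" part concrete: the pre-training inputs are just sentences corrupted with the paper's masked-LM scheme (15% of tokens selected; of those, 80% become [MASK], 10% a random token, 10% unchanged). A toy sketch of that scheme, with illustrative names only (this is not the repo's create_pretraining_data.py):

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary for the random-replacement case

def mask_tokens(tokens, mask_prob=0.15):
    """Return (corrupted tokens, labels); label is None where no prediction is made."""
    out, labels = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            labels.append(tok)  # the model must recover the original token here
            r = random.random()
            if r < 0.8:
                out.append("[MASK]")              # 80%: replace with [MASK]
            elif r < 0.9:
                out.append(random.choice(VOCAB))  # 10%: replace with a random token
            else:
                out.append(tok)                   # 10%: keep the original token
        else:
            out.append(tok)
            labels.append(None)
    return out, labels

print(mask_tokens("the cat sat on the mat".split()))
```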
Could the official tensor2tensor repository (https://github.com/tensorflow/tensor2tensor) support BERT?