Separius / BERT-keras

Keras implementation of BERT with pre-trained weights
GNU General Public License v3.0

Build classifier on top of BERT #20

Closed iliaschalkidis closed 5 years ago

iliaschalkidis commented 5 years ago

Is there any way to train a BERT-based classifier using the [CLS] vector, as described in the BERT paper?

I was able to load the BERT encoder successfully using:

bert_encoder = load_google_bert(base_location='./google_bert/uncased_L-12_H-768_A-12/',
                                use_attn_mask=False, max_len=512, verbose=False)

However, I am not able to find a way to wrap the encoder in a Keras Model for classification.

I was hoping for something like this:

bert_encoder = load_google_bert(base_location='./google_bert/uncased_L-12_H-768_A-12/',
                                use_attn_mask=False, max_len=512, verbose=False)
# take the hidden state at the first position ([CLS]) and put a softmax head on it
cls_vector = Lambda(lambda x: x[:, 0, :])(bert_encoder.outputs[0])
outputs = Dense(n_classes, activation='softmax')(cls_vector)

classifier = Model(inputs=bert_encoder.inputs, outputs=outputs)
classifier.compile(optimizer='adam', loss='categorical_crossentropy')
classifier.fit(...)

If there is such a solution, it would be the easiest way possible to use BERT!
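
For concreteness, I imagine feeding the wrapped model would look roughly like this (untested; I am only guessing that the encoder exposes token, segment, and position id inputs, in that order, when use_attn_mask=False):

import numpy as np

# toy batch with made-up sizes, only to illustrate the expected multi-input format;
# real token ids would come from the BERT vocabulary / tokenizer
n_samples, max_len, n_classes = 8, 512, 5
token_ids = np.random.randint(0, 30000, size=(n_samples, max_len))
segment_ids = np.zeros((n_samples, max_len), dtype=np.int32)
pos_ids = np.tile(np.arange(max_len), (n_samples, 1))
labels = np.eye(n_classes)[np.random.randint(0, n_classes, size=n_samples)]

# assuming the encoder's inputs are [tokens, segments, positions]
classifier.fit([token_ids, segment_ids, pos_ids], labels, batch_size=4, epochs=2)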

Separius commented 5 years ago

Hi @iliaschalkidis, you can use train_model (from transformer.train import train_model), define a non-token-level Task, and train the model on your sentence-level task (which uses [CLS] internally). You can take a look at tutorial.ipynb and read this part of the code.
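
Roughly, from memory (double-check the exact class and argument names against tutorial.ipynb and data/dataset.py; this is only a sketch):

from data.dataset import TaskMetadata, TaskWeightScheduler
from transformer.train import train_model

n_classes = 5  # example value for your sentence-level task

# a single sentence-level (non token-level) classification task, active only during fine-tuning
tasks = [TaskMetadata('classification', is_token_level=False,
                      num_classes=n_classes, dropout=0.1,
                      weight_scheduler=TaskWeightScheduler(active_in_pretrain=False,
                                                           active_in_finetune=True,
                                                           finetune_value=1.0))]

# the tasks list (together with your data generators) then goes to train_model,
# as shown in tutorial.ipynb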

iliaschalkidis commented 5 years ago

Hi @Separius, this looks really promising. However, I want to build something more complicated (custom), so I am going to read your train_model code and start prototyping my model based on those principles.

I wish Google had already ported BERT to TensorFlow Hub and made it as easy to use as the ELMo TensorFlow Hub module.

Thanks a lot for your quick response! Have a great week!