zhihu / cuBERT

Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
MIT License
528 stars 83 forks

Does cuBERT support sequence tagging output? #7

Open wolfshow opened 5 years ago

wolfshow commented 5 years ago

I am working on a sequence tagging task, where the logits output should be [batch_size * sequence_length]. Does cuBERT support that?

levyfan commented 5 years ago

Currently, we only support the standard BERT graph and standard BERT outputs. In your case, I guess the most closely related output type is model.get_sequence_output(), with size [batch_size, sequence_length, hidden_size].

You can fork our repository and make some modifications if you want [batch_size * sequence_length] (I guess you put some additional layers after standard BERT). Alternatively, you can take our model.get_sequence_output() and do the remaining computation in a separate TensorFlow graph.
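As a rough illustration of the second option, here is a minimal sketch of a per-token tagging head applied to a sequence output of shape [batch_size, sequence_length, hidden_size]. It is not cuBERT or TensorFlow code: the names tag_logits and predict_tags, the weights, and the number of labels are all hypothetical, and plain Python lists stand in for tensors so that only the shapes and the arithmetic are shown.

```python
def tag_logits(sequence_output, weights, bias):
    """Apply the same linear layer to every token.

    sequence_output: [batch_size][sequence_length][hidden_size]
    weights:         [hidden_size][num_labels]  (hypothetical tagging head)
    bias:            [num_labels]
    returns:         [batch_size][sequence_length][num_labels]
    """
    num_labels = len(bias)
    return [
        [
            [
                sum(h * weights[k][j] for k, h in enumerate(token)) + bias[j]
                for j in range(num_labels)
            ]
            for token in sequence
        ]
        for sequence in sequence_output
    ]


def predict_tags(logits):
    """Pick the highest-scoring label per token: [batch_size][sequence_length]."""
    return [
        [max(range(len(token)), key=token.__getitem__) for token in sequence]
        for sequence in logits
    ]


# Toy input: batch_size=1, sequence_length=2, hidden_size=2, num_labels=2
sequence_output = [[[1.0, 0.0], [0.0, 1.0]]]
weights = [[2.0, -1.0], [-1.0, 2.0]]
bias = [0.0, 0.0]

logits = tag_logits(sequence_output, weights, bias)
tags = predict_tags(logits)
```

In this sketch, the per-token label indices in tags have one entry per token, which is the [batch_size * sequence_length] layout asked about; in a real setup the same computation would live in the separate graph, fed with the sequence output.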

You can have a look at AdditionalOutputLayer, which takes [batch_size, hidden_size] from model.get_pooled_output() and then applies a dot product to produce a [batch_size, 1] vector.
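To make the shapes concrete, the computation described for AdditionalOutputLayer can be sketched roughly as below. This is not the layer's actual source (cuBERT itself is not written in Python): the function name additional_output and the weight vector are hypothetical, and plain Python lists stand in for tensors.

```python
def additional_output(pooled_output, output_weights):
    """Dot each pooled vector with one weight vector.

    pooled_output:  [batch_size][hidden_size]
    output_weights: [hidden_size]  (hypothetical weights)
    returns:        [batch_size][1]
    """
    return [
        [sum(h * w for h, w in zip(row, output_weights))]
        for row in pooled_output
    ]


# Toy input: batch_size=2, hidden_size=3
pooled_output = [[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]]
output_weights = [1.0, 0.0, 2.0]

scores = additional_output(pooled_output, output_weights)
```

A tagging variant would presumably apply the same kind of product per token on the sequence output instead of once per example on the pooled output.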