Does cuBERT support sequence tagging output?

Currently, we only support the standard BERT graph and the standard BERT outputs. In your case, I guess the closest output type is model.get_sequence_output(), with shape [batch_size, sequence_length, hidden_size].

You can fork our repository and modify it if you want [batch_size * sequence_length] output (I guess you put some additional layers after standard BERT). Alternatively, you can take our model.get_sequence_output() and do the remaining computation in a separate graph with TensorFlow.
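For illustration only, here is a rough sketch of what the extra tagging layers could compute on top of the sequence output. It assumes a single per-token linear layer followed by an argmax; the function name tag_tokens and the weights/bias parameters are hypothetical and are not part of cuBERT. It uses plain Python lists so the shapes are easy to follow; in practice you would run this step in TensorFlow or a similar library.

```python
from typing import List


def tag_tokens(
    sequence_output: List[List[List[float]]],  # [batch_size][sequence_length][hidden_size]
    weights: List[List[float]],                # [hidden_size][num_labels] (hypothetical tagging layer)
    bias: List[float],                         # [num_labels]
) -> List[List[int]]:
    """Return one predicted label id per token: [batch_size][sequence_length]."""
    num_labels = len(bias)
    tagged = []
    for sequence in sequence_output:
        labels = []
        for token in sequence:
            # Linear projection of one token vector onto the label space.
            logits = [
                sum(h * weights[i][k] for i, h in enumerate(token)) + bias[k]
                for k in range(num_labels)
            ]
            # Pick the label with the highest score.
            labels.append(max(range(num_labels), key=logits.__getitem__))
        tagged.append(labels)
    return tagged


# Toy input: batch_size=1, sequence_length=2, hidden_size=2, num_labels=2.
toy_sequence_output = [[[1.0, 0.0], [0.0, 1.0]]]
toy_weights = [[2.0, 0.0], [0.0, 3.0]]
toy_bias = [0.0, 0.0]
print(tag_tokens(toy_sequence_output, toy_weights, toy_bias))
```

Flattening the nested result gives the [batch_size * sequence_length] layout mentioned above.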

You can have a look at AdditionalOutputLayer, which takes the [batch_size, hidden_size] tensor from model.get_pooled_output() and applies a dot product to produce a [batch_size, 1] vector.
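To make the shapes concrete, here is a minimal sketch of that kind of dot product in plain Python. It only mirrors the shape arithmetic described above and is not the actual AdditionalOutputLayer code; the function name pooled_dot and the weight vector are assumptions made for the example.

```python
from typing import List


def pooled_dot(
    pooled_output: List[List[float]],  # [batch_size][hidden_size]
    weight: List[float],               # [hidden_size] (hypothetical output weights)
) -> List[List[float]]:
    """Dot each pooled vector with the weight vector: result is [batch_size][1]."""
    return [[sum(h * w for h, w in zip(row, weight))] for row in pooled_output]


# Toy input: batch_size=2, hidden_size=2.
toy_pooled_output = [[1.0, 2.0], [3.0, 4.0]]
toy_weight = [0.5, -1.0]
print(pooled_dot(toy_pooled_output, toy_weight))
```

A tagging variant would do the same thing per token instead of per pooled vector, which is why the sequence output is the relevant starting point.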
