Open wolfshow opened 5 years ago
Currently we only support the standard BERT graph and the standard BERT outputs. In your case, I guess the most relevant output is model.get_sequence_output(), with shape [batch_size, sequence_length, hidden_size].
If you want [batch_size * sequence_length] (I guess you put some additional layers on top of standard BERT), you can fork our repository and modify it. Alternatively, you can take our model.get_sequence_output() and do the remaining computation in a separate TensorFlow graph.
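To illustrate the second option, here is a minimal sketch of the extra computation for a tagging head, written in NumPy rather than TensorFlow so it stays self-contained. It assumes the additional layer is a single linear projection per token; `token_logits`, `weights`, and `bias` are hypothetical names, not part of cuBERT, and the random array only stands in for what model.get_sequence_output() would return.

```python
import numpy as np

def token_logits(sequence_output, weights, bias=0.0):
    """Project each token's hidden vector to one scalar logit.

    sequence_output: [batch_size, sequence_length, hidden_size]
    weights:         [hidden_size]  (hypothetical tagging-head weights)
    returns:         [batch_size * sequence_length]
    """
    batch_size, seq_len, _ = sequence_output.shape
    # Dot product over the hidden dimension: [batch_size, sequence_length].
    logits = sequence_output @ weights + bias
    # Flatten to the [batch_size * sequence_length] layout.
    return logits.reshape(batch_size * seq_len)

# Stand-in data; in practice this would be the sequence output from the model.
rng = np.random.default_rng(0)
sequence_output = rng.standard_normal((2, 4, 8)).astype(np.float32)
weights = rng.standard_normal(8).astype(np.float32)

flat_logits = token_logits(sequence_output, weights)
print(flat_logits.shape)  # (8,)
```

If your head has more than one layer, the same pattern applies: keep the [batch_size, sequence_length, hidden_size] tensor as the input and reshape only at the end.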
You can have a look at AdditionalOutputLayer, which takes the [batch_size, hidden_size] tensor from model.get_pooled_output() and applies a dot product to produce a [batch_size, 1] vector.
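As a rough picture of the shapes involved, the pooled-output step described above could look like the following NumPy sketch. It is based only on the description in this thread, not on the actual AdditionalOutputLayer code; `pooled_score` and `output_weights` are hypothetical names.

```python
import numpy as np

def pooled_score(pooled_output, output_weights):
    """Dot each pooled vector with one weight vector.

    pooled_output:  [batch_size, hidden_size]
    output_weights: [hidden_size]  (hypothetical)
    returns:        [batch_size, 1]
    """
    # [batch_size, hidden_size] @ [hidden_size] gives [batch_size].
    scores = pooled_output @ output_weights
    # Keep a trailing axis of size 1.
    return scores.reshape(-1, 1)

# Stand-in for what model.get_pooled_output() would return.
rng = np.random.default_rng(0)
pooled_output = rng.standard_normal((3, 8)).astype(np.float32)
output_weights = rng.standard_normal(8).astype(np.float32)

scores = pooled_score(pooled_output, output_weights)
print(scores.shape)  # (3, 1)
```

A tagging head differs mainly in that the dot product runs once per token over the sequence output instead of once per example over the pooled output.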
I am working on a sequence tagging task, where the logits output should have shape [batch_size * sequence_length]. Does cuBERT support that?