Open — Han8931 opened this issue 4 years ago
Hi Han8931.
I think the double-head model is trained to solve two tasks: language modeling (seq2seq) and label classification (seq2label). You can check the code in their transformer module.
In interact.py we don't need to predict the label, only to generate a new answer, so the LMHead alone is enough (seq2seq).
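To make the relationship concrete, here is a minimal sketch (pure Python, all key names hypothetical and only meant to mirror the usual HuggingFace-style state-dict layout) of why weights trained with a double-head model can be reused by an LM-head-only model: the shared transformer and LM-head parameters are a subset of the double-head checkpoint, and the extra classification head is simply dropped at load time.

```python
# Hypothetical double-head checkpoint: shared transformer weights,
# an LM head, and an extra classification head used only in training.
double_head_state = {
    "transformer.wte.weight": "...",       # shared token embeddings
    "transformer.h.0.attn.weight": "...",  # shared transformer block
    "lm_head.weight": "...",               # language-modeling head
    "multiple_choice_head.weight": "...",  # label-classification head
}

# The LM-head-only model declares just the parameters it actually uses.
lm_head_keys = {
    k for k in double_head_state if not k.startswith("multiple_choice_head")
}

# Loading keeps only the keys the LM-head model knows about; the
# classification head is silently discarded.
lm_head_state = {k: v for k, v in double_head_state.items() if k in lm_head_keys}
print(sorted(lm_head_state))
# → ['lm_head.weight', 'transformer.h.0.attn.weight', 'transformer.wte.weight']
```

In the real library this pruning is what `load_state_dict` with non-strict matching effectively does: the generation-time model shares all its weights with the training-time model, so nothing learned for generation is lost.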
Hello.
I am trying to run your model and I have some confusion about your pre-trained model.
It seems that train.py trains the model with the double-head model, but interact.py loads the LMHead model.
Why do they use different models?
So in the pre-trained model, next-sentence classification is actually not implemented?