This patch contains the code discussed in #24.
Its goal is to let users apply this model to many different tasks, as presented in the research paper. For example, if you want to fine-tune the network to classify texts, you just have to create a DoubleHeadModel with a classification head and use the ClassificationLossCompute class.
The SimilarityHead has not been tested yet, and SimilarityLossCompute is missing because I don't know how this kind of task works.
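
To make the double-head idea concrete, here is a rough sketch of how a classification loss can be combined with an auxiliary language-modeling loss, as the paper does for fine-tuning. This is not the code in this patch: `ToyClassificationLoss`, `cross_entropy`, and the `lm_coef` parameter are hypothetical names used only for illustration, the value 0.5 is just an illustrative default, and I'm assuming plain Python lists of logits instead of tensors to keep it self-contained.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


class ToyClassificationLoss:
    """Hypothetical stand-in, NOT the patch's ClassificationLossCompute."""

    def __init__(self, lm_coef=0.5):
        # Weight of the auxiliary language-modeling loss (assumed value).
        self.lm_coef = lm_coef

    def __call__(self, lm_logits, lm_targets, clf_logits, clf_target):
        # LM head: average next-token loss over all positions.
        lm_loss = sum(
            cross_entropy(logits, target)
            for logits, target in zip(lm_logits, lm_targets)
        ) / len(lm_targets)
        # Task head: one classification loss for the whole sequence.
        clf_loss = cross_entropy(clf_logits, clf_target)
        return clf_loss + self.lm_coef * lm_loss


# One sequence position over a 4-token vocabulary, and a 2-class task head.
compute_loss = ToyClassificationLoss(lm_coef=0.5)
loss = compute_loss(
    lm_logits=[[0.0, 0.0, 0.0, 0.0]],
    lm_targets=[1],
    clf_logits=[0.0, 0.0],
    clf_target=0,
)
```

The point is only that both heads share one forward pass and the task-specific loss class decides how their losses are weighted; a similarity task would presumably need its own variant of the second term.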