I'm fine-tuning a BERT model with a contrastive loss, to later use it on an imbalanced binary classification task. Let M be the set of sentences with label 1 and N the set of sentences with label 0; the 1:0 ratio is 0.3.
For each sentence in M, I randomly select 10 negative samples from N to construct 20 negative contrastive examples, aiming for the best class separation. For each sentence in M, I also select 10 positive samples from M, and for each sentence in N, I select 10 positive samples from N.
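The sampling scheme above can be sketched as follows; this is a minimal illustration, not my actual code, and the names `build_pairs` and `k` are hypothetical (in my setup k=10):

```python
import random

def build_pairs(pos_sentences, neg_sentences, k=3, seed=0):
    """Hypothetical sketch of the pair-sampling scheme described above.

    For each label-1 sentence: k cross-class (negative) pairs and
    k same-class (positive) pairs; for each label-0 sentence: k
    same-class (positive) pairs. Returns (sent_a, sent_b, pair_label)
    tuples, where pair_label is 1 for a positive pair and 0 for a
    negative pair.
    """
    rng = random.Random(seed)
    pairs = []
    for s in pos_sentences:
        # cross-class partners from N -> negative contrastive pairs
        for other in rng.sample(neg_sentences, k):
            pairs.append((s, other, 0))
        # same-class partners from M (excluding s) -> positive pairs
        for other in rng.sample([p for p in pos_sentences if p != s], k):
            pairs.append((s, other, 1))
    for s in neg_sentences:
        # same-class partners from N (excluding s) -> positive pairs
        for other in rng.sample([n for n in neg_sentences if n != s], k):
            pairs.append((s, other, 1))
    return pairs
```

With |M| sentences of label 1 and |N| of label 0, this yields (2·|M| + |N|)·k pairs in total.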
The training set is around 170k examples, and the dev set is constructed the same way. Since there's no support for an early-stopping callback, how many epochs should I train the model for?
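Since the callback isn't available, one workaround I've been considering is a manual patience-based loop: evaluate dev loss after each epoch and stop once it stops improving. A generic sketch (the per-epoch training/evaluation is stubbed out as a sequence of dev losses; all names are hypothetical):

```python
def pick_best_epoch(dev_losses, patience=2):
    """Patience-based stopping over per-epoch dev losses.

    `dev_losses[i]` stands in for "train epoch i, then evaluate on dev".
    Stops once the dev loss has failed to improve for `patience`
    consecutive epochs; returns (best_epoch, best_loss).
    """
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(dev_losses):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # dev loss degraded for `patience` epochs in a row
    return best_epoch, best_loss

# e.g. dev loss improves for three epochs, then degrades:
# pick_best_epoch([0.9, 0.7, 0.6, 0.65, 0.7])  # -> (2, 0.6)
```

In a real run, the checkpoint saved at `best_epoch` would be the one to keep.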