broken-dream opened this issue 2 years ago
I think you may want to take a look at data.py and datasets.py, which are the modules that load your dataset into the training pipeline.
@tarvaina Sorry to bother you several years on, but I have this same question. During training, when some of the samples in each batch are unlabeled (label -1), how is it that the class loss is calculated on both labeled and unlabeled samples?
In this line here:
class_loss = class_criterion(class_logit, target_var) / minibatch_size
we have class_logit, which for a batch size of 8 should be of dimension 8 x 1000. Then we have target_var, which is a tensor of length 8 holding the class labels (e.g. [-1, -1, -1, -1, 2, 4, 5, 5]).
Can you clarify how this is working? Why are the unlabeled samples being taken into account? Thank you so much!
Replying in case this is helpful for others: when the class_criterion is created, NO_LABEL labels are set to be ignored. I believe this handles things correctly!
class_criterion = nn.CrossEntropyLoss(reduction='sum', ignore_index=NO_LABEL).cuda()
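If anyone wants to verify this for themselves, here is a minimal standalone sketch (not taken from the repo; I'm assuming NO_LABEL = -1 as in the training code) showing that ignore_index makes the unlabeled entries contribute nothing to the summed loss:

```python
import torch
import torch.nn as nn

NO_LABEL = -1  # assumed sentinel for unlabeled samples, as in the repo

criterion = nn.CrossEntropyLoss(reduction='sum', ignore_index=NO_LABEL)

logits = torch.randn(8, 1000)  # batch of 8, 1000 classes
targets = torch.tensor([-1, -1, -1, -1, 2, 4, 5, 5])

# Loss over the full batch: the four NO_LABEL entries contribute nothing...
full = criterion(logits, targets)

# ...so it equals the loss computed over the labeled half alone.
labeled = criterion(logits[4:], targets[4:])

assert torch.allclose(full, labeled)
```

One thing worth noting: the division in class_loss = class_criterion(class_logit, target_var) / minibatch_size is by the full batch size, not by the number of labeled samples, so the supervised loss is effectively down-weighted when more of the batch is unlabeled. As far as I can tell, that is intentional in the original code.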
I want to transfer the MT framework to an NLP task, but I don't understand how to train it with unlabeled data. I've got the idea of the paper, but I'm confused about the implementation.
I notice that the TwoStreamBatchSampler divides the dataset into a labeled part and an unlabeled part, but the code above seems to handle both labeled and unlabeled data in a uniform way. I think only the labeled part of model_out should be used to calculate the class_loss. Did I get it wrong? A simplified sketch of how I understand the sampler is below.
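To make the question concrete, here is my own reconstruction of the idea behind TwoStreamBatchSampler (not the repo's exact code; the class and parameter names here are illustrative). Each batch mixes indices from both streams, and the class loss then relies on ignore_index to skip the unlabeled entries:

```python
import itertools
import numpy as np
from torch.utils.data import Sampler

class TwoStreamBatchSamplerSketch(Sampler):
    """Simplified sketch: every batch combines primary (unlabeled) and
    secondary (labeled) indices, so the class loss always sees some
    labeled targets while the consistency loss sees the whole batch."""

    def __init__(self, primary_indices, secondary_indices,
                 batch_size, secondary_batch_size):
        self.primary_indices = primary_indices      # unlabeled sample indices
        self.secondary_indices = secondary_indices  # labeled sample indices
        self.secondary_batch_size = secondary_batch_size
        self.primary_batch_size = batch_size - secondary_batch_size

    def __iter__(self):
        # one pass over the (large) unlabeled stream per epoch
        primary = iter(np.random.permutation(self.primary_indices))
        # cycle through the (smaller) labeled stream as often as needed
        secondary = itertools.cycle(np.random.permutation(self.secondary_indices))
        while True:
            batch = list(itertools.islice(primary, self.primary_batch_size))
            if len(batch) < self.primary_batch_size:
                return
            batch += list(itertools.islice(secondary, self.secondary_batch_size))
            yield batch

    def __len__(self):
        return len(self.primary_indices) // self.primary_batch_size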