Closed secsilm closed 3 years ago
Hi, unfortunately, this does seem to be a bug in the constrained-training portion of the code - thanks for pointing it out. It still seems to work because the constraints are applied after the warmup phase, by which point the model has already been trained with the cross-entropy loss. Since the reported results can only be replicated with this version of the code, I am hesitant to change it. I will update the README to reflect this issue. My apologies for the confusion!
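The schedule described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function and argument names (`step_loss`, `warmup_epochs`, etc.) are made up for the example; only the behavior (cross-entropy during warmup, constrained loss afterwards) follows the comment.

```python
def step_loss(epoch: int, ce_loss: float, const_loss: float,
              warmup_epochs: int = 5) -> float:
    """Illustrative loss selection per training step (names are hypothetical)."""
    if epoch < warmup_epochs:
        # Warmup phase: the model is trained with cross-entropy only,
        # so it already fits the data before constraints kick in.
        return ce_loss
    # After warmup, the released code effectively does `loss = const_loss`,
    # overwriting the cross-entropy term instead of adding to it.
    return const_loss
```

Because the warmup phase has already optimized the cross-entropy objective, dropping it afterwards still "works" in practice, which is why the bug went unnoticed.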
In the paper, the final loss is obtained by adding the constrained loss to the cross-entropy loss:

(equation image from the paper omitted)

But in `model.py`, it seems that only the constrained loss is used. If I understand the paper correctly, should

`loss = const_loss`

be changed to

`loss += const_loss`

?
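The proposed fix can be stated as a one-liner. This is a sketch under the assumption that `loss` already holds the cross-entropy term and `const_loss` the constraint term, as in the snippet above; the paper's final loss is then their sum.

```python
def final_loss(ce_loss: float, const_loss: float) -> float:
    # Paper: final loss = cross-entropy loss + constrained loss.
    # i.e. the fix is `loss += const_loss`, not `loss = const_loss`.
    return ce_loss + const_loss
```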