Open EATMustard opened 3 months ago
PyTorch's `BCEWithLogitsLoss` applies sigmoid internally, but sigmoid is also applied in the model code, so it ends up being applied twice. Is this a bug?
```python
class TokenConfidence(nn.Module):
    def __init__(self, dim: int) -> None:
        super().__init__()
        self.token = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())  # sigmoid once
        self.loss_fn = nn.BCEWithLogitsLoss(reduction="none")  # sigmoid twice
```
I think this is a bug. I tried removing `nn.Sigmoid` from `self.token` and applying `torch.sigmoid` in `forward()` instead, but the results are almost the same. Have you tried this?
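For reference, here is a minimal standalone sketch (not from the repository) showing what the mismatch looks like numerically: `BCEWithLogitsLoss` on raw logits matches `BCELoss` on sigmoid outputs, while feeding already-sigmoided values into `BCEWithLogitsLoss` computes a different (flatter) loss, since sigmoid is effectively applied twice.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4)                          # raw model outputs
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
probs = torch.sigmoid(logits)                    # outputs after nn.Sigmoid()

# Correct pairings: BCEWithLogitsLoss on logits, or BCELoss on
# probabilities. These two are mathematically equivalent.
loss_logits = nn.BCEWithLogitsLoss()(logits, targets)
loss_probs = nn.BCELoss()(probs, targets)

# The pattern in the issue: BCEWithLogitsLoss applied to values that
# already went through sigmoid, i.e. sigmoid(sigmoid(x)).
loss_double = nn.BCEWithLogitsLoss()(probs, targets)

print(torch.allclose(loss_logits, loss_probs))   # equivalent pairings agree
print(torch.allclose(loss_logits, loss_double))  # double sigmoid differs
```

The loss still decreases in the right direction under the double sigmoid (it remains monotone in the prediction), which may explain why the training results look almost the same either way; the gradients are just scaled and compressed.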