Open lsy641 opened 2 years ago
That represents the token index in the output sequence. Your two examples come from different models. If I remember correctly, the first line is used when the model is supposed to output "answer: positive/negative", so index 2 selects the logits over the vocabulary at output position 2, and 1465 and 2841 are the vocab ids for "positive" and "negative" respectively.
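For reference, the indexing can be sketched like this; the shapes and the vocab ids 1465/2841 are assumptions taken from the snippet in this thread, not verified against the actual model:

```python
import torch

# Assumed shape: (batch, output_seq_len, vocab_size), as the thread describes.
batch, seq_len, vocab_size = 1, 4, 3000
torch.manual_seed(0)
lm_logits_start = torch.randn(batch, seq_len, vocab_size)

# Index 2 is the position in the OUTPUT sequence; 1465 and 2841 are the
# (assumed) vocab ids of "positive" and "negative".
pos_logit = lm_logits_start[:, 2, 1465].view(-1, 1)  # logit of "positive" at position 2
neg_logit = lm_logits_start[:, 2, 2841].view(-1, 1)  # logit of "negative" at position 2

# A positive difference means the model prefers "positive" over "negative".
score = pos_logit - neg_logit
print(score.shape)  # one scalar score per batch element, shape (batch, 1)
```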
Thank you. I have one more question. Why is the target always "answer: positive/negative"?
Not sure what you mean here; where do you see the target always being "answer: positive/negative"?
```python
lm_logits_start[:, 2, 1465].view(-1, 1) - lm_logits_start[:, 2, 2841].view(-1, 1)
lm_logits_start[:, 3, self.discrete_value_ids[0]].view(-1, 1)
```

What do indices 2 and 3 map to?