allenai/tracie

What do positions 2 and 3 mean in the decoder_output? #4

Open lsy641 opened 2 years ago

lsy641 commented 2 years ago

```python
lm_logits_start[:, 2, 1465].view(-1, 1) - lm_logits_start[:, 2, 2841].view(-1, 1)
lm_logits_start[:, 3, self.discrete_value_ids[0]].view(-1, 1)
```

What do indices 2 and 3 correspond to?

Slash0BZ commented 2 years ago

That represents the token index in the output sequence. Your two examples come from different models. If I remember correctly, the first line is used when the model is supposed to output "answer: positive/negative," so index 2 selects the vocabulary logits at output position 2, with 1465 and 2841 being the vocab ids for "positive" and "negative" respectively.
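For concreteness, a minimal sketch of that indexing, assuming a generic HuggingFace T5 checkpoint and placeholder input strings (the vocab ids 1465/2841 come from the snippet above; the exact checkpoint and prompt format here are illustrative, not the TRACIE ones):

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Hypothetical setup; the TRACIE checkpoints expose the same T5 interface.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer("event: ... story: ...", return_tensors="pt")
labels = tokenizer("answer: positive", return_tensors="pt").input_ids

with torch.no_grad():
    out = model(input_ids=inputs.input_ids, labels=labels)
lm_logits = out.logits  # shape: (batch, target_len, vocab_size)

# logits[:, i, :] scores the i-th target token. If "answer:" tokenizes to two
# pieces, position 2 holds the positive/negative token, so the comparison from
# the issue reduces to a logit difference at that position:
score = lm_logits[:, 2, 1465].view(-1, 1) - lm_logits[:, 2, 2841].view(-1, 1)
print(score)
```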

lsy641 commented 2 years ago

> That represents the token index in the output sequence. Your two examples come from different models. If I remember correctly, the first line is used when the model is supposed to output "answer: positive/negative," so index 2 selects the vocabulary logits at output position 2, with 1465 and 2841 being the vocab ids for "positive" and "negative" respectively.

Thank you. I have one more question. Why is "answer: positive" followed by one particular `<extra_id_*>` sentinel always the target of input_start? I mean, why can't it be "answer: positive" with a different `<extra_id_*>`? And in the pre-training duration data, does the `<extra_id_*>` token really serve as supervision?

Slash0BZ commented 1 year ago

Not sure what you mean here; where do you see the target always being the same `<extra_id_*>`? These IDs are actually used during the pre-training stage, so there are semantics associated with the different extra ids.
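For reference, T5's tokenizer reserves sentinel tokens `<extra_id_0>` through `<extra_id_99>`, each with its own fixed vocab id, which is what allows different extra ids to carry different meanings as supervision targets. A minimal sketch, assuming the HuggingFace tokenizer:

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Each sentinel maps to a distinct, fixed vocab id, so the model can learn
# a separate semantic for each one during pre-training.
for tok in ["<extra_id_0>", "<extra_id_1>", "<extra_id_2>"]:
    print(tok, tokenizer.convert_tokens_to_ids(tok))
```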