Is `__SILENCE__` on line 170 of teachers.py there specifically for this purpose, as mentioned in #2188?

> We usually handle this by inserting a fake `__SILENCE__` turn
@stephenroller Thank you for the reply! Does that mean `__SILENCE__` is processed as a special token by all ParlAI models? And if we add a fake `__SILENCE__` turn, doesn't that make the model also learn to generate X1 from `__SILENCE__`?
Could you also give me some insight on how I can do this, from the original post:

> Can we train a model to learn only a subset of turns while providing the full conversation history? For instance, if I have X1 -> Y1 -> X2 -> Y2 -> X3 -> Y3 and only want my model to learn Y3, is there a way to do that with ParlAI? More specifically, can I provide all the previous turns as context only, with the understanding that X1 and X2 came from speaker X and Y1, Y2 came from speaker Y, without actually providing them as training samples when learning how to generate Y3?
Yes, we teach the model p(x1|silence) in order to teach it x1.
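Concretely, a minimal sketch of what that fake turn could look like in the ParlAI Dialog Format, when modeling speaker X in a conversation X1 -> Y1 -> X2 -> Y2 (fields are tab-separated, shown here as `\t`; the `X1`/`Y1`/`X2` strings are placeholders for real utterances):

```
text:__SILENCE__\tlabels:X1
text:Y1\tlabels:X2\tepisode_done:True
```

The first line teaches p(X1 | __SILENCE__); on the second line, the history accumulated within the episode means X2 is generated with both X1 and Y1 in context.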
@stephenroller Thank you!
Could you also give me insight on a related question: does the current ParlAI framework allow training a model to learn only a subset of turns while providing the full conversation history?

For instance, if I have X1 -> Y1 -> X2 -> Y2 -> X3 -> Y3 and only want my model to learn Y3, is there a way to do that with ParlAI? More specifically, can I provide all the previous turns as context only, with the understanding that X1 and X2 came from speaker X and Y1, Y2 came from speaker Y, without actually providing them as training samples when learning how to generate Y3?
Your best bet there is to flatten the dataset and only utilize the final turn. See the flatten mutator for a sketch of the solution (without the filtering).
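A rough, untested sketch of what that could look like, modeled on the pattern in parlai/mutators/flatten.py (the mutator name `flatten_last_turn` and the filtering step are illustrative additions, not built-in ParlAI code):

```python
from typing import List

from parlai.core.message import Message
from parlai.core.mutators import ManyEpisodeMutator, register_mutator


@register_mutator("flatten_last_turn")  # hypothetical name for this sketch
class FlattenLastTurnMutator(ManyEpisodeMutator):
    """
    Flatten each episode so every turn carries the full history in its text,
    then keep only the final turn as a training example.
    """

    def many_episode_mutation(self, episode: List[Message]) -> List[List[Message]]:
        history: List[str] = []
        flattened: List[List[Message]] = []
        for message in episode:
            # Fold the incoming text into the running history.
            history.append(message.pop('text'))
            message['text'] = '\n'.join(history)
            # The gold label also becomes context for later turns.
            labels = message.get('labels') or message.get('eval_labels')
            if labels:
                history.append(labels[0])
            flattened.append([message])
        # Keep only the last single-turn episode: the model trains on Y3
        # alone, while its context still contains X1..X3 and Y1, Y2.
        return flattened[-1:]
```

If this is registered, it should be applicable with the `--mutators` flag, e.g. `parlai train_model -t empathetic_dialogues --mutators flatten_last_turn ...`.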
@stephenroller awesome, I'll take a look. thank you so much for your quick responses!
Bug description
Given a conversation with X1 -> Y1 -> X2 -> Y2, how can we make sure that, when we're modeling X's responses, the first training sample is not simply Y1 -> X2 but also includes X1 in the context, so that the model is trained to generate X2 given both X1 and Y1?
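To make the desired behavior concrete, here is a small illustration in plain Python (the utterance strings are hypothetical placeholders) of the training pairs I would expect when modeling speaker X:

```python
# Conversation: X and Y alternate, X speaks first.
conversation = ["X1", "Y1", "X2", "Y2"]

# Desired training pairs when modeling speaker X: the context for each
# of X's turns is *everything* said before it, not just the last turn.
pairs = []
for i, utterance in enumerate(conversation):
    if i % 2 == 0:  # X's turns are at even indices
        context = "\n".join(conversation[:i])  # empty for X1
        pairs.append((context, utterance))

print(pairs)
# [('', 'X1'), ('X1\nY1', 'X2')]
# i.e. X2 should be conditioned on both X1 and Y1, not Y1 alone.
```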
Based on the implementation in parlai/core/teachers.py, it doesn't seem like the ParlAIDialogTeacher using the ParlAI Dialog Format handles this case, nor does the Conversation Teacher using the Conversation Format.
I'm looking at the episodes from the empathetic dialogues dataset, and the second episode below seems to start without any context about the first turn of the conversation. Is it possible to provide the first turn as context, and if so, how can I do it with the ParlAI Dialog Format or Conversation Format? Even if I set `opt['label_turn']` to `'both'` for the Conversation Teacher, I think this implementation shows that the first turn will be dropped for the second speaker.

I know this bug misses at most one turn (X1) per conversation when modeling X's responses, but X1 may often contain important information relevant to the next responses.

Also, I want to know if there may be a way to train a model to learn only a subset of turns while providing the full conversation history. For instance, if I have X1 -> Y1 -> X2 -> Y2 -> X3 -> Y3 and only want my model to learn Y3, is there a way to do that with ParlAI? More specifically, can I provide all the previous turns as context only, with the understanding that X1 and X2 came from speaker X and Y1, Y2 came from speaker Y, without actually providing them as training samples?
Reproduction steps

Command: `parlai display_data -t empathetic_dialogues`
Expected behavior

The first training sample for the second speaker should include X1 as context, i.e., the model learns to generate X2 given both X1 and Y1.
Additional context

I want to make sure that the dialogue systems I train with ParlAI are, for each turn, being trained with the full context that is available.