Open Evraa opened 2 years ago
Error:
Traceback (most recent call last):
File "generate.py", line 283, in
You can try lowering the version of transformers, for example transformers==3.1.0.
Thank you for your fast reply.
It worked. But got stuck in another problem concerning versions, I guess.
Current versions: torch = 1.7.0, transformers = 3.1.0, pickle = 4.0, regex = 2.5.103
Error:
Traceback (most recent call last):
File "generate.py", line 283, in
Please don't refer me to issue #2. It didn't work for me, and it's in Chinese, which I'm not familiar with :'D.
thank you
You can try regex==2018.1.10. It should work.
Thank you very much.
Could you please tell me how to structure test.refs.txt? Is it just sentences separated by '\n'? And what is the minimum/maximum number of utterances allowed?
thank you in advance
The structure is like the following:
utterance1 EOS utterance2 EOS utterance3 \t reference1 \t reference2 \t reference3 \t\n
There is no specific constraint on the minimum/maximum number of utterances.
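Based on the format described above, a line could be parsed with a small helper like this (parse_refs_line is a hypothetical name; the separators follow the description above, not the actual repo code):

```python
def parse_refs_line(line):
    """Split a test.refs.txt line into context utterances and reference responses.

    Assumes the format "utt1 EOS utt2 ... \t ref1 \t ref2 \t ... \t\n".
    """
    # Tab-separated fields: the first is the context, the rest are references.
    fields = [f.strip() for f in line.rstrip("\n").split("\t") if f.strip()]
    context = [u.strip() for u in fields[0].split("EOS")]
    references = fields[1:]
    return context, references

context, refs = parse_refs_line(
    "What's your hobby? EOS I like basketball.\tReading. What about you?\tTell me yours first.\t\n"
)
# context -> ["What's your hobby?", "I like basketball."]
# refs    -> ["Reading. What about you?", "Tell me yours first."]
```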
I'm sorry, what are "references"? Could you provide an example?
Also, this error came up:
Traceback (most recent call last):
File "generate.py", line 298, in
In the dialogue dataset, there are many possible responses. The responses collected in advance are the references. For example:
utterance1: What's your hobby?
reference1: I like basketball.
reference2: Reading. What about you?
reference3: Tell me yours first.
As for the error, I have never encountered it. Maybe you can try downgrading transformers to 3.0.0 or 2.7.0. I'm not sure.
It worked with transformers version 3.0.0, but not with 2.7.0 or 3.1.0.
One last question: what exactly does the "generate.py" script produce? Taking a dialogue and some reference responses, what exactly is the output hypstr[0]?
Thanks in advance.
Follow-up question: what is the purpose of the variable "responses" in "generate.py", line 293?
Hi Zekang,
Thanks for providing the source code.
I just followed this post. Suppose I don't want to use the older version of transformers due to a Python environment issue; then pre-training DialoFlow based on GPT-2 using DailyDialog is also possible, although the results would be worse than the reported ones, right? That is, the effectiveness of DialoFlow is independent of whether it was pre-trained on the Reddit dataset.
Thanks in advance.
Best,
Dong
Follow-up question: what is the purpose of the variable "responses" in "generate.py", line 293?
Hi Evraa,
Based on my understanding, the evaluation task is to generate a response given a context. For example,
context = [utterance1], response = [[utterance2], [utterance3], [utterance4], ....].
I have another question regarding the 'generate.py' script, if you are willing to answer it. Given speaker1's utterance, how do we know which utterances correspond to speaker2's reply and speaker1's next response? Since this is multi-turn dialogue generation, I suppose the output should contain more than one utterance.
Please see the following example, where a context, ground-truth responses, and generated responses are shown. For the generated responses, the correspondence is not clear to me.
Context:
["We've managed to reduce our energy consumption in our factory by about 15 per cent in the last two years ."]
Ground-truth responses:
["That's excellent . How have you managed that ?", "Mainly because we've invested in a heat recovery system .", 'What does that mean exactly ?', 'Well , we use the exhaust gases from our printing presses to provide energy to heat our dryers .', 'What other sources of energy do you use ?', "We don't use any fossil fuels . Most of our power comes from hydro-electric plants . We're hoping to use even more energy from alternative sources in the future - perhaps even wind power ."]
Generated responses:
["Does that mean that we can't afford to pay for more? We can't afford to pay for more than we can afford. Why not? We can't afford to pay for more. Why can't we? We can't afford to pay for more."]
As the input indices are [speaker1, text1, eos, empty, speaker2, text2, eos, empty], one potential way is to comment out the following code in the 'generate.py' script. But I am not sure whether that is correct:
# if o in [eos, empty, speaker1, speaker2]:
#     continue
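For reference, the effect of that (uncommented) check can be sketched as a small filter that drops special-token ids from the decoded output. The ids below are hypothetical placeholders, not the real vocabulary ids:

```python
# Hypothetical ids for eos / empty / speaker1 / speaker2; the real ids
# come from the tokenizer's vocabulary in generate.py.
SPECIAL_IDS = {50256, 50257, 50258, 50259}

def strip_special_tokens(output_ids):
    """Drop any special-token ids so they never appear inside a response."""
    return [o for o in output_ids if o not in SPECIAL_IDS]
```

Commenting the check out would instead keep those special tokens in the decoded text, exposing the turn boundaries.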
Looking forward to hearing from you @Evraa, as well as @lizekang.
Best,
Dong
Hi, sorry for the late response. The effectiveness of DialoFlow is independent of the pre-training corpus. It should also be effective without pre-training on the Reddit dataset.
OK. Thanks a lot!
Regarding the correspondence to different speakers, there are two ways: 1) we insert special tokens like speaker1 and speaker2; 2) we use different segment embeddings for different speakers (see the function build_input_from_input in generate.py).
For your question: during generation, we don't want the model to generate special tokens inside the response, which is why the following code skips them:
# if o in [eos, empty, speaker1, speaker2]:
#     continue
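The two speaker-marking schemes described above (speaker tokens plus per-speaker segment ids) can be sketched roughly as follows. All token ids and the function shape here are hypothetical; the real logic lives in build_input_from_input in generate.py:

```python
# Hypothetical placeholder ids for the special tokens.
SPEAKER1, SPEAKER2, EOS, EMPTY = 1, 2, 3, 4

def build_input(utterances):
    """utterances: list of token-id lists, alternating speaker1 / speaker2."""
    input_ids, segment_ids = [], []
    for turn, utt in enumerate(utterances):
        speaker = SPEAKER1 if turn % 2 == 0 else SPEAKER2
        ids = [speaker] + utt + [EOS, EMPTY]   # scheme 1: speaker tokens
        input_ids += ids
        segment_ids += [turn % 2] * len(ids)   # scheme 2: segment ids (0/1)
    return input_ids, segment_ids
```

For example, build_input([[10, 11], [20]]) yields input ids [1, 10, 11, 3, 4, 2, 20, 3, 4] with segment ids [0, 0, 0, 0, 0, 1, 1, 1, 1], matching the 0-then-1 segment pattern shown later in the thread.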
Thank you for your prompt reply; let me see if I can explain my confusion clearly.
Regarding the correspondence to different speakers, there are two ways: 1) we insert special tokens like speaker1 and speaker2; 2) we use different segment embeddings for different speakers (function build_input_from_input in generate.py).
I agree with this part. During training, the input index is defined as [speaker1, text1, eos, empty, speaker2, text2, eos, empty], which shows the correspondence.
But it might be different during generation. For example, given the first utterance from speaker1, only the input index [speaker1, text1, eos, empty] is provided, together with the segment index.
Please see the segment result from your code, where '0' corresponds to speaker1 and '1' corresponds to speaker2. Longer outputs exhibit a similar pattern (just more 1's).
tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
My confusion: as this task is multi-turn dialogue generation, yet the output is essentially one round of dialogue. The model doesn't know how to switch the speaker identity any further; it only knows 0 -> 1, because that information is provided by the user.
if len(conv) % 2 == 1:
    current_output = [speaker2]
else:
    current_output = [speaker1]
During training, ground-truth sequences are provided, while during generation, sequences are generated in an autoregressive manner.
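The autoregressive generation mentioned above can be sketched minimally as follows, where next_token is a stand-in for the real model call (the actual decoding in generate.py uses the trained model and sampling/beam search):

```python
def generate_autoregressively(context_ids, next_token, max_len=20, eos=3):
    """Generate one response token at a time, conditioning on all history.

    next_token: callable taking the full id sequence so far and returning
    the next token id (stand-in for the model's forward pass).
    eos: hypothetical end-of-sequence id that terminates the response.
    """
    output = []
    while len(output) < max_len:
        tok = next_token(context_ids + output)  # condition on full history
        if tok == eos:
            break
        output.append(tok)
    return output
```

During training the ground-truth continuation is fed in (teacher forcing); at generation time each step sees only what the model itself produced, which is why the loop above rebuilds the input from context plus the partial output.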
Greetings,
Actually I'm surprised that such an error came up; my problem lies with this line:
model = torch.load("models/DialoFlow_large/model.bin")
model.bin is placed appropriately, and the EC2 instance runs CUDA 11.2 and PyTorch 1.9. Where could the problem come from?
Thanks in advance