thu-coai / EVA

EVA: Large-scale Pre-trained Chit-Chat Models
MIT License

Question about the initialization of the role embeds parameters #76

Closed by Charles-ux-bit 1 year ago

Charles-ux-bit commented 1 year ago

As shown below, I get the following warnings when loading the Hugging Face models. Will this affect decoding quality? The same warnings appear in eva_interactive.py. Thanks.

Some weights of EVAModel were not initialized from the model checkpoint at thu-coai/EVA1.0 and are newly initialized: ['decoder.role_embeds.weight', 'role_embeds.weight', 'encoder.role_embeds.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Some weights of EVAModel were not initialized from the model checkpoint at thu-coai/EVA2.0-base and are newly initialized: ['decoder.role_embeds.weight', 'role_embeds.weight', 'encoder.role_embeds.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Some weights of EVAModel were not initialized from the model checkpoint at thu-coai/EVA2.0-large and are newly initialized: ['decoder.role_embeds.weight', 'role_embeds.weight', 'encoder.role_embeds.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

t1101675 commented 1 year ago

We ended up not using role embeds. These parameters do not actually take part in the computation, so they will not affect decoding quality.
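A quick way to confirm that a warning like the one reported above only concerns the unused role embeds is to parse the list of names out of the message and check each one. This is a minimal sketch, not part of the EVA repository: the helper names `newly_initialized` and `only_role_embeds` are hypothetical, and the warning text is copied from the report in this thread.

```python
import ast
import re
from typing import List

# Warning text copied from the report in this thread (relevant part only).
WARNING = (
    "Some weights of EVAModel were not initialized from the model checkpoint "
    "at thu-coai/EVA2.0-base and are newly initialized: "
    "['decoder.role_embeds.weight', 'role_embeds.weight', "
    "'encoder.role_embeds.weight']"
)


def newly_initialized(warning: str) -> List[str]:
    """Hypothetical helper: extract the parameter names listed in the warning."""
    match = re.search(r"newly initialized: (\[[^\]]*\])", warning)
    return ast.literal_eval(match.group(1)) if match else []


def only_role_embeds(names: List[str]) -> bool:
    """Hypothetical helper: True if the list is non-empty and every name
    ends in 'role_embeds.weight'."""
    return bool(names) and all(
        name.split(".")[-2:] == ["role_embeds", "weight"] for name in names
    )


names = newly_initialized(WARNING)
print(names)
print(only_role_embeds(names))
```

If any other parameter name showed up in the list, the check would return `False`, which would be a reason to look more closely at the checkpoint.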

Charles-ux-bit commented 1 year ago

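OK, thank you very much. However, in actual use (passing thu-coai/EVA1.0, thu-coai/EVA2.0-base, thu-coai/EVA2.0-large, and thu-coai/EVA2.0-xlarge via the --load argument), the multi-turn dialogue quality is not very good with any of them. Some examples are below. Is this expected? Thanks.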

EVA 1.0 (all default parameters in arguments.py): [image 1]

EVA 2.0 large (default parameters in arguments.py, except that the number of beams for beam search is set to 5): [image 2]

t1101675 commented 1 year ago

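For EVA1.0 this is expected. It comes from the training data, which contains a fair amount of e-commerce data, so the reply to "您好" ("Hello") is very likely to be a customer-service style response. You could try a question like "今天天气怎么样" ("How is the weather today?"). This part of the training data was removed for 2.0.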

EVA2.0 large does not look quite normal; replies are usually not this long. You could try some other questions, or increase the length penalty a bit.
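How a length penalty changes which beam wins depends on the convention the implementation uses, so it is worth checking before tuning. The toy sketch below assumes one common convention (the one used by Hugging Face's beam search), where a finished hypothesis is scored as its summed log-probability divided by `length ** length_penalty`. Whether EVA's scripts follow the same convention is not stated in this thread. The function names and the two hypotheses are made up for illustration.

```python
from typing import List, Tuple

# (label, summed log-probability, length in tokens); made-up numbers.
Hypothesis = Tuple[str, float, int]

HYPS: List[Hypothesis] = [("short", -4.0, 5), ("long", -12.0, 20)]


def beam_score(sum_logprobs: float, length: int, length_penalty: float) -> float:
    """Assumed convention: summed log-probability / length ** length_penalty."""
    return sum_logprobs / (length ** length_penalty)


def pick(hyps: List[Hypothesis], length_penalty: float) -> str:
    """Return the label of the highest-scoring hypothesis."""
    best = max(hyps, key=lambda h: beam_score(h[1], h[2], length_penalty))
    return best[0]


for lp in (0.0, 1.0, 2.0):
    print(lp, pick(HYPS, lp))
```

Under this assumed convention, a penalty of 0.0 compares raw log-probabilities and the short hypothesis wins, while larger values divide by a larger power of the length and the long hypothesis wins. An implementation that applies the penalty the other way round would behave in the opposite direction.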

Charles-ux-bit commented 1 year ago

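OK. I set num_beam to 1 and it seems normal now. Could you advise on the best decoding scheme for the EVA models? Also, the knowledge question answering ability does not seem very strong. For example, when asked "北京在哪里" ("Where is Beijing?"), the reply is "北京在心里" ("Beijing is in my heart"). The model is trained from scratch on dialogue data, and the data does not contain much knowledge-QA-style data, right? Many thanks!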

Jiaxin-Wen commented 1 year ago
  1. For the best decoding scheme, see the paper, though you can also adjust it as appropriate.
  2. Yes.
Charles-ux-bit commented 1 year ago

OK, thank you very much.