Garbled text displayed when testing after training

qhduan / just_another_seq2seq

Just another seq2seq repo

Open fire717 opened 6 years ago

fire717 commented 6 years ago

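I trained on Windows. When I tested after training, the sample sentences were displayed as: 鐣鍗鍚渚

I then encoded that output as gbk, which gave [b'\xe7\x95', b'\xe5\x8d', b'\xe5\x90', b'\xe4\xbe']

Finally I tested in a Linux environment and got the same output: 鐣鍗鍚渚

What are the author's training and test environments? (Surely it can't be that I shouldn't train on Windows...)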
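A minimal sketch of what this symptom is consistent with: UTF-8 bytes being decoded as GBK. The sample string `"你好"` is an assumption chosen for illustration and is not taken from the repo's corpus. The byte pair `b'\xe7\x95'` is the first one reported above.

```python
# Text as stored on disk: UTF-8 bytes.
original = "你好"
utf8_bytes = original.encode("utf-8")

# Decoding those bytes with the wrong codec (GBK is the usual default on
# Chinese-locale Windows) yields mojibake rather than an error.
garbled = utf8_bytes.decode("gbk")
print(garbled)

# The damage is reversible here because no bytes were lost.
recovered = garbled.encode("gbk").decode("utf-8")
print(recovered == original)

# The first byte pair from the report looks like a truncated UTF-8 sequence:
# 0xE7 is a 3-byte lead byte (1110xxxx), 0x95 a continuation byte (10xxxxxx).
first = b"\xe7\x95"
is_utf8_prefix = (first[0] >> 4 == 0b1110) and (first[1] >> 6 == 0b10)
print(is_utf8_prefix)
```

If that is what happened, the garbling was baked in when the data was read, which would explain why testing on Linux afterwards showed the same characters.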

fire717 commented 6 years ago

Figured it out: the open() call in extract_conv.py needs encoding='utf-8'. I don't know the author's environment; mine are win10+py3 and ubuntu+py3, and both work after the change.
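A hedged sketch of the kind of change described, not the actual code in extract_conv.py. The file name `sample_corpus.txt` and its contents are hypothetical stand-ins so the example runs on its own.

```python
import os
import tempfile

# Hypothetical sample file standing in for the corpus read by extract_conv.py.
path = os.path.join(tempfile.mkdtemp(), "sample_corpus.txt")

# Write the file as UTF-8, the way the repo's data files are said to be stored.
with open(path, "w", encoding="utf-8") as f:
    f.write("你好\n")

# Passing encoding explicitly makes the result independent of the OS default.
with open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
```

Without the `encoding` argument, `open()` in text mode falls back to a platform-dependent default, which is why the same script can behave differently on Windows and Linux.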

qhduan commented 6 years ago

That's because the default encoding on Windows is not utf-8, while the other files all are.

So Windows has some problems by default.
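To check which default applies on a given machine, something like the following should work. This is a sketch assuming Python 3.7 or later, where `sys.flags.utf8_mode` is available.

```python
import locale
import sys

# The encoding that text-mode open() typically falls back to when no
# encoding argument is given. Often "cp936" (GBK) on Chinese-locale Windows
# and "UTF-8" on most Linux systems.
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)

# 1 if Python was started in UTF-8 mode (-X utf8 or PYTHONUTF8=1), else 0.
print("UTF-8 mode:", sys.flags.utf8_mode)
```

Running Python with `-X utf8` or setting `PYTHONUTF8=1` is another way to get UTF-8 as the default, though passing `encoding='utf-8'` explicitly is the more portable fix.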

fire717 commented 6 years ago

OK, so maybe downloading to Windows first and then transferring to ubuntu doesn't work either.

yaleimeng commented 6 years ago

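You may have opened and edited the file on Windows; saving it again can change its encoding. I also downloaded to Windows, transferred to ubuntu, and extracted the archive there, and the demo runs fine.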