troublemaker-r / Chinese_Coreference_Resolution

Chinese coreference resolution based on SpanBERT, implemented in PyTorch
95 stars, 20 forks

UnicodeEncodeError: 'charmap' codec can't encode characters in position 25-29: character maps to <undefined> #20

Open learner-crapy opened 1 year ago

learner-crapy commented 1 year ago

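Has anyone else run into this problem? The code runs fine on Ubuntu and on one Windows machine, and every file open call already specifies utf-8 as the encoding, but on another computer located abroad it raises this error. [image]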

mtang398 commented 1 year ago

How did you solve it in the end?

learner-crapy commented 1 year ago

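Hello, I added encoding="utf-8" to every line of code that calls open. In addition, some of the log statements used for logging output also raise this error; commenting them out made it go away.

I hope my solution helps you.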
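The fix described above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: the file names, the logger name "coref_demo", and the sample string are all made up for the example. It assumes the failing machine uses a legacy Windows code page such as cp1252 as its default encoding, which is the usual source of a 'charmap' error. It also shows one possible alternative to commenting out the log statements, namely giving the log handler an explicit encoding.

```python
import logging
import os
import sys
import tempfile

text = "中文指代消解"  # sample Chinese text (hypothetical data)

# Reproduce the failure mode: cp1252, a common default code page on
# Western-locale Windows, has no mapping for these characters.
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Fix 1: pass encoding="utf-8" to every open() call, so the result
# no longer depends on the machine's locale.
work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, "sample.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

# Fix 2 (an alternative to commenting out log statements): give the
# log file handler an explicit encoding as well.
log_path = os.path.join(work_dir, "run.log")  # hypothetical file name
logger = logging.getLogger("coref_demo")      # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path, encoding="utf-8")
logger.addHandler(handler)
logger.info(text)
logger.removeHandler(handler)
handler.close()

# For console output, unencodable characters can be escaped instead of
# raising. reconfigure() exists on the standard text streams in Python 3.7+.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(errors="backslashreplace")
```

Setting the environment variable PYTHONUTF8=1 (Python 3.7 and later) is another way to make UTF-8 the default without touching each call, although whether that suits this project is not discussed in the thread.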


mtang398 commented 1 year ago

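Thank you very much for your reply. So far I have only done the first step. I then changed "similarity = sum(scores[matching[:, 0], matching[:, 1]])" in coreference.py to "similarity = sum(scores[matching[i, 0], matching[i, 1]] for i in range(matching.shape[0]))" and it ran. I did not comment out the log statements you mentioned as the second step (I am not sure whether that will cause problems later).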
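The two expressions for similarity can be compared on toy data. The thread does not say why the original line failed in that environment, so the sketch below only checks that, for plain NumPy arrays, the indexed form and the per-row generator form add up the same entries. The values of scores and matching are invented for illustration; in coreference.py they come from the model and the assignment step.

```python
import numpy as np

# Hypothetical pairwise score matrix (rows and columns are clusters).
scores = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.3],
    [0.0, 0.4, 0.7],
])

# Hypothetical assignment: each row is a (row_index, col_index) pair.
matching = np.array([[0, 0], [1, 2], [2, 1]])

# Original form: integer-array indexing picks scores[0, 0], scores[1, 2]
# and scores[2, 1] in one step, then sums them.
vectorized = sum(scores[matching[:, 0], matching[:, 1]])

# Replacement form from the comment above: look up one pair at a time.
elementwise = sum(
    scores[matching[i, 0], matching[i, 1]]
    for i in range(matching.shape[0])
)

# Both add 0.9 + 0.3 + 0.4.
same = bool(np.isclose(vectorized, elementwise))
```

Because the two forms agree on NumPy input, the rewrite should not change the computed similarity; it only avoids the array-indexing step.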