tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

UnicodeEncodeError: 'charmap' codec can't encode character #402

Open ranjita-naik opened 5 years ago

ranjita-naik commented 5 years ago

Traceback (most recent call last):
  File "C:\Anaconda3\envs\tensorflow\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Anaconda3\envs\tensorflow\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\nmt\nmt\nmt.py", line 703, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "C:\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "D:\nmt\nmt\nmt.py", line 696, in main
    run_main(FLAGS, default_hparams, train_fn, inference_fn)
  File "D:\nmt\nmt\nmt.py", line 689, in run_main
    train_fn(hparams, target_session=target_session)
  File "D:\nmt\nmt\train.py", line 512, in train
    sample_tgt_data, avg_ckpts)
  File "D:\nmt\nmt\train.py", line 338, in run_full_eval
    sample_src_data, sample_tgt_data)
  File "D:\nmt\nmt\train.py", line 53, in run_sample_decode
    infer_model.batch_size_placeholder, summary_writer)
  File "D:\nmt\nmt\train.py", line 698, in _sample_decode
    utils.print_out(" src: %s" % src_data[decode_id])
  File "D:\nmt\nmt\utils\misc_utils.py", line 69, in print_out
    print(out_s, end="", file=sys.stdout)
  File "C:\Anaconda3\envs\tensorflow\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0111' in position 13: character maps to <undefined>

I'm running this on a Windows server.

ranjita-naik commented 5 years ago

I see that in misc_utils.py, line 66, 's' is UTF-8 encoded. However, the next statement, if not isinstance(out_s, str):, decodes it again, and as a result the subsequent print statement throws the UnicodeEncodeError.
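A minimal sketch of the failure described above, assuming the console stream uses cp1252 (as the cp1252.py frame in the traceback suggests). The in-memory `console` object is a stand-in for the Windows console and is not part of nmt; the sample text is made up for illustration.

```python
import io

# Stand-in for a Windows console whose encoding is cp1252 (an assumption
# based on the traceback; the real stream would be sys.stdout).
console = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")

text = "src: \u0111"  # U+0111 has no mapping in cp1252

# Encoding to UTF-8 and decoding again just yields the same str ...
out_s = text.encode("utf-8")
if not isinstance(out_s, str):
    out_s = out_s.decode("utf-8")
assert out_s == text

# ... so print() still has to encode it with the console's own codec.
try:
    print(out_s, end="", file=console)
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(type(error).__name__)
```

Under this assumption, the UTF-8 round trip has no effect on what `print` receives; the exception comes from the stream's codec, not from the UTF-8 step.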

Workaround: adding encode("utf-8") to the print statement fixed the issue for me:
print(out_s.encode("utf-8"), end="", file=sys.stdout)
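One caveat with this workaround: under Python 3, `print` on a bytes object shows its `b'...'` representation rather than the decoded text, so the exception goes away but the output changes. A possible alternative, sketched below, is to write the UTF-8 bytes to the stream's underlying binary buffer. `print_out_utf8` is a hypothetical helper, not a function in nmt, and the in-memory `console` again stands in for a cp1252 console.

```python
import io
import sys


def print_out_utf8(s, stream=None):
    """Hypothetical helper (not part of nmt): emit s as UTF-8 bytes."""
    if isinstance(s, bytes):
        s = s.decode("utf-8")
    stream = sys.stdout if stream is None else stream
    binary = getattr(stream, "buffer", None)
    if binary is None:
        # No binary layer (e.g. io.StringIO): fall back to a plain text write.
        stream.write(s)
        return
    stream.flush()                   # keep ordering with earlier text writes
    binary.write(s.encode("utf-8"))  # bypass the stream's own codec
    binary.flush()


# Stand-in for a cp1252 console; a real run would use sys.stdout.
raw = io.BytesIO()
console = io.TextIOWrapper(raw, encoding="cp1252")
print_out_utf8("src: \u0111", stream=console)
written = raw.getvalue()
```

This avoids the encode error because the console codec is never invoked. A console that is not set to UTF-8 may still display the bytes incorrectly, although output redirected to a file should be valid UTF-8.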

ranjita-naik commented 5 years ago

By the way, I'm using Python 3.5.