ilyachch / md_docs-trans-app

Application for translation documentation in MD format
MIT License
50 stars 14 forks

error:encode and decode #46

Closed Grozta closed 1 year ago

Grozta commented 1 year ago

`nnUNet :: Branch_v2.2  MdTranslate ❯ md-translate E:/Laboratory/FrameWork/nnUNet/documentation -F en -T zh -P deepl -X 12 -N -D

Error processing file: E:\Laboratory\FrameWork\nnUNet\documentation\dataset_format_inference.md
'gbk' codec can't decode byte 0x80 in position 584: illegal multibyte sequence
Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\application.py", line 106, in process_file
    document = MarkdownDocument.from_file(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\document\document.py", line 102, in from_file
    file_content = target_file.read_text()
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 584: illegal multibyte sequence

Error processing file: E:\Laboratory\FrameWork\nnUNet\documentation\dataset_format.md
'gbk' codec can't decode byte 0x9c in position 1701: illegal multibyte sequence
Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\application.py", line 106, in process_file
    document = MarkdownDocument.from_file(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\document\document.py", line 102, in from_file
    file_content = target_file.read_text()
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0x9c in position 1701: illegal multibyte sequence

Error processing file: E:\Laboratory\FrameWork\nnUNet\documentation\how_to_use_nnunet.md
'gbk' codec can't decode byte 0x80 in position 6455: illegal multibyte sequence
Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\application.py", line 106, in process_file
    document = MarkdownDocument.from_file(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\document\document.py", line 102, in from_file
    file_content = target_file.read_text()
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 6455: illegal multibyte sequence

Error processing file: E:\Laboratory\FrameWork\nnUNet\documentation\setting_up_paths.md
'gbk' codec can't decode byte 0x80 in position 644: illegal multibyte sequence
Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\application.py", line 106, in process_file
    document = MarkdownDocument.from_file(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\site-packages\md_translate\document\document.py", line 102, in from_file
    file_content = target_file.read_text()
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\ProgramData\Anaconda3\envs\MdTranslate\Lib\pathlib.py", line 1059, in read_text
    return f.read()
           ^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 644: illegal multibyte sequence`

ilyachch commented 1 year ago

As the error message says, the document contains a byte that cannot be decoded. Since you are on Windows, the file is being read with the system locale encoding (`gbk`, according to the traceback) rather than UTF-8, so the file's encoding and the one Python uses may not match. Can you try saving the document as UTF-8 (in VS Code, for example: Ctrl + Shift + P -> Change File Encoding -> Save with Encoding) and try again? It would also be great if you could share the document you are trying to translate, or even a part of it.
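The traceback shows `target_file.read_text()` being called without an `encoding` argument, in which case Python falls back to the locale encoding (here `gbk`) unless UTF-8 mode is enabled. A minimal sketch of the mismatch, assuming the Markdown files are actually UTF-8: `read_markdown` and `demo` are hypothetical names for illustration, not part of `md_translate`, and the sample line is only a guess at the kind of content (a directory-tree line) that could produce bytes such as `0x80` and `0x9c`.

```python
import tempfile
from pathlib import Path


def read_markdown(path: Path) -> str:
    """Hypothetical helper: read a file as UTF-8 regardless of the OS locale."""
    return path.read_text(encoding="utf-8")


def demo() -> tuple[str, bool]:
    # Box-drawing characters, as used in directory trees, are multi-byte in UTF-8.
    sample = "\u251c\u2500\u2500 imagesTr"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.md"
        path.write_bytes(sample.encode("utf-8"))

        # An explicit encoding gives the same result on every platform.
        text = read_markdown(path)

        # Forcing the 'gbk' codec imitates what read_text() does by default
        # on a Windows system whose locale encoding is gbk.
        try:
            path.read_text(encoding="gbk")
            gbk_failed = False
        except UnicodeDecodeError:
            gbk_failed = True
    return text, gbk_failed


if __name__ == "__main__":
    text, gbk_failed = demo()
    print(repr(text), "gbk decode failed:", gbk_failed)
```

If changing the package is not an option, enabling Python's UTF-8 mode for the run (setting the `PYTHONUTF8=1` environment variable before invoking `md-translate`) may work around the problem, assuming the files really are UTF-8.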

ilyachch commented 1 year ago

As there has been no response, I think I can close the issue.