yihong0618 / bilingual_book_maker

Make bilingual epub books Using AI translate
MIT License

I saw that #237 seems to have been completed? So how do I get single-language output? #243

Open myforself opened 1 year ago

myforself commented 1 year ago

As the title says: sometimes I want single-language output. How do I do that, and which command should I use?

hleft commented 1 year ago

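Sometimes the README hasn't been updated. You can update to the latest version yourself and then check the help with -h. Here you need --single_translate (currently only epub is supported; I don't use txt, so I didn't change the txt path).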

myforself commented 1 year ago

What about books that have already been processed?

myforself commented 1 year ago

I used:

git clone https://github.com/yihong0618/bilingual_book_maker.git
Cloning into 'bilingual_book_maker'...
remote: Enumerating objects: 692, done.
remote: Counting objects: 100% (61/61), done.
remote: Compressing objects: 100% (37/37), done.
Receiving objects: 100% (692/692), 1.13 MiB | 7.95 MiB/s, done.
Resolving deltas: 100% (404/404), done.

Then:

python make_book.py -h
usage: make_book.py [-h] [--book_name BOOK_NAME] [--book_from E-READER] [--device_path DEVICE_PATH]
                    [--openai_key OPENAI_KEY] [--caiyun_key CAIYUN_KEY] [--deepl_key DEEPL_KEY] [--test]
                    [--test_num TEST_NUM] [-m MODEL] [--language LANGUAGE] [--resume] [-p PROXY]
                    [--deployment_id DEPLOYMENT_ID] [--api_base API_BASE]
                    [--exclude_filelist EXCLUDE_FILELIST] [--only_filelist ONLY_FILELIST]
                    [--translate-tags TRANSLATE_TAGS] [--exclude_translate-tags EXCLUDE_TRANSLATE_TAGS]
                    [--allow_navigable_strings] [--prompt PROMPT_ARG] [--accumulated_num ACCUMULATED_NUM]
                    [--translation_style TRANSLATION_STYLE] [--batch_size BATCH_SIZE]
                    [--retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE]

options:
  -h, --help
      show this help message and exit
  --book_name BOOK_NAME
      path of the epub file to be translated
  --book_from E-READER
      e-reader type, available: {kobo}
  --device_path DEVICE_PATH
      Path of e-reader device
  --openai_key OPENAI_KEY
      OpenAI api key,if you have more than one key, please use comma to split them to go beyond the rate limits
  --caiyun_key CAIYUN_KEY
      you can apply caiyun key from here (https://dashboard.caiyunapp.com/user/sign_in/)
  --deepl_key DEEPL_KEY
      you can apply deepl key from here (https://rapidapi.com/splintPRO/api/deepl-translator
  --test
      only the first 10 paragraphs will be translated, for testing
  --test_num TEST_NUM
      how many paragraphs will be translated for testing
  -m MODEL, --model MODEL
      model to use, available: {chatgptapi, gpt3, google, caiyun, deepl}
  --language LANGUAGE
      language to translate to, available: {af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh, zh-hans, zh-hant, zh-yue, Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Cantonese, Castilian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, Flemish, French, Galician, Georgian, German, Greek, Gujarati, Haitian, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Letzeburgesch, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Moldovan, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Panjabi, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Simplified Chinese, Sindhi, Sinhala, Sinhalese, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Traditional Chinese, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Valencian, Vietnamese, Welsh, Yiddish, Yoruba}
  --resume
      if program stop unexpected you can use this to resume
  -p PROXY, --proxy PROXY
      use proxy like http://127.0.0.1:7890
  --deployment_id DEPLOYMENT_ID
      the deployment name you chose when you deployed the model
  --api_base API_BASE
      specify base url other than the OpenAI's official API address
  --exclude_filelist EXCLUDE_FILELIST
      if you have more than one file to exclude, please use comma to split them, example: --exclude_filelist 'nav.xhtml,cover.xhtml'
  --only_filelist ONLY_FILELIST
      if you only have a few files with translations, please use comma to split them, example: --only_filelist 'nav.xhtml,cover.xhtml'
  --translate-tags TRANSLATE_TAGS
      example --translate-tags p,blockquote
  --exclude_translate-tags EXCLUDE_TRANSLATE_TAGS
      example --exclude_translate-tags table,sup
  --allow_navigable_strings
      allow NavigableStrings to be translated
  --prompt PROMPT_ARG
      used for customizing the prompt. It can be the prompt template string, or a path to the template file. The valid placeholders are {text} and {language}.
  --accumulated_num ACCUMULATED_NUM
      Wait for how many tokens have been accumulated before starting the translation. gpt3.5 limits the total_token to 4090. For example, if you use --accumulated_num 1600, maybe openai will output 2200 tokens and maybe 200 tokens for other messages in the system messages user messages, 1600+2200+200=4000, So you are close to reaching the limit. You have to choose your own value, there is no way to know if the limit is reached before sending
  --translation_style TRANSLATION_STYLE
      ex: --translation_style "color: #808080; font-style: italic;"
  --batch_size BATCH_SIZE
      how many lines will be translated by aggregated translation(This options currently only applies to txt files)
  --retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE
      --retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)
      Retranslate from start_str to end_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'
      Retranslate start_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'

There's no --single_translate...

hleft commented 1 year ago

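@myforself When did you run git clone? It looks like you aren't on the latest version. The last line of -h should be --single_translate. (Books that were already processed cannot currently be converted to single-language output.)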
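The reason -h is a useful check: with Python's argparse, every option registered on the parser is listed in the generated help text, so a flag missing from -h is unknown to the script actually being run. The sketch below illustrates this with a hypothetical stand-in parser. It is not the project's parser code; only the option names come from this thread, and treating --single_translate as a boolean switch is an assumption.

```python
import argparse


def build_parser(with_single_translate: bool) -> argparse.ArgumentParser:
    """Build a tiny stand-in parser; NOT the real make_book.py parser."""
    parser = argparse.ArgumentParser(prog="make_book.py")
    parser.add_argument("--book_name", help="path of the epub file to be translated")
    if with_single_translate:
        # Assumption: a boolean switch. The thread does not show its definition.
        parser.add_argument(
            "--single_translate",
            action="store_true",
            help="hypothetical help text, for illustration only",
        )
    return parser


def help_lists_flag(parser: argparse.ArgumentParser, flag: str) -> bool:
    """Return True if `flag` appears in the parser's generated help text."""
    return flag in parser.format_help()


if __name__ == "__main__":
    old = build_parser(with_single_translate=False)
    new = build_parser(with_single_translate=True)
    print("old parser lists flag:", help_lists_flag(old, "--single_translate"))
    print("new parser lists flag:", help_lists_flag(new, "--single_translate"))
```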

myforself commented 1 year ago

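Just today. I saw you say "Sometimes the README hasn't been updated. You can update to the latest version yourself and then check the help with -h. Here you need --single_translate (currently only epub is supported; I don't use txt, so I didn't change the txt path)" and went and updated right away.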

hleft commented 1 year ago

@myforself Run git pull in the directory you created with git clone, and post the output.

myforself commented 1 year ago

git pull

PS D:\bilingual_book_maker-main> git pull
fatal: not a git repository (or any of the parent directories): .git

hleft commented 1 year ago

@myforself Run git pull inside the bilingual_book_maker directory.
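The "not a git repository" error above arises because git looks for a .git entry in the current directory and then in each parent directory. D:\bilingual_book_maker-main has none, while the cloned bilingual_book_maker subdirectory does. A rough Python sketch of that lookup follows; it is a simplification of what git does (it ignores environment overrides and filesystem boundaries), and `find_git_root` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Walk from `start` up through its parents and return the first
    directory that contains a `.git` entry, or None if there is none."""
    start = start.resolve()
    for directory in (start, *start.parents):
        if (directory / ".git").exists():
            return directory
    return None


if __name__ == "__main__":
    root = find_git_root(Path.cwd())
    print(root if root else "not inside a git working tree")
```

Running this from each candidate directory shows which one is the actual clone, and therefore where git pull has any effect.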

myforself commented 1 year ago


PS D:\bilingual_book_maker-main\bilingual_book_maker> git pull
Already up to date.

hleft commented 1 year ago

Strange. Run git log in this directory and post the output.

myforself commented 1 year ago

PS D:\bilingual_book_maker-main\bilingual_book_maker> git log
commit a0c999a2e67f4a9c269f4e2dc19f18e567bd9b1e (HEAD -> main, origin/main, origin/HEAD)
Author: hleft <89069008+hleft@users.noreply.github.com>
Date:   Sun Apr 9 21:20:41 2023 +0800

    support single_translate in epub (#237)

    * support no_bilingual

    * use single_translate instead no_bilingual

commit 237ce5280ad4787040d5c05a865c5c7edf330656
Author: yihong <zouzou0208@gmail.com>
Date:   Sat Apr 8 17:45:14 2023 +0800

    fix: #239 #238 (#240)

commit 1b64fe031585d3c468f7a370298957ce4e05202e
Author: yihong <zouzou0208@gmail.com>
Date:   Mon Apr 3 20:02:26 2023 +0800

    fix: drop chatgpt account (#234)

commit 8cf4124986962153ebca99e92b00dd2eb211e1bd
Author: yihong <zouzou0208@gmail.com>
Date:   Sun Apr 2 22:00:01 2023 +0800
:

hleft commented 1 year ago

That is the latest version. By rights, the last line of -h should now be --single_translate.

myforself commented 1 year ago

That's exactly why I came here to ask :(

hleft commented 1 year ago

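Because it really is strange. I can only suspect that, for example, you ran git clone into a new directory but are still running the old version from the old directory. Or perhaps you hadn't updated to the latest version when you ran -h earlier; now that you are up to date, run -h again and see whether the flag is there. This is probably a git usage problem; the software itself is working.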
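One way to test that suspicion is to check which copy of the code actually mentions the flag. The sketch below scans the Python files under a directory for a given string and prints the matching paths, which reveals whether the match sits in the outer folder or only in the nested clone. `files_mentioning` is a hypothetical helper, the directory in the usage section is the one seen in this thread, and it is an assumption that the flag name appears literally in the source.

```python
from pathlib import Path
from typing import List


def files_mentioning(root: Path, needle: str) -> List[Path]:
    """Return the .py files under `root` (recursively) whose text contains `needle`."""
    hits: List[Path] = []
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it
        if needle in text:
            hits.append(path)
    return hits


if __name__ == "__main__":
    # Directory taken from the thread; adjust to your own layout.
    folder = Path(r"D:\bilingual_book_maker-main")
    matches = files_mentioning(folder, "single_translate")
    if matches:
        for match in matches:
            print(match)
    else:
        print("no file under", folder, "mentions single_translate")
```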

myforself commented 1 year ago

PS D:\bilingual_book_maker-main>

usage: make_book.py [-h] [--book_name BOOK_NAME] [--book_from E-READER] [--device_path DEVICE_PATH]
                    [--openai_key OPENAI_KEY] [--caiyun_key CAIYUN_KEY] [--deepl_key DEEPL_KEY] [--test]
                    [--test_num TEST_NUM] [-m MODEL] [--language LANGUAGE] [--resume] [-p PROXY]
                    [--deployment_id DEPLOYMENT_ID] [--api_base API_BASE]
                    [--exclude_filelist EXCLUDE_FILELIST] [--only_filelist ONLY_FILELIST]
                    [--translate-tags TRANSLATE_TAGS] [--exclude_translate-tags EXCLUDE_TRANSLATE_TAGS]
                    [--allow_navigable_strings] [--prompt PROMPT_ARG] [--accumulated_num ACCUMULATED_NUM]
                    [--translation_style TRANSLATION_STYLE] [--batch_size BATCH_SIZE]
                    [--retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE]

options:
  -h, --help
      show this help message and exit
  --book_name BOOK_NAME
      path of the epub file to be translated
  --book_from E-READER
      e-reader type, available: {kobo}
  --device_path DEVICE_PATH
      Path of e-reader device
  --openai_key OPENAI_KEY
      OpenAI api key,if you have more than one key, please use comma to split them to go beyond the rate limits
  --caiyun_key CAIYUN_KEY
      you can apply caiyun key from here (https://dashboard.caiyunapp.com/user/sign_in/)
  --deepl_key DEEPL_KEY
      you can apply deepl key from here (https://rapidapi.com/splintPRO/api/deepl-translator
  --test
      only the first 10 paragraphs will be translated, for testing
  --test_num TEST_NUM
      how many paragraphs will be translated for testing
  -m MODEL, --model MODEL
      model to use, available: {chatgptapi, gpt3, google, caiyun, deepl}
  --language LANGUAGE
      language to translate to, available: {af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh, zh-hans, zh-hant, zh-yue, Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Cantonese, Castilian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, Flemish, French, Galician, Georgian, German, Greek, Gujarati, Haitian, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Letzeburgesch, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Moldovan, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Panjabi, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Simplified Chinese, Sindhi, Sinhala, Sinhalese, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Traditional Chinese, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Valencian, Vietnamese, Welsh, Yiddish, Yoruba}
  --resume
      if program stop unexpected you can use this to resume
  -p PROXY, --proxy PROXY
      use proxy like http://127.0.0.1:7890
  --deployment_id DEPLOYMENT_ID
      the deployment name you chose when you deployed the model
  --exclude_filelist EXCLUDE_FILELIST
      if you have more than one file to exclude, please use comma to split them, example: --exclude_filelist 'nav.xhtml,cover.xhtml'
  --only_filelist ONLY_FILELIST
      if you only have a few files with translations, please use comma to split them, example: --only_filelist 'nav.xhtml,cover.xhtml'
  --translate-tags TRANSLATE_TAGS
      example --translate-tags p,blockquote
  --exclude_translate-tags EXCLUDE_TRANSLATE_TAGS
      example --exclude_translate-tags table,sup
  --allow_navigable_strings
      allow NavigableStrings to be translated
  --prompt PROMPT_ARG
      used for customizing the prompt. It can be the prompt template string, or a path to the template file. The valid placeholders are {text} and {language}.
  --accumulated_num ACCUMULATED_NUM
      Wait for how many tokens have been accumulated before starting the translation. gpt3.5 limits the total_token to 4090. For example, if you use --accumulated_num 1600, maybe openai will output 2200 tokens and maybe 200 tokens for other messages in the system messages user messages, 1600+2200+200=4000, So you are close to reaching the limit. You have to choose your own value, there is no way to know if the limit is reached before sending
  --translation_style TRANSLATION_STYLE
      ex: --translation_style "color: #808080; font-style: italic;"
  --batch_size BATCH_SIZE
      how many lines will be translated by aggregated translation(This options currently only applies to txt files)
  --retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE
      --retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)
      Retranslate from start_str to end_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'
      Retranslate start_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'

PS D:\bilingual_book_maker-main> python make_book.py -h
usage: make_book.py [-h] [--book_name BOOK_NAME] [--book_from E-READER] [--device_path DEVICE_PATH]
                    [--openai_key OPENAI_KEY] [--caiyun_key CAIYUN_KEY] [--deepl_key DEEPL_KEY] [--test]
                    [--test_num TEST_NUM] [-m MODEL] [--language LANGUAGE] [--resume] [-p PROXY]
                    [--deployment_id DEPLOYMENT_ID] [--api_base API_BASE]
                    [--exclude_filelist EXCLUDE_FILELIST] [--only_filelist ONLY_FILELIST]
                    [--translate-tags TRANSLATE_TAGS] [--exclude_translate-tags EXCLUDE_TRANSLATE_TAGS]
                    [--allow_navigable_strings] [--prompt PROMPT_ARG] [--accumulated_num ACCUMULATED_NUM]
                    [--translation_style TRANSLATION_STYLE] [--batch_size BATCH_SIZE]
                    [--retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE]

options:
  -h, --help
      show this help message and exit
  --book_name BOOK_NAME
      path of the epub file to be translated
  --book_from E-READER
      e-reader type, available: {kobo}
  --device_path DEVICE_PATH
      Path of e-reader device
  --openai_key OPENAI_KEY
      OpenAI api key,if you have more than one key, please use comma to split them to go beyond the rate limits
  --caiyun_key CAIYUN_KEY
      you can apply caiyun key from here (https://dashboard.caiyunapp.com/user/sign_in/)
  --deepl_key DEEPL_KEY
      you can apply deepl key from here (https://rapidapi.com/splintPRO/api/deepl-translator
  --test
      only the first 10 paragraphs will be translated, for testing
  --test_num TEST_NUM
      how many paragraphs will be translated for testing
  -m MODEL, --model MODEL
      model to use, available: {chatgptapi, gpt3, google, caiyun, deepl}
  --language LANGUAGE
      language to translate to, available: {af, am, ar, as, az, ba, be, bg, bn, bo, br, bs, ca, cs, cy, da, de, el, en, es, et, eu, fa, fi, fo, fr, gl, gu, ha, haw, he, hi, hr, ht, hu, hy, id, is, it, ja, jw, ka, kk, km, kn, ko, la, lb, ln, lo, lt, lv, mg, mi, mk, ml, mn, mr, ms, mt, my, ne, nl, nn, no, oc, pa, pl, ps, pt, ro, ru, sa, sd, si, sk, sl, sn, so, sq, sr, su, sv, sw, ta, te, tg, th, tk, tl, tr, tt, uk, ur, uz, vi, yi, yo, zh, zh-hans, zh-hant, zh-yue, Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Burmese, Cantonese, Castilian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, Flemish, French, Galician, Georgian, German, Greek, Gujarati, Haitian, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Letzeburgesch, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Moldavian, Moldovan, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Panjabi, Pashto, Persian, Polish, Portuguese, Punjabi, Pushto, Romanian, Russian, Sanskrit, Serbian, Shona, Simplified Chinese, Sindhi, Sinhala, Sinhalese, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Traditional Chinese, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Valencian, Vietnamese, Welsh, Yiddish, Yoruba}
  --resume
      if program stop unexpected you can use this to resume
  -p PROXY, --proxy PROXY
      use proxy like http://127.0.0.1:7890
  --deployment_id DEPLOYMENT_ID
      the deployment name you chose when you deployed the model
  --api_base API_BASE
      specify base url other than the OpenAI's official API address
  --exclude_filelist EXCLUDE_FILELIST
      if you have more than one file to exclude, please use comma to split them, example: --exclude_filelist 'nav.xhtml,cover.xhtml'
  --only_filelist ONLY_FILELIST
      if you only have a few files with translations, please use comma to split them, example: --only_filelist 'nav.xhtml,cover.xhtml'
  --translate-tags TRANSLATE_TAGS
      example --translate-tags p,blockquote
  --exclude_translate-tags EXCLUDE_TRANSLATE_TAGS
      example --exclude_translate-tags table,sup
  --allow_navigable_strings
      allow NavigableStrings to be translated
  --prompt PROMPT_ARG
      used for customizing the prompt. It can be the prompt template string, or a path to the template file. The valid placeholders are {text} and {language}.
  --accumulated_num ACCUMULATED_NUM
      Wait for how many tokens have been accumulated before starting the translation. gpt3.5 limits the total_token to 4090. For example, if you use --accumulated_num 1600, maybe openai will output 2200 tokens and maybe 200 tokens for other messages in the system messages user messages, 1600+2200+200=4000, So you are close to reaching the limit. You have to choose your own value, there is no way to know if the limit is reached before sending
  --translation_style TRANSLATION_STYLE
      ex: --translation_style "color: #808080; font-style: italic;"
  --batch_size BATCH_SIZE
      how many lines will be translated by aggregated translation(This options currently only applies to txt files)
  --retranslate RETRANSLATE RETRANSLATE RETRANSLATE RETRANSLATE
      --retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)
      Retranslate from start_str to end_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'
      Retranslate start_str's tag:
      python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'

It's still not there... I definitely updated, and I'm sure I have been in this directory the whole time.

hleft commented 1 year ago

Or try this: https://codeload.github.com/yihong0618/bilingual_book_maker/zip/refs/heads/main Paste it into your browser, download the zip, extract it, and try -h. If the flag is still missing, I'm out of ideas.

samuelL912 commented 1 year ago

Sorry to ask: after updating on 2023-05-20, has the single-language output option "--single translate" been removed? I ran -h and the help no longer lists "--single translate".

This is the error message I got when I tried to run "--single translate":

usage: make_book.py [-h] [--book_name BOOK_NAME] [--book_from E-READER] [--device_path DEVICE_PATH]
                    [--openai_key OPENAI_KEY] [--caiyun_key CAIYUN_KEY] [--deepl_key DEEPL_KEY] [--test]
                    [--test_num TEST_NUM] [-m MODEL] [--language LANGUAGE] [--resume] [-p PROXY]
                    [--api_base API_BASE] [--translate-tags TRANSLATE_TAGS] [--allow_navigable_strings]
                    [--prompt PROMPT_ARG] [--accumulated_num ACCUMULATED_NUM] [--batch_size BATCH_SIZE]
make_book.py: error: unrecognized arguments: --single language
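For context on that message: argparse reports "unrecognized arguments" for every token that matches no registered option. The sketch below uses hypothetical stand-in parsers, not the project's code, and assumes --single_translate is a boolean switch. It suggests that `--single language` is left over in full only when the parser has no option starting with `--single`; if `--single_translate` were registered, argparse's default prefix matching would accept `--single` as its abbreviation and only `language` would remain.

```python
import argparse
from typing import List, Tuple


def leftover_args(
    has_single_translate: bool, argv: List[str]
) -> Tuple[argparse.Namespace, List[str]]:
    """Parse `argv` with a stand-in parser and return (namespace, unrecognized tokens)."""
    parser = argparse.ArgumentParser(prog="make_book.py")
    parser.add_argument("--book_name")
    if has_single_translate:
        # Assumption: a boolean switch, as in the illustration only.
        parser.add_argument("--single_translate", action="store_true")
    # parse_known_args returns the leftovers instead of exiting with an error.
    return parser.parse_known_args(argv)


if __name__ == "__main__":
    for has_flag in (False, True):
        namespace, extras = leftover_args(has_flag, ["--single", "language"])
        print("flag registered:", has_flag, "| unrecognized:", extras)
```

Under these assumptions, the usage line and error shown above are consistent with a copy of the script that does not define the flag at all. The flag name used earlier in this thread is `--single_translate`, with an underscore.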

If the option really has been removed, is there any way to get it back, for example by reverting to or downloading an older version that still has "--single translate"?

Thanks to anyone willing to share a solution or an opinion!