javinizer / Javinizer

(NSFW) Organize your local Japanese Adult Video (JAV) library
MIT License

Feature Request - Translation Preprocessing #304

Open counterProductive0 opened 3 years ago

counterProductive0 commented 3 years ago

A little preprocessing could improve translations of descriptions and titles. Search the .nfo files for ○ or ●, and build a CSV or dict of replacements. Usually the masked character (○ or ●) simply stands for 'n', which is ん in hiragana or ン in katakana. I could possibly do this. It looks like a Python replacement step could be called around line 16 of src/Javinizer/Private/Get-TranslatedString.ps1, before the translation Python scripts are invoked.

I adapted this from: https://stackoverflow.com/questions/6116978/how-to-replace-multiple-substrings-of-a-string/15448887

import re

# If the list is small, rep can be a dictionary; a larger one could be loaded from a CSV file.
rep = {"♪": "", "マ○コ": "マンコ", "チ○ポ": "チンポ", "ち○ぽ":"ちんぽ", "マ●コ": "マンコ", "チ●ポ": "チンポ", "ち●ぽ":"ちんぽ", "生ち○": "生ちん", "生ち●": "生ちん", "ロー●ス・ロイス": "ロールス・ロイス", "Y●uTube": "YouTube", "ビヨ●セ": "ビヨンセ", "ビヨ○セ": "ビヨンセ"}

# Build one alternation pattern from the escaped keys, then replace every match in a single pass.
# (The original Stack Overflow answer used Python 2's dict.iteritems(); Python 3 uses dict.items().
# This version escapes the keys inline, so it needs neither.)
pattern = re.compile("|".join(re.escape(k) for k in rep))
# text is the title or description string to clean before translation
text = pattern.sub(lambda m: rep[m.group(0)], text)
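
To make the "search for ○ or ●, then build a CSV" idea concrete, here is a minimal sketch under stated assumptions. The function names (`find_masked_tokens`, `load_replacements`, `apply_replacements`) and the two-column CSV layout (masked form, replacement) are hypothetical and are not part of Javinizer. The first function only lists candidate snippets for a human to review; it does not guess the hidden character.

```python
import csv
import io
import re
from collections import Counter

MASK_CHARS = "○●"  # the two masking characters mentioned in the issue


def find_masked_tokens(text, context=1):
    """Count snippets made of a mask character plus up to `context` chars on each side."""
    pattern = re.compile(r".{0,%d}[%s].{0,%d}" % (context, MASK_CHARS, context))
    return Counter(pattern.findall(text))


def load_replacements(csv_file):
    """Read 'masked,replacement' rows from an open CSV file into a dict (assumed layout)."""
    return {row[0]: row[1] for row in csv.reader(csv_file) if len(row) >= 2}


def apply_replacements(text, rep):
    """Replace every key of `rep` found in `text`, trying longer keys first."""
    if not rep:
        return text
    keys = sorted(rep, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: rep[m.group(0)], text)


# Usage: an in-memory CSV stands in for a real replacements file.
sample_csv = io.StringIO("Y●uTube,YouTube\nビヨ●セ,ビヨンセ\n")
replacements = load_replacements(sample_csv)
cleaned = apply_replacements("Y●uTubeでビヨ●セ", replacements)
candidates = find_masked_tokens("ビヨ●セとビヨ●セ")
```

Sorting the keys by length is a small design choice: when one key is a prefix of another, the longer and more specific entry is tried first.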

In addition:

1) MGStage descriptions include stray markup tags, e.g.: (br)(br)(b)(-b)(font color="red")

2) The MGStage title and original title need to be trimmed. They currently end with: :MGS動画<プレステージ グループ>アダルト動画配信サイト
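
The two MGStage cleanups could be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and the tag pattern assumes the parenthesized tags quoted above may also appear in ordinary angle-bracket form in the scraped data, so it accepts both. Only the tag names listed in the issue (br, b, font) are handled.

```python
import re

# Suffix copied verbatim from the issue text.
MGS_TITLE_SUFFIX = ":MGS動画<プレステージ グループ>アダルト動画配信サイト"

# Line-break tags such as (br) or <br/>, written with either bracket style.
_BR_TAG = re.compile(r"[<(]\s*br\s*/?\s*[>)]", re.IGNORECASE)
# Opening or closing b/font tags, e.g. (b), (-b), </b>, (font color="red").
_STYLE_TAG = re.compile(r"[<(]\s*[-/]?\s*(?:b|font)\b[^<>()]*[>)]", re.IGNORECASE)


def strip_mgstage_tags(description):
    """Turn line-break tags into newlines and drop bold/font tags."""
    text = _BR_TAG.sub("\n", description)
    text = _STYLE_TAG.sub("", text)
    text = re.sub(r"\n{2,}", "\n", text)  # collapse repeated line breaks
    return text.strip()


def trim_mgstage_title(title):
    """Remove the site-name suffix from the end of a title, if present."""
    title = title.strip()
    if title.endswith(MGS_TITLE_SUFFIX):
        title = title[: -len(MGS_TITLE_SUFFIX)]
    return title.rstrip()


# Usage with made-up sample strings.
clean_description = strip_mgstage_tags('Line one(br)(br)(b)Bold(-b)(font color="red")Red')
clean_title = trim_mgstage_title("Sample Title" + MGS_TITLE_SUFFIX)
```

Line breaks are kept as newlines rather than deleted so that sentence boundaries survive into the translation step; whether that helps a given translator is an assumption to check.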

jvlflame commented 3 years ago

Hey, this is a good suggestion! I'll take a look into it for the next release.

Tykimheng commented 3 years ago

Java