Open rotcx opened 1 year ago
If we cannot set such a do-not-translate vocabulary in the translators themselves (Google, Tencent, ...),
the only remedy is to replace the (wrongly) translated words with the original English words after translation.
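An implementation could be:
from functools import reduce
# Map each wrongly translated term back to its original English term.
replace_dict = {"法学硕士": "LLM", "变压器": "Transformer", "代币": "token"}
# text_final is assumed to already hold the translated text; apply each replacement in turn.
text_final = reduce(lambda text, kv: text.replace(*kv), replace_dict.items(), text_final)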
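One caveat with chained `str.replace` calls is that a later replacement can rewrite the output of an earlier one, and a short key can match inside a longer one. A minimal single-pass sketch follows, assuming the same dictionary; `restore_terms` is a hypothetical helper name for illustration, not part of the tool.

```python
import re

# Same mapping as above: wrongly translated term -> original English term.
replace_dict = {"法学硕士": "LLM", "变压器": "Transformer", "代币": "token"}


def restore_terms(text, mapping):
    """Replace every key of `mapping` found in `text` in a single pass."""
    if not mapping:
        return text
    # Longest keys first, so a longer key wins over a shorter key it contains.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is looked up once; replaced text is never scanned again.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


print(restore_terms("法学硕士使用变压器架构。", replace_dict))
```

Because the text is scanned only once, the order of the dictionary entries no longer affects the result.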
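Another (downstream) way is to post-process the translated main.tex file:
#!/bin/bash
# Map each wrongly translated term back to its original English term.
declare -A replace_dict=(["法学硕士"]="LLM" ["变压器"]="Transformer" ["代币"]="token")
# IFS= keeps leading whitespace; the || test keeps a final line that has no trailing newline.
while IFS= read -r line || [[ -n "$line" ]]; do
    for key in "${!replace_dict[@]}"; do
        # Quoting the key makes it match literally rather than as a glob pattern.
        line=${line//"$key"/${replace_dict[$key]}}
    done
    # A quoted printf avoids the word splitting and globbing of a bare `echo $line`.
    printf '%s\n' "$line"
done < main.tex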
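Iterate over all .tex files in directory dir and process each one (since in general we cannot know which .tex file is the main one):
#!/bin/bash
# Map each wrongly translated term back to its original English term.
declare -A replace_dict=(["法学硕士"]="LLM" ["变压器"]="Transformer" ["代币"]="token")
# -print0 with read -d '' handles file names that contain spaces or newlines.
find dir -name "*.tex" -print0 | while IFS= read -r -d '' file; do
    tmp=$(mktemp)
    while IFS= read -r line || [[ -n "$line" ]]; do
        for key in "${!replace_dict[@]}"; do
            line=${line//"$key"/${replace_dict[$key]}}
        done
        printf '%s\n' "$line"
    done < "$file" > "$tmp"
    # Write the corrected text back into the original file, keeping its permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
done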
Thank you for reporting this issue. Since this is a general translation tool rather than one that only targets CS or DL, we think it is better to leave the behavior as it is for now. We could consider a "user dictionary" feature that lets users manually define their own "popular vocabulary". The only thing the user would need to do is load a vocabulary list. This is similar to your solution here, but more systematic and user-friendly. @SUSYUSTC
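The "user dictionary" idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: the file format (one `translated<TAB>original` pair per line, with `#` starting a comment) and the helper names `parse_user_dict` and `apply_user_dict` are hypothetical and do not reflect the tool's actual code.

```python
import io


def parse_user_dict(lines):
    """Parse 'translated<TAB>original' pairs; skip blank lines and '#' comments."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        translated, sep, original = line.partition("\t")
        # Ignore malformed lines that lack a tab or either side of the pair.
        if sep and translated.strip() and original.strip():
            mapping[translated.strip()] = original.strip()
    return mapping


def apply_user_dict(text, mapping):
    """Restore the user's preferred terms in already-translated text."""
    for translated, original in mapping.items():
        text = text.replace(translated, original)
    return text


# In practice the lines would come from a user-supplied file,
# e.g. open(path, encoding="utf-8"); StringIO stands in for it here.
sample = io.StringIO("# my vocabulary\n法学硕士\tLLM\n变压器\tTransformer\n代币\ttoken\n")
user_dict = parse_user_dict(sample)
print(apply_user_dict("每个代币都输入变压器。", user_dict))
```

Keeping the vocabulary in a plain text file would let users of other fields maintain their own lists without touching the code.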
e.g., do not translate LLM to 法学硕士. Leave it as LLM.
e.g., do not translate Transformer to 变压器. Leave it as Transformer.