Closed by Hopkins1 8 years ago
Thanks for the improvements. As I have little time for the project, I will use your modifications for the update.
OK - sounds good. I tested the code on Python 2.7 and 3.5 and everything seemed OK.
If there is going to be a version update, I also see that the TWPhrasesIT.txt file in OpenCC has been updated. This could be a good chance to update TWPhrasesIT.txt and TWPhrases.txt in this project.
When running an "s2twp" conversion, the results from opencc-python do not always match those from OpenCC. For example:
OpenCC: "一干 " -> "一干 "
opencc-python: "一干 " -> "一幹 "
Note: It appears that the opencc-python conversion chain does not honor the "group" tag in the configuration file.
The chain is [TWVariantsRevPhrases.txt, TWVariantsRev.txt, TWPhrasesRev.txt, TSPhrases.txt, TSCharacters.txt]
The chain should be [[TWVariantsRevPhrases.txt, TWVariantsRev.txt], TWPhrasesRev.txt, [TSPhrases.txt, TSCharacters.txt]]
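To illustrate why the grouping matters, here is a minimal sketch of a flat chain versus a grouped chain. It is not the code from opencc.py or OpenCC itself: the helper names (`match_prefix`, `convert_step`, `run_chain`) and the toy ASCII dictionaries are made up for illustration, and it assumes that a group tries its dictionaries in order at each position within a single pass over the text.

```python
def match_prefix(text, start, dictionary):
    """Return the longest key of `dictionary` that text[start:] begins with, or None."""
    longest = max(map(len, dictionary)) if dictionary else 0
    for length in range(min(longest, len(text) - start), 0, -1):
        candidate = text[start:start + length]
        if candidate in dictionary:
            return candidate
    return None


def convert_step(text, group):
    """Apply one chain step: a single left-to-right pass using a group of dicts."""
    out = []
    i = 0
    while i < len(text):
        for dictionary in group:
            key = match_prefix(text, i, dictionary)
            if key is not None:
                out.append(dictionary[key])
                i += len(key)
                break
        else:
            # No dictionary in the group matched here; copy the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


def run_chain(text, chain):
    """Run every step in order; a bare dict is treated as a group of one."""
    for step in chain:
        group = step if isinstance(step, list) else [step]
        text = convert_step(text, group)
    return text


# Toy data (hypothetical, not real OpenCC dictionaries).
phrases = {"ab": "XY"}
chars = {"X": "Z", "a": "A"}

flat = run_chain("ab a", [phrases, chars])       # chars is re-applied to the phrase output
grouped = run_chain("ab a", [[phrases, chars]])  # phrase output is left alone
print(flat)     # ZY A
print(grouped)  # XY A
```

In the flat chain the character dictionary runs as a second pass over text the phrase dictionary has already converted, so the phrase result gets rewritten again. In the grouped chain both dictionaries share one pass, so a phrase match consumes its input and its output is not touched by the character dictionary.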
I've made changes to example.py and opencc.py that appear to fix the problem. The new implementation is also ~6x faster. Because the changes are large, I've decided to just attach the modified files rather than try creating a branch.
example.py.zip
opencc.py.zip