Hi Contributors,
I am Chinese and I love beancount. I would also like to use smart_import to improve the experience of importing bank statements. I tried the tool, but written Chinese has no spaces or other breaks between words, so the SVM currently cannot analyze Chinese text. Here is an example of a Chinese sentence: "我和小明一起吃晚饭。" ("Xiao Ming and I had dinner together.")
We have to rely on a tokenizer to split Chinese text into words, so I used jieba, one of the most popular Chinese tokenizers, to add this support.
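To illustrate the problem, here is a minimal sketch. It is not the code from my change, and it does not show how the tokenizer is wired into smart_import. The function names `whitespace_tokenize` and `tokenize` are made up for this example. It assumes jieba is installed and uses `jieba.lcut`; if jieba is missing, it falls back to a crude split into single characters so that the snippet still runs.

```python
import re

try:
    import jieba  # third-party Chinese word segmenter; optional in this sketch
except ImportError:
    jieba = None


def whitespace_tokenize(text):
    """Split on whitespace, which is enough for space-delimited languages."""
    return text.split()


def tokenize(text):
    """Hypothetical helper: segment text into tokens.

    Uses jieba when it is available. Otherwise falls back to treating each
    CJK character as its own token and keeping ASCII word runs together,
    which is only a rough stand-in for real word segmentation.
    """
    if jieba is not None:
        # jieba.lcut returns a list of segments; drop whitespace-only ones
        return [tok for tok in jieba.lcut(text) if tok.strip()]
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z0-9_]+", text)


sentence = "我和小明一起吃晚饭。"

# Without word boundaries, the whole sentence comes back as a single token.
print(whitespace_tokenize(sentence))  # ['我和小明一起吃晚饭。']

# With a segmenter, the sentence is split into several tokens.
print(tokenize(sentence))
```

The point is only that a whitespace-based split gives the classifier one opaque token per sentence, while a segmenter gives it several tokens that can be used as features.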
I am confident that my code can make smart_import smarter. I have written and tested some code, although the implementation is still rough.
If you have any suggestions or feedback, you are very welcome to write them down here.
Thanks.