usptact closed 5 years ago
Hi @usptact,
I will add sample data files later.
Here is the data format. Each corpus file contains one sentence per line. For example, take a parallel corpus in English and Chinese. Let's use the three-letter names CHS for Chinese and ENU for English, so we could have corpus files such as train01.enu.snt, train01.chs.snt, train02.enu.snt and train02.chs.snt.
In train01.enu.snt, assume we have the two sentences below:
the children huddled together for warmth .
the car business is constantly changing .
Then train01.chs.snt has the corresponding translated sentences, in the same order:
孩子 们 挤 成 一 团 以 取暖 .
汽车 业 也 在 不断 地 变化 .
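As an illustration only (this is not code from the project), here is a minimal Python sketch of the layout described above: it writes the two example files into a temporary directory, reads them back, and checks that both sides have the same number of lines. The file names come from the example; the helper names `write_lines` and `read_parallel` are hypothetical, and UTF-8 encoding is an assumption.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write one sentence per line (UTF-8 is an assumption, not confirmed by the project)."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


def read_parallel(src_path, tgt_path):
    """Hypothetical helper: return (source, target) pairs, matched by line number."""
    with open(src_path, encoding="utf-8") as f:
        src = [line.rstrip("\n") for line in f]
    with open(tgt_path, encoding="utf-8") as f:
        tgt = [line.rstrip("\n") for line in f]
    # Line i of one file must be the translation of line i of the other.
    if len(src) != len(tgt):
        raise ValueError(f"line count mismatch: {len(src)} vs {len(tgt)}")
    return list(zip(src, tgt))


# Build the example corpus from the thread in a temporary directory.
corpus_dir = tempfile.mkdtemp()
enu_path = os.path.join(corpus_dir, "train01.enu.snt")
chs_path = os.path.join(corpus_dir, "train01.chs.snt")

write_lines(enu_path, [
    "the children huddled together for warmth .",
    "the car business is constantly changing .",
])
write_lines(chs_path, [
    "孩子 们 挤 成 一 团 以 取暖 .",
    "汽车 业 也 在 不断 地 变化 .",
])

pairs = read_parallel(enu_path, chs_path)
for enu, chs in pairs:
    print(enu, "|||", chs)
```

A check like this catches the most common mistake with line-aligned corpora, a missing or extra line on one side, before training starts.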
@zhongkaifu Thank you very much! I was able to successfully train my model and do the predictions.
Thanks for sharing the code! Are you planning to add sample data files and/or a description of the input/output data formats?
Thanks a lot!