zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a fast, flexible, tensor-based deep neural network framework written in .NET (C#). Its features include automatic differentiation, several network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform operation (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

Sample data files? Demo? #1

Closed: usptact closed this issue 5 years ago

usptact commented 6 years ago

Thanks for sharing the code! Are you planning to add sample data files and/or a description of the input/output data formats?

Thanks a lot!

zhongkaifu commented 5 years ago

Hi @usptact,

I will add sample data files later.

Here is the data format. Each corpus file contains one sentence per line. Take a parallel corpus in English and Chinese as an example. Using the three-letter names CHS for Chinese and ENU for English, we could have corpus files such as train01.enu.snt, train01.chs.snt, train02.enu.snt and train02.chs.snt.

In train01.enu.snt, assume we have the two sentences below:
the children huddled together for warmth .
the car business is constantly changing .

Then train01.chs.snt has the corresponding translated sentences, one per line and in the same order:
孩子 们 挤 成 一 团 以 取暖 .
汽车 业 也 在 不断 地 变化 .
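
For anyone preparing their own corpus, here is a minimal Python sketch of the layout described above. It is not part of Seq2SeqSharp: the helper names corpus_paths and read_parallel are hypothetical, and it assumes UTF-8 files whose tokens are separated by spaces, as the sample sentences appear to be. It builds the file names from a shared prefix and a language code, then checks that both files have the same number of lines before pairing them.

```python
from pathlib import Path
import tempfile


def corpus_paths(directory, name, src_lang="enu", tgt_lang="chs"):
    """Build paths like train01.enu.snt / train01.chs.snt (hypothetical helper)."""
    d = Path(directory)
    return d / f"{name}.{src_lang}.snt", d / f"{name}.{tgt_lang}.snt"


def read_parallel(directory, name, src_lang="enu", tgt_lang="chs"):
    """Return a list of (source_tokens, target_tokens) pairs, one per line."""
    src_path, tgt_path = corpus_paths(directory, name, src_lang, tgt_lang)
    # Assumption: files are UTF-8 encoded, one sentence per line.
    src_lines = src_path.read_text(encoding="utf-8").splitlines()
    tgt_lines = tgt_path.read_text(encoding="utf-8").splitlines()
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"{src_path.name} has {len(src_lines)} lines, "
            f"but {tgt_path.name} has {len(tgt_lines)}"
        )
    # Assumption: tokens are separated by whitespace, as in the samples.
    return [(s.split(), t.split()) for s, t in zip(src_lines, tgt_lines)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src_path, tgt_path = corpus_paths(tmp, "train01")
        src_path.write_text(
            "the children huddled together for warmth .\n"
            "the car business is constantly changing .\n",
            encoding="utf-8",
        )
        tgt_path.write_text(
            "孩子 们 挤 成 一 团 以 取暖 .\n"
            "汽车 业 也 在 不断 地 变化 .\n",
            encoding="utf-8",
        )
        for src, tgt in read_parallel(tmp, "train01"):
            print(" ".join(src), "=>", " ".join(tgt))
```

Running a check like this before training catches the most common problem with parallel corpora, which is a source file and a target file that have drifted out of line-by-line alignment.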

usptact commented 5 years ago

@zhongkaifu Thank you very much! I was able to train my model successfully and run predictions.