aodongli / DeepLearningFramework

This repository reproduces Neural Responding Machine for Short-Text Conversation (https://www.aclweb.org/anthology/P15-1152) using the deep learning frameworks Theano and TensorFlow. It can also be used as a machine translation framework.

Dataset #2

Open hqlin2018 opened 4 years ago

hqlin2018 commented 4 years ago

I cannot see any dataset in the repository. Could you tell me how to get it?

aodongli commented 4 years ago

I would recommend contacting Huawei for their dataset. Or you can use some public dataset instead.

hqlin2018 commented 4 years ago

Thanks for your advice. I have obtained the dataset from Huawei; it comes as two packages, repos.tar.gz and train.tar.gz. However, I don't know how to use the dataset with the code. Do I need to preprocess it first?

aodongli commented 4 years ago

Yes, you need to preprocess the data. Sorry that I didn't upload the preprocessing code. If I remember correctly, I used preprocessing similar to https://www.tensorflow.org/tutorials/text/nmt_with_attention, which groups the sentences into batches and pads them with a MASK token so that every sequence in a batch has the same length.

In fact, I believe almost all NLP systems use similar preprocessing.
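For illustration, the batching-and-padding step described above might look like the following minimal sketch. It is not the preprocessing code used for this repository (that code was never uploaded); the `PAD` token, the `build_vocab` and `pad_batch` helpers, and the toy sentences are all hypothetical and only show the general idea.

```python
from typing import Dict, List, Tuple

PAD = "<pad>"  # hypothetical padding token (the comment above calls it MASK)


def build_vocab(sentences: List[List[str]]) -> Dict[str, int]:
    """Map each token to an integer id, reserving id 0 for padding."""
    vocab = {PAD: 0}
    for sentence in sentences:
        for token in sentence:
            vocab.setdefault(token, len(vocab))
    return vocab


def pad_batch(
    batch: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest length in the batch.

    Returns the padded ids and a 0/1 mask (1 = real token, 0 = padding).
    """
    max_len = max(len(seq) for seq in batch)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]
    return padded, mask


# Toy, already-tokenized sentences (assumed input format).
sentences = [["how", "are", "you"], ["hello"]]
vocab = build_vocab(sentences)
ids = [[vocab[token] for token in sentence] for sentence in sentences]

padded, mask = pad_batch(ids)
print(padded)  # [[1, 2, 3], [4, 0, 0]]
print(mask)    # [[1, 1, 1], [1, 0, 0]]
```

In a typical training loop the mask would be used to exclude the padded positions from the loss, so that padding does not affect the gradients.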

scoyer commented 4 years ago

I have a question: have you trained the vanilla seq-to-seq model on the whole dataset, which contains 4000k sentences? I tried it and found that the model had difficulty converging.

hqlin2018 commented 4 years ago

OK, thank you very much. I will try again.

hqlin2018 commented 4 years ago

Sorry, I don't know anything about the vanilla seq2seq model.

scoyer commented 4 years ago

What about the model in this repository?

hqlin2018 commented 4 years ago

Sorry, I haven't tried it.

lsx1995 commented 4 years ago

Could you send me the dataset? I really need it for my research. Thank you so much!

hqlin2018 commented 4 years ago

Are you a student?

lsx1995 commented 4 years ago

Yep, I'm a student at China University of Geosciences.
