Open hqlin2018 opened 4 years ago
I would recommend contacting Huawei for their dataset, or you can use a public dataset instead.
Thanks for your advice. I have got the dataset from Huawei. It comes as two packages, repos.tar.gz and train.tar.gz, but I don't know how to use it with the code. Do I need to process the dataset first before I can use it?
Yes, you need to preprocess the data. Sorry that I didn't upload the preprocessing code. If I remember correctly, I used preprocessing similar to that in
https://www.tensorflow.org/tutorials/text/nmt_with_attention
which batches the data and adds MASK tokens so that every sequence in a batch has the same length.
In fact, I believe almost all NLP systems use similar preprocessing.
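For anyone following along, here is a minimal sketch of the kind of batching and padding described above. It is not the preprocessing code used for this repository (that code was not uploaded). The token names `<pad>` and `<unk>`, the whitespace tokenization, and all function names are assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical special tokens; the repository's actual MASK token may differ.
PAD = "<pad>"
UNK = "<unk>"


def build_vocab(sentences: List[str]) -> Dict[str, int]:
    """Map each whitespace-separated token to an integer id."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        for token in sentence.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a sentence to ids, using UNK for unseen tokens."""
    return [vocab.get(token, vocab[UNK]) for token in sentence.split()]


def pad_batch(
    seqs: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest one in the batch.

    Returns the padded ids and a mask with 1 for real tokens, 0 for padding.
    """
    max_len = max(len(seq) for seq in seqs)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in seqs]
    mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in seqs]
    return padded, mask


def batches(
    sentences: List[str], vocab: Dict[str, int], batch_size: int
) -> Iterator[Tuple[List[List[int]], List[List[int]]]]:
    """Yield (padded_ids, mask) pairs, batch_size sentences at a time."""
    for start in range(0, len(sentences), batch_size):
        chunk = [encode(s, vocab) for s in sentences[start:start + batch_size]]
        yield pad_batch(chunk, pad_id=vocab[PAD])
```

The mask lets the loss skip padded positions, which is the usual reason for padding with a dedicated token rather than a real word id.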
I want to ask a question. Have you trained the vanilla seq-to-seq model on the whole dataset, which contains 4000k sentences? I tried it and found that the vanilla seq-to-seq model was difficult to get to converge.
OK, thank you very much. I will try it again.
Sorry, I have no idea about the vanilla seq2seq model.
What about the model in this repository?
Sorry, I haven't tried it.
I cannot see any dataset there. Could you tell me how to get the dataset?
Could you send me the dataset? I really need it for my research. Thank you so much!
Are you a student?
Yes, I'm a student at China University of Geosciences.
I cannot see any dataset there. Could you tell me how to get the dataset?