can you provide the raw train data?

vardaan123 / Corr-seq-labeling

Code for paper "Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks"


can you provide the raw train data? #2

Open zzp-appleLover opened 5 years ago

zzp-appleLover commented 5 years ago

Hello! I got the data from the URL you provided, "https://wit3.fbk.eu/mt.php?release=2012-03", but it is smaller than the dataset described in the article "Punctuation prediction for unsegmented transcript based on word vector", so I am a bit confused. Could you provide the scripts used to preprocess the raw data?

vardaan123 commented 5 years ago

I have the scripts, but they aren't well documented, and there is a sequence of manual steps to perform. Why do you need the raw data? If you want to do some other sequence labeling task, you can still use the text files with tokens.

zzp-appleLover commented 5 years ago

Can you provide the scripts and the raw data? I want to train a model with your method on a dataset other than TED, so I have to preprocess my data into your format. Also, can you give me your test data? I can't get it from the URL you provided or from the website. Thank you for your help!

vardaan123 commented 5 years ago

Okay, I understand. Please give me a few days to get back to you on your request.

zzp-appleLover commented 5 years ago

Yes, I am very grateful for your help; it is very important to me. Also, can you tell me what the numbers '1 2 3 4' in the label file mean? I am a little confused.