Reproduction of Interspeech'18 result

abhinavjain03 / kaldi-accentsmultitask


Reproduction of Interspeech'18 result #2

Open mcernak opened 6 years ago

mcernak commented 6 years ago

Hello,

Thank you very much for preparing this reproducible research.

You state in your IS2018 paper that you use this data split: https://sites.google.com/view/accentsunearthed-dhvani/, i.e. the train, dev, test, testindian, and testnz subsets.

As the data preparation steps are missing, it is hard to guess which data sets you actually used. For example, in multitask_run_2_base_2.sh it is not clear what data/101-recog-min and data/102-cla-min contain, or what the subsets cv_train_nz, cv_trainx_nz, cv_dev_nz, and cv_test_onlynz are. Could you please clarify?

Thanks, Milos

Minali24 commented 6 years ago

Hi,

We have two tasks: one for accent recognition and one for phoneme recognition, so we created two sets of directories. Those with "101-recog-min" or "101-recognition" in their names contain the data or alignments for the phoneme recognition task, and those with "102-class-min" in their names contain the data or alignments for the accent recognition task. The data directories for both tasks are the same, but their alignments differ: for phoneme recognition the alignments point to phoneme ids, whereas for accent recognition they point to accent ids.
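To make the "same data, different alignments" point concrete, here is a minimal sketch. It assumes that the accent targets simply repeat one accent id for every frame of an utterance; the thread does not confirm how the accent alignments were actually generated. The function `accent_targets` and the toy dictionaries are hypothetical and do not come from the repository.

```python
# Hypothetical sketch: all names and values are illustrative, not taken from the repo.
# Assumption: each utterance has one accent id, repeated for every frame.

def accent_targets(phone_ali, utt2accent):
    """Return frame-level accent targets with the same length as each phoneme alignment."""
    targets = {}
    for utt, frames in phone_ali.items():
        accent_id = utt2accent[utt]
        # One accent label per frame, so both tasks share the same frame count.
        targets[utt] = [accent_id] * len(frames)
    return targets

# Toy phoneme alignments: utterance id -> phoneme id per frame.
phone_ali = {"utt1": [3, 3, 7, 7, 7], "utt2": [12, 12, 5]}
# Toy utterance-to-accent mapping.
utt2accent = {"utt1": 0, "utt2": 4}

accent_ali = accent_targets(phone_ali, utt2accent)
```

Both dictionaries are keyed by the same utterance ids and have the same number of frames per utterance; only the label on each frame changes between the two tasks.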

Regarding the data splits:

- "cv_train_nz" is Train-7
- "cv_dev_nz" is Dev-4
- "cv_test_onlynz" is Test-NZ

Please ignore "cv_trainx_nz". It is the same as "cv_train_nz", i.e. Train-7. It exists because the data directories for both tasks are the same; the difference lies in the alignments, as stated earlier.
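The naming above can be captured as a small lookup, sketched here for convenience. The `SPLIT_NAMES` table and the `paper_split` helper are hypothetical and are not part of the repository; the `data/` prefix in the usage line is likewise an assumption about where the directories live.

```python
# Hypothetical helper: maps subset directory names to the split names used in the paper.
SPLIT_NAMES = {
    "cv_train_nz": "Train-7",
    "cv_trainx_nz": "Train-7",   # same data as cv_train_nz, per the reply above
    "cv_dev_nz": "Dev-4",
    "cv_test_onlynz": "Test-NZ",
}

def paper_split(data_dir):
    """Return the paper's split name for a subset directory, or None if unknown."""
    # Use only the last path component, ignoring any trailing slash.
    name = data_dir.rstrip("/").split("/")[-1]
    return SPLIT_NAMES.get(name)

split = paper_split("data/cv_trainx_nz/")
```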

Thanks,

abhinavjain03 commented 6 years ago

We have also updated the README for better understanding. It should also resolve future queries.

mcernak commented 6 years ago

Hi, thanks. However, although the splits are now clarified, the scripts still cannot be followed. You use some other models, such as /home/abhinav/kaldi/accents/exp. How did you train them, and on what data? Why didn't you commit the data preparation scripts, including how you created the alignments and so on? How did you prepare the lang data?

Without data preparation, nothing can be replicated.

Best, Milos

abhinavjain03 commented 6 years ago

Hi Milos, The published work is actually divided into three repositories, of which this is one. You can refer to the READMEs of all the repos in sequence, which should give you an overview of what needs to be done. With some prior knowledge, that should be sufficient to reproduce the results. Moreover, we wanted to make the modeling-related scripts available as soon as possible. A clean version of the entire pipeline is in the works and will take some time before it is ready for public use. We are working on releasing it soon.