turpaultn / DCASE2019_task4

Baseline of dcase 2019 task 4

DATASET #4

Closed MichelleYang2017 closed 5 years ago

MichelleYang2017 commented 5 years ago

I am sorry to say that the download progress is always zero, so I'd like to ask for your help. Is there any other way to download the dataset besides running the .py file? Thank you.

turpaultn commented 5 years ago

I'm sorry, there is no other way to get the data; you need to download the biggest part like this. In certain countries, people had to use a VPN and the --proxy option of youtube-dl. Is YouTube available in your country? (Here is one file from the dataset: https://www.youtube.com/watch?v=00pK0GMmE9s)
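For reference, a single clip can be fetched through a proxy roughly like this. This is a sketch, not the baseline's download script: the proxy address is a placeholder you must replace with your own, and the `-x --audio-format wav` flags are one plausible way to extract audio.

```shell
# Hypothetical example: fetch one dataset clip through an HTTP proxy.
# Replace the proxy address with your own; the baseline's own script
# handles the full file lists automatically.
youtube-dl --proxy "http://127.0.0.1:8080" \
    -x --audio-format wav \
    "https://www.youtube.com/watch?v=00pK0GMmE9s"
```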

MichelleYang2017 commented 5 years ago

Thank you. I will try.

MichelleYang2017 commented 5 years ago

Hello, I am sorry to disturb you again. I found that even when using a VPN, I still have a lot of audio files that fail to download.

turpaultn commented 5 years ago

Hi, I don't know what you mean by "a lot", but being short around 600-1200 files is normal. As written in the README, you have to send me an email with the missing_files lists attached (you should have 3: weak, validation, unlabel_in_domain).
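A quick way to build such a missing-files list is to compare the expected filenames against what actually landed on disk. A minimal sketch; the function name and directory layout here are assumptions, not helpers from the baseline repository:

```python
import os

def list_missing(expected_names, audio_dir):
    """Return the expected filenames that are absent from audio_dir."""
    downloaded = set(os.listdir(audio_dir))
    return [name for name in expected_names if name not in downloaded]
```

For example, with the expected names read from one of the dataset's metadata files, `len(list_missing(expected, "audio/train/unlabel_in_domain"))` tells you how many files are still missing from that subset.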

MichelleYang2017 commented 5 years ago

Of the three folders, unlabel_in_domain is missing tens of thousands of audio files. I will retry. If I end up missing only around 600 files, I will contact you. Could you please give me your email, if possible? Thank you again.

turpaultn commented 5 years ago

Indeed, that is far too many. Does it download new data when you relaunch the script? My email: nicolas.turpault@inria.fr

MichelleYang2017 commented 5 years ago

Thank you. I have relaunched the script again and again, and now only about 700 files are missing. I have sent the lists to your email. Thank you again.

turpaultn commented 5 years ago

Perfect, I'm glad to hear it finally worked. I'm sorry for the inconvenience.