shvdiwnkozbw / Multi-Source-Sound-Localization

This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.

Dataset #9

Open Zhengxl25 opened 2 years ago

Zhengxl25 commented 2 years ago

Hi shvdiwnkozbw! Thank you so much for sharing your code. Could you provide the AudioSet Instrument dataset? Looking forward to your reply!

shvdiwnkozbw commented 2 years ago

Sorry, the AudioSet Instrument dataset was stored on a previous server, and I currently have no access to the data itself. However, it can be rebuilt by filtering the original AudioSet dataset by its audio category labels.
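A minimal sketch of that filtering step, assuming the standard AudioSet segment CSV layout (comment lines starting with `#`, then `YTID, start_seconds, end_seconds, positive_labels` with the labels quoted and comma-separated). The function name `filter_segments` and the label IDs used with it are hypothetical placeholders; the actual instrument categories have to be taken from the AudioSet ontology and from the paper, since the thread does not list them.

```python
import csv


def filter_segments(csv_lines, wanted_labels):
    """Keep AudioSet segments that carry at least one of the wanted label IDs.

    csv_lines: iterable of text lines from an AudioSet segments CSV.
    wanted_labels: iterable of label IDs (placeholders here, not the real list).
    """
    wanted = set(wanted_labels)
    # Header lines in the AudioSet CSVs start with '#'; skip them.
    data_lines = (line for line in csv_lines if not line.startswith("#"))
    # skipinitialspace lets the quoted label field follow ", " correctly.
    reader = csv.reader(data_lines, skipinitialspace=True)

    selected = []
    for row in reader:
        if len(row) < 4:
            continue  # blank or malformed line
        ytid, start, end = row[0], float(row[1]), float(row[2])
        labels = row[3].split(",")
        if wanted.intersection(labels):
            selected.append((ytid, start, end, labels))
    return selected
```

With a real file, the lines could be supplied by `open(path, newline="")`; the selected YouTube IDs and time ranges then still need to be downloaded and cut separately.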

Devin-Pi commented 1 year ago

Hi shvdiwnkozbw! Thanks for sharing your work. I would like to ask whether the "pairs" in the file "./prepare_data/data.py" refers to "./utils/dataset.pkl". Or should I prepare the list of audio-visual pair indices myself, according to my own dataset? Looking forward to your reply! Thank you.

shvdiwnkozbw commented 1 year ago

Yes, the pairs depend on the data you use for training. You are expected to generate them according to your dataset.
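One way such a list could be generated is sketched below, assuming each audio clip and its video frame share a file stem (for example `a.wav` and `a.jpg`). The helper `build_pairs` and this naming scheme are assumptions for illustration; the exact structure that "./prepare_data/data.py" expects for `pairs` is not stated in this thread and should be checked against the script.

```python
from pathlib import PurePath


def build_pairs(audio_paths, frame_paths):
    """Match audio files to frame files that share the same file stem.

    Returns a list of (audio_path, frame_path) tuples, sorted by audio path.
    Audio files without a matching frame are skipped.
    """
    frames_by_stem = {PurePath(p).stem: p for p in frame_paths}
    pairs = []
    for audio in sorted(audio_paths):
        stem = PurePath(audio).stem
        if stem in frames_by_stem:
            pairs.append((audio, frames_by_stem[stem]))
    return pairs
```

The resulting list can then be stored in whatever form the data script reads, for instance with `pickle.dump` if it loads a `.pkl` file.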

Devin-Pi commented 1 year ago

Wow, thank you, shvdiwnkozbw! One more thing: I just ran into an error in "./prepare_data/data.py". From what I found on Google, the cause is that the "audios" have different lengths. Have you encountered this problem, and if so, how did you solve it? Could I fix it with the method described at https://stackoverflow.com/questions/52841335/how-can-i-pad-wav-file-to-specific-length/70948409#70948409 ? Thank you! Looking forward to your reply.

shvdiwnkozbw commented 1 year ago

I have not encountered this problem. It might be due to the specific data you use. I think it should be fine to solve the problem by following the approach in this link.
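For reference, the usual way to make clips stackable is to pad or trim every waveform to a common length, sketched below. It assumes mono audio already loaded as a 1-D NumPy array; the helper name `fix_length` and the zero-padding at the end are illustrative choices, not the repository's own code.

```python
import numpy as np


def fix_length(wave, target_len):
    """Return a 1-D waveform with exactly target_len samples.

    Longer clips are truncated; shorter clips are zero-padded at the end.
    """
    wave = np.asarray(wave)
    if wave.shape[0] >= target_len:
        return wave[:target_len]
    pad = target_len - wave.shape[0]
    return np.pad(wave, (0, pad), mode="constant")


def stack_clips(waves, target_len):
    """Stack variable-length clips into one (num_clips, target_len) array."""
    return np.stack([fix_length(w, target_len) for w in waves])
```

Choosing `target_len` as clip duration times sample rate keeps every clip aligned to the same time span; padding with silence changes the signal slightly, so trimming to the shortest clip is an alternative if that matters.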

Devin-Pi commented 1 year ago

Thanks for your response!