IFICL / SLfM

Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
https://ificl.github.io/SLfM/
MIT License

Question about dataset #3

Closed kth0522 closed 10 months ago

kth0522 commented 10 months ago

Could you specify which versions of the LibriSpeech, FMA, and HM3D datasets I should download?

For example: LibriSpeech: original-mp3, FMA: fma-full, HM3D: hm3d_train_full

IFICL commented 10 months ago

Hi, here are the dataset versions we used:

1) LibriSpeech: test-clean + train-clean-100. We resplit them into train/val/test sets.
2) FMA: fma_large.zip. We resplit it into train/val/test sets.
3) HM3D: hm3d_train + hm3d_val. You need to download the semantic annotations as well. As a result, you will have only 100 scenes with semantic annotations. It seems HM3D has annotated more scenes since then; you can use the data-split to select only the scenes that were used.
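For readers who download a newer HM3D release, the filtering step mentioned in the reply could look roughly like the sketch below. It is not code from the repository. It assumes the data split can be read as plain text with one scene name per line, which may not match the actual file format, and the helper names `load_scene_ids` and `filter_scenes` as well as the scene names in the usage note are hypothetical.

```python
def load_scene_ids(split_text: str) -> set:
    """Parse scene names from split text (assumed: one name per line).

    Blank lines and lines starting with '#' are ignored.
    """
    ids = set()
    for line in split_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ids.add(line)
    return ids


def filter_scenes(downloaded: list, used: set) -> list:
    """Return the downloaded scene names that appear in the split, sorted."""
    return sorted(name for name in downloaded if name in used)
```

In practice, `split_text` would come from reading each split file (for example with `pathlib.Path.read_text`) and `downloaded` from listing the HM3D scene folders; `filter_scenes(["00801-BBB", "00800-AAA"], load_scene_ids("00800-AAA\n"))` keeps only the first placeholder scene.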