FYJNEVERFOLLOWS / ResNet-STFT-SSL

ResNet-STFT Model for Sound Source Localization

BSD 3-Clause "New" or "Revised" License

ResNet-STFT-SSL

Unofficial PyTorch implementation of He's paper "Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation".

Overview of ResNet-STFT

Dependency

`PyTorch <https://pytorch.org/>`_
`apkit <https://github.com/hwp/apkit>`_ (version 0.2)

Data

We use the `SSLR dataset <https://www.idiap.ch/dataset/sslr>`_ for the experiments.

Usage

Run ./qsub/gen_data_frame_level.sh to extract features and write them, together with the corresponding labels, to a pickle file.
Run ./qsub/train_with_CNN-STFT.sh to train the model. (If you do not want to use the two-stage training strategy, run ./qsub/train_with_CNN-STFT-wo2stage.sh instead.)
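
For readers who want a rough picture of what the feature-extraction step produces, here is a minimal sketch of frame-level STFT features being paired with labels and stored in a pickle file. It is not the code in ./qsub/gen_data_frame_level.sh and it does not use apkit. The function names, the window and hop sizes, the dictionary keys, and the per-frame label layout are all assumptions made for illustration.

```python
import pickle

import numpy as np


def stft_frame_features(signal, win_size=512, hop_size=256):
    """Compute real/imaginary STFT features for a multi-channel signal.

    signal: float array of shape (channels, samples).
    Returns a float32 array of shape (frames, 2 * channels, win_size // 2 + 1).
    win_size and hop_size are hypothetical defaults, not the repository's.
    """
    _, n_samples = signal.shape
    window = np.hanning(win_size)
    n_frames = 1 + (n_samples - win_size) // hop_size
    # Slice the signal into overlapping, windowed frames:
    # shape (frames, channels, win_size).
    frames = np.stack(
        [signal[:, i * hop_size:i * hop_size + win_size] * window
         for i in range(n_frames)]
    )
    # One-sided FFT along the time axis of each frame.
    spec = np.fft.rfft(frames, axis=-1)
    # Stack real and imaginary parts along the channel axis.
    return np.concatenate([spec.real, spec.imag], axis=1).astype(np.float32)


def save_frames(path, features, labels):
    """Write features and their labels to one pickle file (hypothetical layout)."""
    with open(path, "wb") as f:
        pickle.dump({"features": features, "labels": labels}, f)


def load_frames(path):
    """Read back a file written by save_frames."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A training script could then call load_frames and wrap the two arrays in a PyTorch dataset. The real pickle layout used by this repository may differ, so check the scripts before relying on these keys.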