FYJNEVERFOLLOWS / ResNet-STFT-SSL

ResNet-STFT Model for Sound Source Localization
BSD 3-Clause "New" or "Revised" License

ResNet-STFT-SSL

Unofficial PyTorch implementation of He et al.'s paper "Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation".

Overview of ResNet-STFT

Dependency

Data

We use the `SSLR dataset <https://www.idiap.ch/dataset/sslr>`_ for the experiments.

Usage

  1. Run ./qsub/gen_data_frame_level.sh to extract the features and write them, together with the corresponding labels, to a pickle file.
  2. Run ./qsub/train_with_CNN-STFT.sh to train the model. (To skip the two-stage training strategy, run ./qsub/train_with_CNN-STFT-wo2stage.sh instead.)
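
For orientation, step 1 can be pictured roughly as follows. This is a minimal sketch, not the code in ./qsub/gen_data_frame_level.sh: the function names, the window and hop sizes, the 4-channel input, the azimuth label, and the record layout (``feature`` / ``label`` keys) are all assumptions made for illustration. It computes an STFT per channel, stacks the real and imaginary parts as feature maps, and pickles them with a label.

```python
import io
import pickle

import numpy as np


def stft_features(audio, n_fft=512, hop=256):
    """Return STFT real/imag maps for a multi-channel clip.

    audio: array of shape (channels, samples), with samples >= n_fft.
    Output shape: (2 * channels, frames, n_fft // 2 + 1).
    n_fft and hop are illustrative values, not the repository's settings.
    """
    n_samples = audio.shape[1]
    n_frames = 1 + (n_samples - n_fft) // hop
    window = np.hanning(n_fft)
    # Index matrix of shape (frames, n_fft): one row of sample indices per frame.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[:, idx] * window           # (channels, frames, n_fft)
    spec = np.fft.rfft(frames, axis=-1)       # (channels, frames, n_fft // 2 + 1)
    # Stack real and imaginary parts along the channel axis.
    return np.concatenate([spec.real, spec.imag], axis=0).astype(np.float32)


def dump_example(audio, label, fileobj):
    """Pickle one hypothetical {feature, label} record to an open binary file."""
    pickle.dump({"feature": stft_features(audio), "label": label}, fileobj)


# Synthetic 4-channel clip standing in for a real recording (assumption).
rng = np.random.default_rng(0)
audio = rng.standard_normal((4, 8192))

# An in-memory buffer keeps the example self-contained; a file opened with
# open(path, "wb") would be used the same way.
buffer = io.BytesIO()
dump_example(audio, [45.0], buffer)  # hypothetical azimuth label in degrees

buffer.seek(0)
record = pickle.load(buffer)
print(record["feature"].shape)
```

The training scripts in step 2 would then read such records back with ``pickle.load``; the actual file layout produced by the repository's scripts may differ.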