josebeo2016 / biosegment

The supporting project for BTS-E

Training Sound Segmentation Unit #6

Open HildaNya opened 2 weeks ago

HildaNya commented 2 weeks ago

Hi! I was able to reproduce your results on the ASV-2019 LA dataset, and the performance was excellent. I'm now trying to work with some other datasets. Is there a script for training the GMM models for the sound segmentation unit?

Thanks!

josebeo2016 commented 2 weeks ago

The training script is GMM_breath.py. You might need to preprocess the data before training. The steps are as follows:

  1. Prepare your data. See the data I prepared at https://github.com/josebeo2016/biosegment/tree/main/data_new for an example.
  2. Run the preprocessing script:
    python preprocess.py ./data ./GMM/out
  3. Run the training script:
    cd GMM/
    python GMM_breath.py ./out/ ./out/

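
For orientation, here is a minimal sketch of the general idea behind a GMM-based segmentation unit: fit one Gaussian mixture per class (breath, speech, silence) and label new frames by whichever model scores them highest. This is not the code in GMM_breath.py, whose internals are not shown in this thread. The single 1-D feature, the synthetic training values, and the names `fit_gmm_1d` and `classify` are all assumptions made for illustration.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, model):
    """Density of a 1-D Gaussian mixture given as (weights, means, variances)."""
    weights, means, variances = model
    return sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gmm_1d(xs, k=2, iters=50, seed=0):
    """Fit a k-component 1-D GMM with plain EM (hypothetical helper)."""
    rng = random.Random(seed)
    means = rng.sample(xs, k)  # initialise the means from k data points
    centre = sum(xs) / len(xs)
    spread = sum((x - centre) ** 2 for x in xs) / len(xs)
    variances = [max(spread, 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances (variance is floored).
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
    return weights, means, variances


def classify(frames, models):
    """Return the label whose GMM gives the highest average log-likelihood."""
    def score(model):
        return sum(math.log(max(mixture_pdf(x, model), 1e-300)) for x in frames) / len(frames)
    return max(models, key=lambda label: score(models[label]))


# Synthetic stand-in for one acoustic feature per frame (made-up values, not real data).
data_rng = random.Random(1)
train = {
    "silence": [data_rng.gauss(-8.0, 1.0) for _ in range(200)],
    "breath": [data_rng.gauss(-4.0, 1.0) for _ in range(200)],
    "speech": [data_rng.gauss(0.0, 1.5) for _ in range(200)],
}
models = {label: fit_gmm_1d(xs) for label, xs in train.items()}
print(classify([-4.2, -3.9, -4.5], models))
```

In practice the features would be multi-dimensional (for example MFCC vectors), and a library implementation such as scikit-learn's `GaussianMixture` would normally replace the hand-written EM loop; the per-class train-then-score structure stays the same.
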
HildaNya commented 2 weeks ago

Follow-up question: I'm guessing the "utt2spk" and "segments" files in https://github.com/josebeo2016/biosegment/tree/main/data_new hold the manually marked breath/speech/silence segments. If you are able to share this, what did you use to mark the segments manually? Was it some form of rule-based power/frequency technique?

josebeo2016 commented 2 weeks ago

The annotation was done manually, following the findings described in this paper: https://dl.acm.org/doi/abs/10.1016/j.specom.2018.01.008
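
For anyone preparing their own annotations, the "utt2spk" and "segments" file names mentioned in this thread match Kaldi's data-directory conventions. The sketch below assumes the files follow that layout: each "segments" line is `<segment-id> <recording-id> <start-sec> <end-sec>`, and each "utt2spk" line is `<segment-id> <label>`. Treating the second utt2spk column as the breath/speech/silence label (rather than a speaker id) is an assumption, as are the sample ids and the helper names; check them against the files in data_new before relying on this.

```python
from typing import Dict, Iterable, List, NamedTuple


class Segment(NamedTuple):
    seg_id: str
    rec_id: str
    start: float  # seconds
    end: float    # seconds
    label: str


def parse_utt2spk(lines: Iterable[str]) -> Dict[str, str]:
    """Map segment id -> label, assuming two whitespace-separated columns."""
    mapping: Dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    return mapping


def parse_segments(lines: Iterable[str], labels: Dict[str, str]) -> List[Segment]:
    """Parse Kaldi-style segment lines and attach the label for each segment id."""
    segments: List[Segment] = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        seg_id, rec_id, start, end = parts
        label = labels.get(seg_id, "unknown")
        segments.append(Segment(seg_id, rec_id, float(start), float(end), label))
    return segments


def seconds_per_label(segments: Iterable[Segment]) -> Dict[str, float]:
    """Total annotated duration per label, useful for checking class balance."""
    totals: Dict[str, float] = {}
    for seg in segments:
        totals[seg.label] = totals.get(seg.label, 0.0) + (seg.end - seg.start)
    return totals


# Hypothetical sample lines; real ids and labels come from your own data directory.
utt2spk_lines = ["rec1_0001 breath", "rec1_0002 speech", "rec1_0003 silence"]
segments_lines = [
    "rec1_0001 rec1 0.00 0.40",
    "rec1_0002 rec1 0.40 2.10",
    "rec1_0003 rec1 2.10 2.50",
]
segs = parse_segments(segments_lines, parse_utt2spk(utt2spk_lines))
print(seconds_per_label(segs))
```

With real files, pass `open(path)` in place of the sample lists; both parsers accept any iterable of lines.
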