felixkreuk / UnsupSeg

Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
MIT License

Does not seem to work on the SpeechOcean dataset #9

Open louislau1129 opened 2 years ago

louislau1129 commented 2 years ago

Hi @felixkreuk, first, thank you for open-sourcing such a good repo on unsupervised phoneme segmentation. Recently, I ran several experiments on the SpeechOcean 762 dataset, which is a standard speech scoring dataset.

  1. First, I applied the provided pretrained boundary detection model directly to this corpus and got only about 50% F1 and R-value.
  2. Suspecting a domain mismatch, I retrained the boundary detection model on the SpeechOcean corpus from scratch, but it still reaches only about 50% F1 and R-value, far below the reference forced-alignment boundaries.
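
For reference, here is a minimal sketch of how F1 and R-value are typically computed for boundary detection. It is not the repo's evaluation code: the function name `boundary_scores`, the greedy one-to-one matching, and the 20 ms tolerance are assumptions made for illustration. The R-value follows the usual definition based on recall and over-segmentation.

```python
import math


def boundary_scores(ref, hyp, tol=0.02):
    """Score predicted boundaries against reference boundaries (in seconds).

    Hypothetical helper: greedy one-to-one matching within `tol` seconds.
    The 20 ms default is an assumed, commonly used tolerance.
    """
    if not ref:
        raise ValueError("reference boundary list must not be empty")
    ref, hyp = sorted(ref), sorted(hyp)
    used = [False] * len(hyp)
    hits = 0
    for r in ref:
        # Find the closest still-unmatched prediction within the tolerance.
        best, best_dist = None, None
        for j, h in enumerate(hyp):
            dist = abs(h - r)
            if not used[j] and dist <= tol and (best is None or dist < best_dist):
                best, best_dist = j, dist
        if best is not None:
            used[best] = True
            hits += 1

    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0

    # Over-segmentation: relative excess of predicted over reference boundaries.
    over_seg = len(hyp) / len(ref) - 1.0
    r1 = math.sqrt((1.0 - recall) ** 2 + over_seg ** 2)
    r2 = (-over_seg + recall - 1.0) / math.sqrt(2.0)
    r_value = 1.0 - (abs(r1) + abs(r2)) / 2.0
    return precision, recall, f1, r_value


# Toy example: three of four predictions fall within 20 ms of a reference.
p, r, f1, rval = boundary_scores([0.10, 0.25, 0.40, 0.60], [0.11, 0.26, 0.45, 0.60])
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f} R-value={rval:.3f}")
```

Checking the numbers against a scorer like this can help rule out a unit or frame-rate mismatch (seconds vs. samples vs. frames) between the predicted and reference boundaries.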

The following screenshot shows predictions from the randomly initialized model (before any training): guKmKj9ga2

The following screenshot shows predictions from the trained model: Z0bB4fn1HS

Any ideas on how to fix this issue? Thanks in advance!

louislau1129 commented 2 years ago

(P.S. I have reproduced the reported results on TIMIT using the code