i3thuan5 / tai5-uan5_gian5-gi2_kang1-ku7

臺灣言語工具 (Taiwanese Language Tools)
https://i3thuan5.github.io/tai5-uan5_gian5-gi2_kang1-ku7

Implement VAD #287

Closed sih4sing5hong5 closed 7 years ago

sih4sing5hong5 commented 8 years ago

An approach I tried that did not work:

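        # Note: `cc` and `cc2` are not defined in this snippet.
        from praatinterface import PraatLoader
        pl = PraatLoader(basic2=cc2, basic=cc, basic3='echo sui2')
        pl.reinit_scripts()
        音檔 = 聲音檔.對檔案讀(self.音檔所在)
        # Pitch track (音懸 = pitch), sorted by time
        音懸結果 = pl.run_script('basic.praat', self.音檔所在)
        音懸資料 = sorted(pl.read_praat_out(音懸結果).items())
        # Formant track, sorted by time; 5 and 5500 are passed to formants.praat
        formant結果 = pl.run_script('formants.praat', self.音檔所在, 5, 5500)
        print(formant結果)
        formant資料 = sorted(pl.read_praat_out(formant結果).items())
        音框秒數 = 0.2  # frame length in seconds
        這馬時間 = 0.0  # start time of the current frame
        while True:
            # Count pitch samples in the frame, and how many are voiced (Pitch > 0)
            音懸數 = 0
            有音 = 0
            for 秒數, 音懸 in 音懸資料:
                if 這馬時間 <= 秒數 and 秒數 < 這馬時間 + 音框秒數:
                    音懸數 += 1
                    if 音懸['Pitch'] > 0:
                        有音 += 1
            # Count formant samples in the frame, and how many have F1 below 1000 Hz
            formant數 = 0
            formant夠懸 = 0
            for 秒數, formant in formant資料:
                if 這馬時間 <= 秒數 and 秒數 < 這馬時間 + 音框秒數:
                    formant數 += 1
                    if formant['F1'] < 1000:
                        formant夠懸 += 1
            # Stop at the first frame without pitch samples (end of the audio)
            if 音懸數 == 0:
                break
            # Speech if at least 35% of pitch samples are voiced,
            # or at least 50% of formant samples pass the F1 check
            if 有音 >= 音懸數 * 0.35 or formant夠懸 >= formant數 * 0.5:
                print(True, 有音, 音懸數, formant夠懸, formant數)
            else:
                print(False, 有音, 音懸數, formant夠懸, formant數)
            這馬時間 += 音框秒數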
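
The frame-voting rule in the snippet above can be tried without Praat. The following is only a sketch under assumptions: `vote_frames` and its parameter names are hypothetical, the inputs are plain `(time, value)` pairs instead of `read_praat_out` results, and a frame with no formant samples is not counted as a formant vote (in the snippet above such a frame always passes, since `0 >= 0`).

```python
from typing import List, Tuple

Track = List[Tuple[float, float]]  # (time in seconds, value in Hz)


def vote_frames(pitch: Track, f1: Track, frame_sec: float = 0.2,
                pitch_ratio: float = 0.35, f1_ratio: float = 0.5,
                f1_max: float = 1000.0) -> List[Tuple[float, bool]]:
    """Return one (frame start time, is_speech) pair per frame.

    Hypothetical helper: same thresholds as the snippet above,
    but decoupled from praatinterface.
    """
    results = []
    index = 0
    while True:
        # Multiply instead of accumulating to avoid floating-point drift.
        start = index * frame_sec
        end = (index + 1) * frame_sec
        frame_pitch = [hz for t, hz in pitch if start <= t < end]
        if not frame_pitch:
            break  # no pitch samples left: end of the audio
        frame_f1 = [hz for t, hz in f1 if start <= t < end]
        voiced = sum(1 for hz in frame_pitch if hz > 0)
        low_f1 = sum(1 for hz in frame_f1 if hz < f1_max)
        by_pitch = voiced >= len(frame_pitch) * pitch_ratio
        # Assumption: an empty formant frame casts no vote.
        by_f1 = bool(frame_f1) and low_f1 >= len(frame_f1) * f1_ratio
        results.append((start, by_pitch or by_f1))
        index += 1
    return results


# Synthetic data: 0.2 s without pitch, then 0.2 s of voiced samples at 120 Hz.
times = [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35]
pitch_track = [(t, 0.0 if t < 0.2 else 120.0) for t in times]
f1_track = [(t, 1500.0) for t in times]
frames = vote_frames(pitch_track, f1_track)
```

With this synthetic input the first frame is rejected and the second is accepted on the pitch vote alone, which makes it easier to see how sensitive the rule is to the two ratios.
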
sih4sing5hong5 commented 8 years ago

Maybe this can be done from Kaldi's lattices (lat).

sih4sing5hong5 commented 8 years ago

Not read yet:
https://shiweipku.gitbooks.io/chinese-doc-of-kaldi/content/lattice.html
http://codingandlearning.blogspot.tw/2014/01/kaldi-lattices.html
https://groups.google.com/forum/#!topic/kaldi-help/6pS5F5pPZAU
https://github.com/foundintranslation/Kaldi/blob/master/egs/wsj/s5/steps/lmrescore.sh
https://sourceforge.net/p/kaldi/discussion/1355348/thread/65558c35/?limit=25
https://sourceforge.net/p/kaldi/mailman/message/33229487/
https://sourceforge.net/p/kaldi/discussion/1355347/thread/741f7d3e/
https://sourceforge.net/p/kaldi/discussion/1355348/thread/f06bc0a6/

sih4sing5hong5 commented 7 years ago

It turns out Praat can already do this: Sound: To TextGrid (silences)...

sih4sing5hong5 commented 7 years ago

Discuss word-position-dependent-phones

sih4sing5hong5 commented 6 years ago

Old documentation, kept here because LIUM can now be used directly:

Signal detection (voice activity detection, VAD)

This is needed when segmenting a corpus. From what I found, Praat alone already gives very good results.

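  1. With the Praat GUI: lower the minimum pitch from 100 Hz to 60 Hz, and tune the silence threshold for each audio file; the other parameters can keep their default values.
  2. With a Praat script: refer to Sound: To TextGrid (silences)... and similar scripts.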
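
As a rough illustration of the script route, the sketch below generates a small Praat script that calls `To TextGrid (silences)` with the minimum pitch lowered to 60 Hz. The helper name `build_silence_script`, the file names, and the `-25` dB threshold (Praat's default, which the note above says should be tuned per file) are assumptions for illustration, not part of this project.

```python
def build_silence_script(wav_path: str, textgrid_path: str,
                         minimum_pitch: float = 60.0,
                         silence_threshold_db: float = -25.0,
                         min_silent_sec: float = 0.1,
                         min_sounding_sec: float = 0.1) -> str:
    """Build the text of a Praat script that marks silent/sounding intervals.

    Hypothetical helper. The argument order follows the Praat command
    "Sound: To TextGrid (silences)...": minimum pitch, time step
    (0 = automatic), silence threshold, minimum silent duration,
    minimum sounding duration, then the two interval labels.
    """
    def quote(path: str) -> str:
        # Praat escapes a double quote inside a string by doubling it.
        return '"' + path.replace('"', '""') + '"'

    lines = [
        "Read from file: " + quote(wav_path),
        'To TextGrid (silences): {}, 0, {}, {}, {}, "silent", "sounding"'.format(
            minimum_pitch, silence_threshold_db,
            min_silent_sec, min_sounding_sec),
        "Save as text file: " + quote(textgrid_path),
    ]
    return "\n".join(lines) + "\n"


# Example with placeholder file names.
script = build_silence_script("a.wav", "a.TextGrid")
```

Saving `script` to a file such as `silences.praat` and running `praat --run silences.praat` should then write the TextGrid, assuming the `praat` executable is on the PATH.
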

https://github.com/sih4sing5hong5/tai5-uan5_gian5-gi2_kang1-ku7/pull/345#issuecomment-280814577