twgo / siann1-hak8_boo5-hing5

Acoustic model training
MIT License

Request for review: script that changes perturb into alaw/mulaw encode #53

Closed leo424y closed 6 years ago

leo424y commented 6 years ago

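I am not very familiar with the syntax and I am worried about delaying the experiments, so I have listed the parts that probably need to change. Captain, if it is convenient, please help correct this script for the 3-way mix of 8k, 8k a-law, and 8k mu-law; I will then try to adapt it to the 2-way mixes myself.

My current understanding is that the audio is still processed with sox and only wav.scp has to change; everything else can be copied by the original script as is. So I changed the code that rewrites wav.scp so that it applies our alaw/mulaw encoding and copies the result to a new folder. The code that may need to change is in this commit; please advise: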

https://github.com/twgo/siann1-hak8_boo5-hing5/commit/347309234195802922422825fa36fead7ed167a7

sih4sing5hong5 commented 6 years ago

"I have listed the parts that probably need to change"

I don't see them.

leo424y commented 6 years ago

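@sih4sing5hong5 Sorry, here is the script I changed. I replaced the input parameters 0.9, 1.1 with alaw, mulaw and replaced the wav.scp rewriting with our encode script. It currently runs through to evaluation. Captain, please review it for correctness: https://github.com/twgo/siann1-hak8_boo5-hing5/blob/83a31f021554653787aa7627cf2001e5210817a1/utils/data/perturb_data_dir_speed.sh#L67-L97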
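
The wav.scp rewrite described here can be sketched in a few lines of shell. This is only an illustration, not the script in the linked commit: the function name `encode_wav_scp` is hypothetical, the `sp<codec>-` ID prefix and the avconv encode/decode stages are copied from the wav.scp output shown later in this thread, and the `mulaw` case is an assumption because the thread only shows the `alaw` lines.

```shell
# Sketch: derive a codec-encoded copy of a Kaldi wav.scp read from stdin.
# Assumptions: every entry is a piped command ending in "|", and avconv
# accepts the raw "alaw" / "mulaw" formats as in the thread's alaw output.
encode_wav_scp() {
  awk -v codec="$1" '{
    id = $1
    # Drop the recording ID and keep the command exactly as written.
    cmd = $0
    sub(/^[^ \t]+[ \t]+/, "", cmd)
    # Encode to the codec, then decode back to 8 kHz WAV for Kaldi.
    printf "sp%s-%s %s avconv -i - -f %s -ar 8000 - | avconv -f %s -ar 8000 -i - -f wav -ar 8000 - |\n", codec, id, cmd, codec, codec
  }'
}
```

For example, `encode_wav_scp alaw < data/train_guan5/wav.scp` would print the a-law entries; a 3-way mix would concatenate the original file with the `alaw` and `mulaw` outputs and sort the result.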

sih4sing5hong5 commented 6 years ago

OK, I will look at it when I have time. Go ahead and put it in and run it; I will look at the Docker contents first.

leo424y commented 6 years ago

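Running with this script, the results are still within ±1%. I went back and checked: the perturb result is 34.16, so there is no real difference there either.

Professor Kao (高) said in today's email that the target format is alaw, so the mulaw-related experiments may not be that important.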

| Train \ Test | 仝8K | 仝8K_alaw | 仝8K_ulaw |
|---|---|---|---|
| 8k | 35.02 (#128) | 34.94 (#129) | 34.86 (#130) |
| 8k_a | 35.96 (#131) | 36.04 (#134) | 36.44 (#136) |
| 8k_u | 35.45 (#133) | 35.34 (#135) | 36.04 (#137) |
| 8k+8k_a | (#) | (#) | (#) |
| 8k+8k_u | (#) | (#) | (#) |
| 8k_a+8k_u | (#) | (#) | (#) |
| 8k+8k_a+8k_u | 35.92 (#138) | 35.77 (#139) | (#) |

sih4sing5hong5 commented 6 years ago

Sorry, I had no time to look yesterday.

  1. You presumably started from localhost:5000/siann1-hak8_boo5-hing5:129?
  2. The file name has to change; do not use perturb_data_dir_speed.sh.
  3. Please compare the output of wc train_guan5/wav.scp train_guan5/wav.scp in localhost:5000/siann1-hak8_boo5-hing5:129 and localhost:5000/siann1-hak8_boo5-hing5:87.

leo424y commented 6 years ago
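  1. Yes, #138 was built from localhost:5000/siann1-hak8_boo5-hing5:129.

Ah, thank you, captain, for pointing out the key issue: several of the script lines I had commented out turn out to be necessary.

  2. I have added them back and renamed the speed-related words to encode.

The revised script is as follows; please review it when you have a moment.

https://github.com/twgo/siann1-hak8_boo5-hing5/blob/432ac86eab7b29197f62b87d695e54ccb8bed81d/utils/data/perturb_data_dir_encode.sh#L67-L105

  3. Comparing 129 and 87: train_guan5 shows nothing unusual (129 additionally converts to 8k), but train is missing reco2file_and_channel and its wav.scp has not been modified.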

129

tong0000000 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1000.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000001 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1010.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000002 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1020.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000003 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1030.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000004 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1040.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000005 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1050.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000006 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1060.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000007 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1070.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000008 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1080.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |
tong0000009 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1090.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - |

87

root@62c3c9659c5c:/usr/local/kaldi/egs/taiwanese/s5c/data/train_guan5# head wav.scp
tong0000000 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1000.wav -b 16 -c 1 -r 16k -t wav - |
tong0000001 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1010.wav -b 16 -c 1 -r 16k -t wav - |
tong0000002 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1020.wav -b 16 -c 1 -r 16k -t wav - |
tong0000003 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1030.wav -b 16 -c 1 -r 16k -t wav - |
tong0000004 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1040.wav -b 16 -c 1 -r 16k -t wav - |
tong0000005 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1050.wav -b 16 -c 1 -r 16k -t wav - |
tong0000006 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1060.wav -b 16 -c 1 -r 16k -t wav - |
tong0000007 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1070.wav -b 16 -c 1 -r 16k -t wav - |
tong0000008 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1080.wav -b 16 -c 1 -r 16k -t wav - |
tong0000009 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1090.wav -b 16 -c 1 -r 16k -t wav - |
sih4sing5hong5 commented 6 years ago

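"Comparing 129 and 87: train_guan5 shows nothing unusual (129 additionally converts to 8k), but train is missing reco2file_and_channel and its wav.scp has not been modified"

Compare the 4 wav.scp files with wc.

Explain how wav.scp changes in 87. If you can, also explain the other files.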

leo424y commented 6 years ago

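According to the Kaldi docs, the files you have to prepare yourself are reco2file_and_channel, segments, text, utt2spk, and wav.scp.

In 87, after speed_3way turns train_guan5 into train, every file becomes 3 times as large, because the sped-up and slowed-down data are added. In 129, after encode turns train_guan5 into train, nothing changes. I have fixed this and am retrying.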

Retry result: https://jenkins.iis.sinica.edu.tw/job/siann1-hak8_boo5-hing5/133/console

The error is thrown at:

steps/make_mfcc.sh [info]: segments file exists: using that.

run.pl: 21 / 32 failed, log is in data/mfcc_log/train/make_mfcc_train.*.log

I will go into the Docker container again and read the logs.
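
One way to narrow down which of the 32 jobs failed is to scan the per-job logs for a non-zero exit status. This is a sketch under an assumption: that each log ends with the "# Ended (code N) at ..." line that Kaldi's run.pl normally writes. The function name `list_failed_logs` is hypothetical.

```shell
# Sketch: list job logs that did not finish with exit code 0.
# Assumption: run.pl ends each log with "# Ended (code N) at ...";
# a log without that line (e.g. a killed job) is reported as well.
list_failed_logs() {
  for log in "$@"; do
    # Skip arguments that are not regular files (e.g. an unmatched glob).
    [ -f "$log" ] || continue
    if ! grep -q '^# Ended (code 0)' "$log"; then
      echo "$log"
    fi
  done
}
```

It could be called as `list_failed_logs data/mfcc_log/train/make_mfcc_train.*.log`, after which `tail` on any listed file usually shows the actual error.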

utt2spk grows to 3x

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/utt2spk train/utt2spk | wc
 233485  700453 17277837
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/utt2spk train/utt2spk | tail
> sp1.1-0000599TW02M1P0277-tong0116732-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116733-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116734-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116735-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116736-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116737-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116738-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116739-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116740-ku0000000 sp1.1-0000599TW02M1P0277
> sp1.1-0000599TW02M1P0277-tong0116741-ku0000000 sp1.1-0000599TW02M1P0277
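
The added lines above differ from the originals only by a prefix on both the utterance ID and the speaker ID. A minimal sketch of that mapping follows; the function name `prefix_utt2spk` is hypothetical, and using the same scheme for a codec variant (for example `spalaw-`) is an assumption based on the IDs shown later in the thread.

```shell
# Sketch: add a prefix to both columns of a utt2spk stream on stdin.
# Prefixing the speaker as well keeps each variant a distinct speaker,
# mirroring the sp1.1- lines in the diff output above.
prefix_utt2spk() {
  awk -v p="$1" '{ print p $1, p $2 }'
}
```

For example, `prefix_utt2spk sp1.1- < train_guan5/utt2spk` reproduces the shape of the lines shown in the diff.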

utt2dur grows to 3x

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/utt2dur train/utt2dur | wc
 233485  700453 13344932
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/utt2dur train/utt2dur | tail
> sp1.1-0000599TW02M1P0277-tong0116732-ku0000000 0.96
> sp1.1-0000599TW02M1P0277-tong0116733-ku0000000 1.16364
> sp1.1-0000599TW02M1P0277-tong0116734-ku0000000 1.14909
> sp1.1-0000599TW02M1P0277-tong0116735-ku0000000 0.829091
> sp1.1-0000599TW02M1P0277-tong0116736-ku0000000 1.07636
> sp1.1-0000599TW02M1P0277-tong0116737-ku0000000 0.814545
> sp1.1-0000599TW02M1P0277-tong0116738-ku0000000 1.01818
> sp1.1-0000599TW02M1P0277-tong0116739-ku0000000 1.01818
> sp1.1-0000599TW02M1P0277-tong0116740-ku0000000 1.01818
> sp1.1-0000599TW02M1P0277-tong0116741-ku0000000 1.03273

text grows to 3x

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/text train/text | wc
 233485 1032519 17933369
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/text train/text | tail
> sp1.1-0000599TW02M1P0277-tong0116732-ku0000000 逐|tak10 下|e3 早|tsai1 仔|a2
> sp1.1-0000599TW02M1P0277-tong0116733-ku0000000 莫|bok10 名|bing7 其|ki7 妙|miau7
> sp1.1-0000599TW02M1P0277-tong0116734-ku0000000 第|te3 十|tsap10 二|ji3 名|mia5
> sp1.1-0000599TW02M1P0277-tong0116735-ku0000000 移|i7 花|hua7 接|tsiap8 木|bok8
> sp1.1-0000599TW02M1P0277-tong0116736-ku0000000 推|thui7 展|tian1 協|hiap10 會|hue7
> sp1.1-0000599TW02M1P0277-tong0116737-ku0000000 動|tang3 腦|nau1 動|tang3 嘴|tshui3
> sp1.1-0000599TW02M1P0277-tong0116738-ku0000000 做|tso2 做|tso2 了|liau1 後|au7
> sp1.1-0000599TW02M1P0277-tong0116739-ku0000000 高|ko7 速|sok8 鐵|SPN 路|loo7
> sp1.1-0000599TW02M1P0277-tong0116740-ku0000000 高|ko7 深|tshim7 莫|bok10 測|tshiat4
> sp1.1-0000599TW02M1P0277-tong0116741-ku0000000 烏|oo7 頭|thau7 仔|a1 車|tshia1

segments grows to 3x

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/segments train/segments | wc
 233485 1167421 17978289
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/segments train/segments | tail
> sp1.1-0000599TW02M1P0277-tong0116732-ku0000000 sp1.1-tong0116732 0.00 0.96
> sp1.1-0000599TW02M1P0277-tong0116733-ku0000000 sp1.1-tong0116733 0.00 1.16
> sp1.1-0000599TW02M1P0277-tong0116734-ku0000000 sp1.1-tong0116734 0.00 1.15
> sp1.1-0000599TW02M1P0277-tong0116735-ku0000000 sp1.1-tong0116735 0.00 0.83
> sp1.1-0000599TW02M1P0277-tong0116736-ku0000000 sp1.1-tong0116736 0.00 1.08
> sp1.1-0000599TW02M1P0277-tong0116737-ku0000000 sp1.1-tong0116737 0.00 0.81
> sp1.1-0000599TW02M1P0277-tong0116738-ku0000000 sp1.1-tong0116738 0.00 1.02
> sp1.1-0000599TW02M1P0277-tong0116739-ku0000000 sp1.1-tong0116739 0.00 1.02
> sp1.1-0000599TW02M1P0277-tong0116740-ku0000000 sp1.1-tong0116740 0.00 1.02
> sp1.1-0000599TW02M1P0277-tong0116741-ku0000000 sp1.1-tong0116741 0.00 1.03
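
For speed perturbation the segment times have to be rescaled, whereas an a-law or mu-law round trip leaves durations unchanged, so only the IDs need a prefix. The sketch below covers both cases with an optional factor. The function name `prefix_segments` and its arguments are hypothetical, and the two-decimal output format is simply copied from the lines above.

```shell
# Sketch: prefix the utterance and recording IDs in a segments stream.
# $1 = prefix, $2 = optional speed factor (default 1, i.e. times unchanged,
# which is what a codec round trip needs).
prefix_segments() {
  awk -v p="$1" -v f="${2:-1}" '{
    # Start and end times are divided by the speed factor.
    printf "%s%s %s%s %.2f %.2f\n", p, $1, p, $2, $3 / f, $4 / f
  }'
}
```

With a factor of 1.1 a segment ending at 1.28 s comes out ending at 1.16 s; with the default factor the times pass through untouched.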

reco2file_and_channel grows to 3x

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/reco2file_and_channel train/reco2file_and_channel | tail 
> sp1.1-tong0116737 tong0116737 A
> sp1.1-tong0116738 tong0116738 A
> sp1.1-tong0116739 tong0116739 A
> sp1.1-tong0116740 tong0116740 A
> sp1.1-tong0116741 tong0116741 A
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c/data# diff train_guan5/reco2file_and_channel train/reco2file_and_channel | wc
 233485  933937 7938467

wav.scp: 3x in 87 but not in 129, so the script presumably did not process it successfully.

87

root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c# wc data/train_guan5/wav.scp
  116742  1634388 13613110 data/train_guan5/wav.scp
root@3c8e5306f010:/usr/local/kaldi/egs/taiwanese/s5c# wc data/train/wav.scp
  350226  7238004 49945206 data/train/wav.scp

129

root@0168bc89e10e:/usr/local/kaldi/egs/taiwanese/s5c# wc data/train_guan5/wav.scp
  116742  2685066 17232112 data/train_guan5/wav.scp
root@0168bc89e10e:/usr/local/kaldi/egs/taiwanese/s5c# wc data/train/wav.scp
  116742  2685066 17232112 data/train/wav.scp
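
The counts above can be checked mechanically: 350226 = 3 × 116742 in 87, while 129 stays at 1×. The earlier `diff | wc` line counts of 233485 are consistent with this too, being 2 × 116742 added lines plus one diff header line. The helper below is a sketch; the name `line_ratio` is hypothetical.

```shell
# Sketch: print the integer ratio and remainder of two files' line counts.
# $1 = source file (e.g. train_guan5/wav.scp), $2 = derived file.
# A clean 3-way mix prints "3 0"; an unchanged copy prints "1 0".
# The source file must not be empty (division by zero otherwise).
line_ratio() {
  src=$(wc -l < "$1")
  dst=$(wc -l < "$2")
  echo "$((dst / src)) $((dst % src))"
}
```

For example, `line_ratio data/train_guan5/wav.scp data/train/wav.scp` would distinguish the two images at a glance.
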
sih4sing5hong5 commented 6 years ago

Once you have confirmed it is ready, I will go and look in Docker.

leo424y commented 6 years ago

The revised script is now confirmed to rewrite wav.scp the way we want; you can check it in 139. 140 is currently running the whole script.

ci@ci:~$ docker run -it localhost:5000/siann1-hak8_boo5-hing5:139 /bin/bash

root@54d0c8b2058b:/usr/local/kaldi/egs/taiwanese/s5c# wc data/train/wav.scp
  350226 13191846 72826638 data/train/wav.scp

root@54d0c8b2058b:/usr/local/kaldi/egs/taiwanese/s5c# head data/train/wav.scp
spalaw-tong0000000 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1000.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - | avconv -i - -f alaw -ar 8000 - | avconv -f alaw -ar 8000 -i - -f wav -ar 8000 - |
spalaw-tong0000001 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1010.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - | avconv -i - -f alaw -ar 8000 - | avconv -f alaw -ar 8000 -i - -f wav -ar 8000 - |
spalaw-tong0000002 sox -G /usr/local/pian7sik4_gi2liau7/TW01/M0/TW01M0P0000/tbw1020.wav -b 16 -c 1 -r 16k -t wav - | avconv -i - -f wav -ar 8000 - | avconv -i - -f alaw -ar 8000 - | avconv -f alaw -ar 8000 -i - -f wav -ar 8000 - |

...
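
Since only the head of the file is shown, a quick way to confirm that all three variants are present in equal numbers is to count entries per ID prefix. This is a sketch: the function name `count_by_prefix` is hypothetical, and the expectation of an unprefixed original set plus `spalaw` and `spmulaw` sets is an assumption, as the thread only shows `spalaw-` lines.

```shell
# Sketch: count wav.scp entries grouped by the ID text before the first "-".
# IDs without a "-" (the unprefixed originals) are grouped as "(none)".
count_by_prefix() {
  awk '{
    n = index($1, "-")
    key = (n > 0) ? substr($1, 1, n - 1) : "(none)"
    count[key]++
  }
  END { for (k in count) print k, count[k] }' "$1" | LC_ALL=C sort
}
```

If the mix worked as intended, `count_by_prefix data/train/wav.scp` should report the same count for every group.
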
leo424y commented 6 years ago

54